Prakash Ishwar

Boston University, Boston, Massachusetts, United States

Publications (100) · 103.07 Total Impact Points

  •
    ABSTRACT: We propose a new model for rank aggregation from pairwise comparisons that captures both ranking heterogeneity across users and ranking inconsistency for each user. We establish a formal statistical equivalence between the new model and topic models. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable statistical and computational efficiency guarantees. The method is also shown to empirically outperform competing approaches on some semi-synthetic and real-world datasets.
    12/2014;
  •
    ABSTRACT: Change detection is one of the most commonly encountered low-level tasks in computer vision and video processing. A plethora of algorithms have been developed to date, yet no widely accepted, realistic, large-scale video dataset exists for benchmarking different methods. Presented here is a unique change detection video dataset consisting of nearly 90,000 frames in 31 video sequences representing 6 categories selected to cover a wide range of challenges in 2 modalities (color and thermal IR). A distinguishing characteristic of this benchmark video dataset is that each frame is meticulously annotated by hand for ground-truth foreground, background, and shadow area boundaries - an effort that goes much beyond a simple binary label denoting the presence of change. This enables objective and precise quantitative comparison and ranking of video-based change detection algorithms. This paper discusses various aspects of the new dataset, quantitative performance metrics used, and comparative results for over two dozen change detection algorithms. It draws important conclusions on solved and remaining issues in change detection, and describes future challenges for the scientific community. The dataset, evaluation tools, and algorithm rankings are available to the public on a website and will be updated with feedback from academia and industry in the future.
    IEEE Transactions on Image Processing 08/2014; · 3.20 Impact Factor
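The objective ranking described above rests on pixel-level confusion-matrix statistics computed against the hand-annotated ground truth. As an illustrative sketch (not the benchmark's official evaluation code), the commonly reported scores can be computed from binary prediction and ground-truth masks like this:

```python
import numpy as np

def change_detection_metrics(pred, gt):
    """Pixel-level scores from binary change masks (True = foreground/change).

    pred, gt: boolean arrays of identical shape. Returns the confusion-matrix
    based metrics commonly used to rank change-detection algorithms.
    """
    tp = int(np.sum(pred & gt))
    fp = int(np.sum(pred & ~gt))
    fn = int(np.sum(~pred & gt))
    tn = int(np.sum(~pred & ~gt))
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "recall": recall,
        "specificity": tn / (tn + fp),
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
        "pwc": 100.0 * (fp + fn) / (tp + fp + fn + tn),  # percentage of wrong classifications
    }

# toy 4x4 example: 4 true-change pixels, detector finds all of them plus 2 false alarms
gt = np.zeros((4, 4), dtype=bool); gt[:2, :2] = True
pred = np.zeros((4, 4), dtype=bool); pred[:2, :3] = True
scores = change_detection_metrics(pred, gt)
```

In a benchmark setting these counts would be accumulated over all frames of a sequence before the ratios are formed, rather than averaged per frame.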
  • Limor Martin, W. Clem Karl, Prakash Ishwar
    ABSTRACT: We propose a new structure-preserving dual-energy (SPDE) CT inversion technique for luggage screening, which can mitigate metal artifacts and provide precise object localization. Such artifact reduction can increase material identification accuracy in security applications. Our main objective is formation of enhanced photoelectric and Compton pixel property images from dual-energy X-ray tomographic data. We achieve this aim by incorporating three important elements in a single unified framework. First, we generate our images as the solution of a joint optimization problem, which explicitly models the projection process. Second, we include metal aware data weighting to reduce streaks and metal artifacts. Third, we estimate a regularized joint boundary field and apply it to both the photoelectric and Compton images in order to improve object localization as well as smoothing inside the objects. We evaluate the performance of the method using real dual-energy data. We demonstrate a significant reduction in noise and metal artifacts.
    ICASSP 2014 - 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 05/2014
  • Source
    Ye Wang, Prakash Ishwar, Shantanu Rane
    ABSTRACT: In the secure two-party sampling problem, two parties wish to generate outputs with a desired joint distribution via an interactive protocol, while ensuring that neither party learns more than what can be inferred from only their own output. For semi-honest parties and information-theoretic privacy guarantees, it is well-known that if only noiseless communication is available, then only the "trivial" joint distributions, for which common information equals mutual information, can be securely sampled. We consider the problem where the parties may also interact via a given set of general communication primitives (multi-input/output channels). Our feasibility characterization of this problem can be stated as a zero-one law: primitives are either complete (enabling the secure sampling of any distribution) or useless (only enabling the secure sampling of trivial distributions). Our characterization of the complete primitives also extends to the more general class of secure two-party computation problems.
    02/2014;
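The "trivial" distributions referenced above are those whose common information (in the Gács-Körner sense) equals their mutual information. As an illustrative sketch under that interpretation (not the paper's machinery), both quantities can be computed for a finite joint pmf; the common part is the connected-component index of the bipartite graph on the supports of the two variables:

```python
import numpy as np

def mutual_information(p):
    """I(X;Y) in bits for a joint pmf given as a 2-D array."""
    px, py = p.sum(axis=1), p.sum(axis=0)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / np.outer(px, py)[mask])))

def gacs_korner_common_information(p):
    """H(W) in bits, where W indexes the connected components of the
    bipartite graph on the supports of X and Y (the 'common part')."""
    nx = p.shape[0]
    parent = list(range(nx))
    def find(i):  # union-find over the rows (X values)
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for j in range(p.shape[1]):
        rows = np.nonzero(p[:, j] > 0)[0]  # rows linked through column j
        for r in rows[1:]:
            parent[find(int(r))] = find(int(rows[0]))
    mass = {}
    for i in range(nx):
        root = find(i)
        mass[root] = mass.get(root, 0.0) + float(p[i].sum())
    probs = np.array([m for m in mass.values() if m > 0])
    return float(-np.sum(probs * np.log2(probs)))

# X = Y = fair bit: common info equals mutual info (1 bit) -> trivial
perfectly_correlated = np.array([[0.5, 0.0], [0.0, 0.5]])
# noisy correlation: single connected component, so common info 0 < I(X;Y) -> nontrivial
noisy = np.array([[0.4, 0.1], [0.1, 0.4]])
```

Under the zero-one law, a nontrivial target such as `noisy` cannot be securely sampled from noiseless communication alone; it requires a complete primitive.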
  •
    ABSTRACT: X-ray Computed Tomography (CT) is an effective nondestructive technology used for security applications. In CT, three-dimensional images of the interior of an object are generated based on its X-ray attenuation. Multi-energy CT can be used to enhance material discrimination. Currently, reliable identification and segmentation of objects from CT data is challenging due to the large range of materials which may appear in baggage and the presence of metal and high clutter. Conventionally reconstructed CT images suffer from metal-induced streaks and artifacts which can lead to breaking of objects and inaccurate object labeling. We propose a novel learning-based framework for joint metal artifact reduction and direct object labeling from CT derived data. A material label image is directly estimated from measured effective attenuation images. We include data weighting to mitigate metal artifacts and incorporate an object boundary-field to reduce object splitting. The overall problem is posed as a graph optimization problem and solved using an efficient graph-cut algorithm. We test the method on real data and show that it can produce accurate material labels in the presence of metal and clutter.
    SPIE Computational Imaging XII, San Francisco, California, USA; 02/2014
  • Source
    ABSTRACT: We propose a novel approach for designing kernels for support vector machines (SVMs) when the class label is linked to the observation through a latent state and the likelihood function of the observation given the state (the sensing model) is available. We show that the Bayes-optimum decision boundary is a hyperplane under a mapping defined by the likelihood function. Combining this with the maximum margin principle yields kernels for SVMs that leverage knowledge of the sensing model in an optimal way. We derive the optimum kernel for the bag-of-words (BoWs) sensing model and demonstrate its superior performance over other kernels in document and image classification tasks. These results indicate that such optimum sensing-aware kernel SVMs can match the performance of rather sophisticated state-of-the-art approaches.
    12/2013;
  • Source
    Ye Wang, Shantanu Rane, Prakash Ishwar
    ABSTRACT: In game theory, a trusted mediator acting on behalf of the players can enable the attainment of correlated equilibria, which may provide better payoffs than those available from the Nash equilibria alone. We explore the approach of replacing the trusted mediator with an unconditionally secure sampling protocol that jointly generates the players' actions. We characterize the joint distributions that can be securely sampled by malicious players via protocols using error-free communication. This class of distributions depends on whether players may speak simultaneously ("cheap talk") or must speak in turn ("polite talk"). In applying sampling protocols toward attaining correlated equilibria with rational players, we observe that security against malicious parties may be much stronger than necessary. We propose the concept of secure sampling by rational players, and show that many more distributions are feasible given certain utility functions. However, the payoffs attainable via secure sampling by malicious players are a dominant subset of the rationally attainable payoffs.
    11/2013;
  •
    ABSTRACT: We consider a novel problem of endmember detection in hyperspectral imagery where signals in frequency bands are probed sequentially. We propose an adaptive strategy for controlling the sensing order to maximize the normalized solid angle, a robustness measure of the problem geometry. This is based on efficiently identifying pure pixels that are unique to each endmember and exploiting information from a spectral library known in advance through sequential random projections. We present simulations on synthetic datasets to demonstrate the merits of our scheme in reducing the observation cost.
    2013 Asilomar Conference on Signals, Systems and Computers; 11/2013
  • Source
    ABSTRACT: The simplicial condition and other stronger conditions that imply it have recently played a central role in developing polynomial time algorithms with provable asymptotic consistency and sample complexity guarantees for topic estimation in separable topic models. Of these algorithms, those that rely solely on the simplicial condition are impractical while the practical ones need stronger conditions. In this paper, we demonstrate, for the first time, that the simplicial condition is a fundamental, algorithm-independent, information-theoretic necessary condition for consistent separable topic estimation. Furthermore, under solely the simplicial condition, we present a practical quadratic-complexity algorithm based on random projections which consistently detects all novel words of all topics using only up to second-order empirical word moments. This algorithm is amenable to distributed implementation making it attractive for 'big-data' scenarios involving a network of large distributed databases.
    10/2013;
  •
    ABSTRACT: Despite a significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. In order to close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been most successful but also time-consuming and costly. Automatic methods, that typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality for they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene-depth at that pixel using a regression type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
    IEEE Transactions on Image Processing 06/2013; · 3.20 Impact Factor
  • Source
    ABSTRACT: Biometrics are an important and widely used class of methods for identity verification and access control. Biometrics are attractive because they are inherent properties of an individual. They need not be remembered like passwords, and are not easily lost or forged like identifying documents. At the same time, biometrics are fundamentally noisy and irreplaceable. There are always slight variations among the measurements of a given biometric, and, unlike passwords or identification numbers, biometrics are derived from physical characteristics that cannot easily be changed. The proliferation of biometric usage raises critical privacy and security concerns that, due to the noisy nature of biometrics, cannot be addressed using standard cryptographic methods. In this article we present an overview of "secure biometrics", also referred to as "biometric template protection", an emerging class of methods that address these concerns.
    IEEE Signal Processing Magazine 05/2013; · 3.37 Impact Factor
  • K Guo, P Ishwar, J Konrad
    ABSTRACT: We propose a general framework for fast and accurate recognition of actions in video using empirical covariance matrices of features. A dense set of spatio-temporal feature vectors is computed from video to provide a localized description of the action, and subsequently aggregated in an empirical covariance matrix to compactly represent the action. Two supervised learning methods for action recognition are developed using feature covariance matrices. Common to both methods is the transformation of the classification problem in the closed convex cone of covariance matrices into an equivalent problem in the vector space of symmetric matrices via the matrix logarithm. The first method applies nearest-neighbor classification using a suitable Riemannian metric for covariance matrices. The second method approximates the logarithm of a query covariance matrix by a sparse linear combination of the logarithms of training covariance matrices. The action label is then determined from the sparse coefficients. Both methods achieve state-of-the-art classification performance on several datasets, and are robust to action variability, viewpoint changes, and low object resolution. The proposed framework is conceptually simple and has low storage and computational requirements, making it attractive for real-time implementation.
    IEEE Transactions on Image Processing 03/2013; · 3.20 Impact Factor
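The matrix-logarithm step above admits a compact illustration: for symmetric positive-definite covariance descriptors, the log maps the SPD cone into the vector space of symmetric matrices, where ordinary Frobenius distance applies. The sketch below is a minimal log-Euclidean nearest-neighbor classifier on synthetic data, not the paper's full pipeline or feature set:

```python
import numpy as np

def spd_log(c):
    """Matrix logarithm of a symmetric positive-definite matrix via
    eigendecomposition: c = V diag(w) V^T  ->  V diag(log w) V^T."""
    w, v = np.linalg.eigh(c)
    return (v * np.log(w)) @ v.T

def covariance_descriptor(features, eps=1e-6):
    """Empirical covariance of (N, d) per-location feature vectors,
    regularized so the matrix stays positive definite."""
    c = np.cov(features, rowvar=False)
    return c + eps * np.eye(c.shape[0])

def log_euclidean_nn(query_cov, train_covs, train_labels):
    """Nearest-neighbor label under the log-Euclidean (Frobenius-after-log) distance."""
    lq = spd_log(query_cov)
    dists = [np.linalg.norm(lq - spd_log(c)) for c in train_covs]
    return train_labels[int(np.argmin(dists))]

# synthetic demo: two hypothetical "actions" whose features differ in covariance scale
rng = np.random.default_rng(0)
train_covs = [covariance_descriptor(rng.standard_normal((500, 3))) for _ in range(3)] + \
             [covariance_descriptor(2.0 * rng.standard_normal((500, 3))) for _ in range(3)]
train_labels = ["walk"] * 3 + ["run"] * 3
query = covariance_descriptor(2.0 * rng.standard_normal((500, 3)))
```

Because the log images live in a flat vector space, the same trick also enables the paper's second method: sparse coding of `spd_log(query)` over the logs of the training descriptors.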
  •
    ABSTRACT: We present algorithms for topic modeling based on the geometry of cross-document word-frequency patterns. This perspective gains significance under the so-called separability condition, which posits the existence of novel words that are unique to each topic. We present a suite of highly efficient algorithms based on data-dependent and random projections of word-frequency patterns to identify novel words and associated topics. We also discuss the statistical guarantees of the data-dependent projections method based on two mild assumptions on the prior density of the topic-document matrix. Our key insight is that the maximum and minimum values of cross-document frequency patterns projected along any direction are associated with novel words. While our sample complexity bounds for topic recovery are similar to the state-of-the-art, the computational complexity of our random projection scheme scales linearly with the number of documents and the number of words per document. We present several experiments on synthetic and real-world datasets to demonstrate the qualitative and quantitative merits of our scheme.
    03/2013;
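The key insight above, that extremes of randomly projected cross-document frequency patterns correspond to novel words, can be sketched in a few lines. This is an illustrative toy version on a hypothetical separable corpus, not the paper's algorithm with its guarantees:

```python
import numpy as np

def novel_word_candidates(word_doc_counts, num_projections=50, seed=0):
    """Flag candidate 'novel' words via random projections.

    word_doc_counts: (W, D) nonnegative word-by-document count matrix. Each
    word's cross-document frequency pattern is row-normalized; under the
    separability condition the extreme points of these patterns belong to
    novel words, so extremes along random directions flag them.
    """
    rng = np.random.default_rng(seed)
    patterns = word_doc_counts / word_doc_counts.sum(axis=1, keepdims=True)
    candidates = set()
    for _ in range(num_projections):
        direction = rng.standard_normal(patterns.shape[1])
        projection = patterns @ direction
        candidates.add(int(np.argmax(projection)))  # one extreme of the pattern cloud
        candidates.add(int(np.argmin(projection)))  # the opposite extreme
    return sorted(candidates)

# toy separable corpus over 4 documents: word 0 occurs only under topic A,
# word 1 only under topic B, word 2 equally under both (never an extreme point)
counts = np.array([
    [10.0, 7.0, 3.0, 0.0],   # novel to topic A
    [0.0, 3.0, 7.0, 10.0],   # novel to topic B
    [5.0, 5.0, 5.0, 5.0],    # common word: midpoint of the two patterns
])
```

Each projection costs one matrix-vector product, which is where the linear scaling in the number of documents comes from.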
  • Source
    ABSTRACT: We study high-dimensional asymptotic performance limits of binary supervised classification problems where the class conditional densities are Gaussian with unknown means and covariances and the number of signal dimensions scales faster than the number of labeled training samples. We show that the Bayes error, namely the minimum attainable error probability with complete distributional knowledge and equally likely classes, can be arbitrarily close to zero and yet the limiting minimax error probability of every supervised learning algorithm is no better than a random coin toss. In contrast to related studies where the classification difficulty (Bayes error) is made to vanish, we hold it constant when taking high-dimensional limits. In contrast to VC-dimension based minimax lower bounds that consider the worst case error probability over all distributions that have a fixed Bayes error, our worst case is over the family of Gaussian distributions with constant Bayes error. We also show that a nontrivial asymptotic minimax error probability can only be attained for parametric subsets of zero measure (in a suitable measure space). These results expose the fundamental importance of prior knowledge and suggest that unless we impose strong structural constraints, such as sparsity, on the parametric space, supervised learning may be ineffective in high dimensional small sample settings.
    01/2013;
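The phenomenon described above is easy to reproduce numerically. The sketch below is a simplified setting (known identity covariances and a plug-in nearest-mean rule, not the paper's minimax construction): the Bayes error is held constant and small while the dimension grows much faster than the number of labeled samples, and the plug-in classifier's accuracy falls well short of the Bayes-optimal level:

```python
import numpy as np
from math import erfc, sqrt

# Dimensionality grows much faster than the number of labeled samples.
d, n_train, n_test = 2000, 20, 500
separation = 4.0  # ||mu1 - mu0|| held constant, so the Bayes error stays constant

# Bayes error for two unit-variance Gaussians at distance `separation`: Q(separation/2)
bayes_error = 0.5 * erfc(separation / (2.0 * sqrt(2.0)))  # ~0.023

rng = np.random.default_rng(0)
mu0 = np.zeros(d)
mu1 = np.zeros(d)
mu1[0] = separation

def sample(mu, n):
    return rng.standard_normal((n, d)) + mu

# plug-in nearest-mean rule built from the tiny training sets
mean0 = sample(mu0, n_train).mean(axis=0)
mean1 = sample(mu1, n_train).mean(axis=0)

X = np.vstack([sample(mu0, n_test), sample(mu1, n_test)])
y = np.r_[np.zeros(n_test), np.ones(n_test)]
pred = (np.linalg.norm(X - mean1, axis=1) < np.linalg.norm(X - mean0, axis=1)).astype(float)
accuracy = float((pred == y).mean())
```

The mean-estimation noise has norm on the order of sqrt(d/n), which swamps the fixed class separation once d greatly exceeds n; that is the mechanism behind the gap.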
  • Source
    ABSTRACT: A new geometrically-motivated algorithm for nonnegative matrix factorization is developed and applied to the discovery of latent "topics" for text and image "document" corpora. The algorithm is based on robustly finding and clustering extreme points of empirical cross-document word-frequencies that correspond to novel "words" unique to each topic. In contrast to related approaches that are based on solving non-convex optimization problems using suboptimal approximations, locally-optimal methods, or heuristics, the new algorithm is convex, has polynomial complexity, and has competitive qualitative and quantitative performance compared to the current state-of-the-art approaches on synthetic and real-world datasets.
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; 01/2013
  • J. Wu, J. Konrad, P. Ishwar
    ABSTRACT: The Kinect has primarily been used as a gesture-driven device for motion-based controls. To date, Kinect-based research has predominantly focused on improving tracking and gesture recognition across a wide base of users. In this paper, we propose to use the Kinect for biometrics; rather than accommodating a wide range of users, we exploit each user's uniqueness in terms of gestures. Unlike pure biometrics, such as iris scanners, face detectors, and fingerprint recognition, which depend on irrevocable biometric data, the Kinect can provide additional revocable gesture information. We propose a dynamic time-warping (DTW) based framework applied to the Kinect's skeletal information for user access control. Our approach is validated in two scenarios: user identification and user authentication, on a dataset of 20 individuals performing 8 unique gestures. We obtain an overall Equal Error Rate (EER) of 4.14% in user identification and 1.89% in user authentication for a single gesture, and consistently outperform related work on this dataset. Given the natural noise present in the real-time depth sensor, these are promising results.
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on; 01/2013
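The dynamic time warping at the core of the framework above admits a compact implementation. The sketch below shows the textbook DTW recursion on sequences of per-frame feature vectors; the paper's exact skeletal features and normalization are not reproduced here, and the `authenticate` helper is a hypothetical illustration of threshold-based verification:

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors
    (e.g., per-frame skeleton joint coordinates), shapes (Ta, d) and (Tb, d).
    Warping absorbs differences in gesture speed and timing."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    ta, tb = len(a), len(b)
    D = np.full((ta + 1, tb + 1), np.inf)  # cumulative alignment cost
    D[0, 0] = 0.0
    for i in range(1, ta + 1):
        for j in range(1, tb + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])  # per-frame distance
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(D[ta, tb])

def authenticate(query, templates, threshold):
    """Accept if the query gesture is close enough to any enrolled template."""
    return min(dtw_distance(query, t) for t in templates) <= threshold
```

For identification the same distance is used in a nearest-template search over all enrolled users; for authentication, varying `threshold` traces out the trade-off from which an EER is read off.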
  •
    ABSTRACT: Searching for images on-line using keywords returns results that are often difficult to interpret. This becomes even more complicated if one attempts to compare image search output for several keywords with a common theme. We focus on the latter problem and propose a method to efficiently compare sets of images in order to find representative images, one from each set, that are coherent in a certain sense. However, the search for an optimal set of representative images is very complex even for as few as 10 sets of 20 images each, since all possible combinations of 10 images need to be considered. Therefore, we formulate our problem as the Generalized Traveling Salesman Problem (GTSP) and propose an efficient approximation algorithm to solve it. Our approximate GTSP algorithm is faster than other well-known approximations and is also more likely to reach the exact solution for large-scale inputs. We present a number of experimental results using the proposed algorithm and conclude that it can be a useful, almost real-time tool for on-line search.
    Proceedings of the 20th ACM international conference on Multimedia; 10/2012
  • Source
    Ye Wang, Prakash Ishwar, Shantanu Rane
    ABSTRACT: We study the problem in which one of three pairwise interacting parties is required to securely compute a function of the inputs held by the other two, when one party may arbitrarily deviate from the computation protocol (the active behavioral model). An information-theoretic characterization of unconditionally secure computation protocols under the active behavioral model is provided. A protocol for Hamming distance computation is presented and shown to be unconditionally secure under both the active and passive behavioral models using this characterization. The difference between the notions of security under the active and passive behavioral models is illustrated through the BGW protocol for computing quadratic and Hamming distances; this protocol is secure under the passive model, but is shown not to be secure under the active model.
    06/2012;
  •
    ABSTRACT: The availability of 3D hardware has so far outpaced the production of 3D content. Although to date many methods have been proposed to convert 2D images to 3D stereopairs, the most successful ones involve human operators and, therefore, are time-consuming and costly, while the fully-automatic ones have not yet achieved the same level of quality. This subpar performance is due to the fact that automatic methods usually rely on assumptions about the captured 3D scene that are often violated in practice. In this paper, we explore a radically different approach inspired by our work on saliency detection in images. Instead of relying on a deterministic scene model for the input 2D image, we propose to "learn" the model from a large dictionary of stereopairs, such as YouTube 3D. Our new approach is built upon a key observation and an assumption. The key observation is that among millions of stereopairs available on-line, there likely exist many stereopairs whose 3D content matches that of the 2D input (query). We assume that two stereopairs whose left images are photometrically similar are likely to have similar disparity fields. Our approach first finds a number of on-line stereopairs whose left image is a close photometric match to the 2D query and then extracts depth information from these stereopairs. Since disparities for the selected stereopairs differ due to differences in underlying image content, level of noise, distortions, etc., we combine them by using the median. We apply the resulting median disparity field to the 2D query to obtain the corresponding right image, while handling occlusions and newly-exposed areas in the usual way. We have applied our method in two scenarios. First, we used YouTube 3D videos in search of the most similar frames. Then, we repeated the experiments on a small, but carefully-selected, dictionary of stereopairs closely matching the query. This, to a degree, emulates the results one would expect from the use of an extremely large 3D repository. While far from perfect, the presented results demonstrate that on-line repositories of 3D content can be used for effective 2D-to-3D image conversion. With the continuously increasing amount of 3D data on-line and with the rapidly growing computing power in the cloud, the proposed framework seems a promising alternative to operator-assisted 2D-to-3D conversion.
    Proc SPIE 02/2012;
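The median-fusion step described above, combining the disparity fields of the retrieved stereopairs, is the robustness mechanism of the approach and is a one-liner. A minimal sketch of that step alone (retrieval, warping, and occlusion handling are omitted):

```python
import numpy as np

def fuse_disparities(disparity_maps):
    """Pixel-wise median of the disparity fields extracted from the k
    retrieved stereopairs; disparity_maps has shape (k, H, W). The median is
    robust to outlier disparities coming from retrieved pairs whose content
    differs from the query."""
    return np.median(np.asarray(disparity_maps), axis=0)

# five candidate disparity fields, one of them a gross outlier
maps = np.stack([np.full((2, 3), 3.0)] * 4 + [np.full((2, 3), 50.0)])
fused = fuse_disparities(maps)
```

The fused field is then applied to the 2D query to synthesize the right view, with occlusions and newly-exposed areas handled as described in the abstract.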
  • Source
    ABSTRACT: Change detection is one of the most commonly encountered low-level tasks in computer vision and video processing. A plethora of algorithms have been developed to date, yet no widely accepted, realistic, large-scale video dataset exists for benchmarking different methods. Presented here is a unique change detection benchmark dataset consisting of nearly 90,000 frames in 31 video sequences representing 6 categories selected to cover a wide range of challenges in 2 modalities (color and thermal IR). A distinguishing characteristic of this dataset is that each frame is meticulously annotated for ground-truth foreground, background, and shadow area boundaries – an effort that goes much beyond a simple binary label denoting the presence of change. This enables objective and precise quantitative comparison and ranking of change detection algorithms. This paper presents and discusses various aspects of the new dataset, quantitative performance metrics used, and comparative results for over a dozen previous and new change detection algorithms. The dataset, evaluation tools, and algorithm rankings are available to the public on a website and will be updated with feedback from academia and industry in the future.
    CVPR - Change Detection Workshop; 01/2012

Publication Stats

785 Citations
103.07 Total Impact Points

Institutions

  • 2005–2012
    • Boston University
      • Department of Electrical and Computer Engineering
      Boston, Massachusetts, United States
  • 2007–2010
    • Indian Institute of Technology Bombay
      • Department of Electrical Engineering
      Mumbai, Maharashtra, India
  • 2009
    • University of California, San Diego
      • Department of Electrical and Computer Engineering
      San Diego, California, United States
  • 2002–2008
    • University of California, Berkeley
      • Department of Electrical Engineering and Computer Sciences
      Berkeley, California, United States
  • 1997–2005
    • University of Illinois, Urbana-Champaign
      • Department of Electrical and Computer Engineering
      • Beckman Institute for Advanced Science and Technology
      Urbana, Illinois, United States