About
401 Publications
138,879 Reads
11,140 Citations
Additional affiliations
January 2002 - December 2006
January 1970 - December 2012
Publications (401)
Cognitive computing is a nascent interdisciplinary domain. It is a confluence of cognitive science, neuroscience, data science, and cloud computing. Cognitive science is the study of mind and offers theories, mathematical and computational models of human cognition. Cognitive science itself is an interdisciplinary domain and draws upon philosophy,...
We propose a novel cognitive biometrics modality based on the written language usage of an individual. This is a feasibility study using Internet-scale blogs, with tens of thousands of authors, to create a cognitive fingerprint for an individual. Existing cognitive biometric modalities involve learning from obtrusive sensors placed on the human body. Our mo...
Training non-linear neural networks is a challenging task, but over the years, various approaches coming from different perspectives have been proposed to improve performance. However, insights into what fundamentally constitutes optimal network parameters remain obscure. Similarly, given what properties of data can we hope for a non-line...
While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks, Internal Covariate Shift, the current solution has multiple drawbacks. For instance, BN depends on batch statistics for layerwise input normalization during training which makes the estimates of mean and standard de...
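The dependence on batch statistics mentioned in the Batch Normalization abstract above can be illustrated with a minimal sketch. This is generic batch normalization on scalar activations, not the method proposed in the paper; the function name `batch_normalize`, the `eps` value, and the sample batches are assumptions made for illustration.

```python
import statistics

def batch_normalize(batch, eps=1e-5):
    """Normalize scalar activations with the mean and variance of their own batch.

    `eps` is a small constant (assumed value) that avoids division by zero.
    """
    mean = statistics.fmean(batch)
    var = statistics.pvariance(batch, mu=mean)  # population variance of the batch
    return [(x - mean) / (var + eps) ** 0.5 for x in batch]

# Hypothetical batches: the same activation (2.0) appears in both,
# but its normalized value depends on which batch it is in.
small_batch = [2.0, 4.0]
large_batch = [2.0, 4.0, 6.0, 8.0]
print(batch_normalize(small_batch)[0])
print(batch_normalize(large_batch)[0])
```

The two printed values differ, which shows that the normalized output for one input is a function of the other samples that share its batch.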
In this chapter, we discuss a novel biometric trait, called cognitive biometrics. It is defined as the process of identifying an individual through extracting and matching unique signatures based on the cognitive, affective, and conative state of that individual. Currently, there is an increasing need for novel biometric systems that engage multipl...
This chapter presents a concept paper that describes methods to accelerate new materials discovery and optimization, by enabling faster recognition and use of important theoretical, computational, and experimental information aggregated from peer-reviewed and published materials-related scientific documents online. To obtain insights for the discov...
In this paper we present a framework for secure identification using deep
neural networks, and apply it to the task of template protection for face
authentication. We use deep convolutional neural networks (CNNs) to learn a
mapping from face images to maximum entropy binary (MEB) codes. The mapping is
robust enough to tackle the problem of exact ma...
This paper compares online handwritten cursive word recognition using a segmentation-free method with recognition using a segmentation-based method. To search the optimal segmentation and recognition path as the recognition result, we attempt two methods: segmentation-free and segmentation-based, where we expand the search space using a c...
Dissemination of malicious code, also known as malware, poses severe challenges to cyber security. Malware authors embed software in seemingly innocuous executables, unknown to a user. The malware subsequently interacts with security-critical OS resources on the host system or network, in order to destroy their information or to gather sensitive in...
While the term Big Data is open to varying interpretation, it is quite clear that the Volume, Velocity, and Variety (3Vs) of data have impacted every aspect of computational science and its applications. The volume of data is increasing at a phenomenal rate and a majority of it is unstructured. With big data, the volume is so large that processing...
Security is an important aspect in the practical deployment of biometric authentication systems. Biometric data in its original form is irreplaceable and thus, must be protected. This often comes at the cost of reduced matching accuracy or loss of the true key-less convenience biometric authentication can offer. In this paper, we address the shortc...
In this paper we present Deep Secure Encoding: a framework for secure
classification using deep neural networks, and apply it to the task of
biometric template protection for faces. Using deep convolutional neural
networks (CNNs), we learn a robust mapping of face classes to high entropy
secure codes. These secure codes are then hashed using standa...
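The idea of storing only a hash of a binary code, as described in the abstract above, can be sketched as follows. The abstract is truncated before it names the hash, so SHA-256 is an assumption here, as are the function names `hash_code` and `verify` and the 16-bit example code. The CNN that would predict the code from a face image is out of scope and is replaced by hard-coded bit strings.

```python
import hashlib
import hmac

def hash_code(bits: str) -> str:
    """Return the SHA-256 hex digest of a binary code given as a '0'/'1' string."""
    return hashlib.sha256(bits.encode("ascii")).hexdigest()

def verify(stored_digest: str, predicted_bits: str) -> bool:
    """Compare the stored digest with the digest of a freshly predicted code."""
    # compare_digest performs a constant-time comparison of the two hex strings
    return hmac.compare_digest(stored_digest, hash_code(predicted_bits))

# Hypothetical enrolled code; only its digest would be stored.
enrolled_code = "1011001110001011"
stored = hash_code(enrolled_code)

print(verify(stored, "1011001110001011"))  # identical code
print(verify(stored, "1011001110001010"))  # one bit flipped
```

Because a cryptographic hash changes completely when a single bit changes, verification only succeeds when the predicted code matches the enrolled code exactly, which is presumably why the abstract stresses a robust mapping.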
Do today's communication technologies hold potential to alleviate poverty?
The mobile phone's accessibility and use provide us with an unprecedented volume
of data on social interactions, mobility and more. Can this data help us better
understand, characterize and alleviate poverty in one of the poorest nations in
the world? Our study is an attempt...
Although a number of auto-encoder models enforce sparsity explicitly in their
learned representation while others don't, there has been little formal
analysis on what encourages sparsity in these models in general. Therefore, our
objective here is to formally study this general problem for regularized
auto-encoders. We show that both regularization...
In our previous study, we presented the idea of verifying a minutiae-based fingerprint match by local correlation methods, where both the global minutiae distribution structure and the local matching similarity between the two fingerprints are considered. In this paper, we introduce two algorithms to enhance the evaluation of local correlation score, t...
We introduce a novel application of handwriting recognition for Statistical Relational Learning. The proposed framework captures the intrinsic structure of handwriting by modeling fundamental character shape representations and their relationships using first-order logic. Our framework consists of three stages, (1) character extraction (2) feature...
This paper investigates whether the cognitive state of a person can be learnt and used as a novel biometric trait. We explore the idea of using language written by an author, as his/her cognitive fingerprint. The dataset consists of millions of blogs written by thousands of authors on the internet. Our proposed method learns a classifier that can d...
We propose a script-independent Bayesian framework for keyword spotting in multilingual handwritten documents. The approach relies on local character-level scores and global word-level hypothesis scores and learns a Bayesian logistic regression classifier to distinguish between keywords and non-keywords. In a Bayesian formulation of logistic regress...
Modeling data as being sampled from a union of independent subspaces has been
widely applied to a number of real world applications. However, dimensionality
reduction approaches that theoretically preserve this independence assumption
have not been well studied. Our key contribution is to show that $2K$
projection vectors are sufficient for the ind...
Popular CAPTCHA systems consist of garbled printed text character images with significant distortions and noise. It is believed that humans have little difficulty in deciphering the text, whereas automated systems are foiled by the added noise and distortion. However, in recent years, several text based CAPTCHAs have been reported as broken, that i...
A key factor in building effective writer identification/verification systems is the amount of data required to build the underlying models. In this research we systematically examine data sufficiency bounds for two broad approaches to online writer identification - feature space models vs. writer-style space models. We report results from 40 exper...
We propose the Bayesian Active Learning by Disagreement (BALD) model for keyword spotting in handwritten documents. In the context of keyword spotting in handwritten documents, the background text is all regions in the document that do not contain the keywords. The model tries to learn certain characteristics of the keyword and background text in a...
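As a rough illustration of the BALD criterion named in the abstract above, the sketch below computes the standard disagreement score for a binary decision (keyword vs. background): the entropy of the averaged prediction minus the average entropy of the individual predictions. It is the generic BALD acquisition function, not necessarily the exact model in the paper, and the function names and probability values are assumptions.

```python
import math

def entropy(p: float) -> float:
    """Entropy in nats of a Bernoulli variable with success probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def bald_score(posterior_probs):
    """Disagreement score from keyword probabilities predicted by several
    posterior samples of a model for the same candidate region."""
    n = len(posterior_probs)
    mean_p = sum(posterior_probs) / n
    expected_entropy = sum(entropy(p) for p in posterior_probs) / n
    return entropy(mean_p) - expected_entropy

# Hypothetical predictions for two candidate regions.
agree_but_unsure = [0.5, 0.5, 0.5]   # every sample is uncertain in the same way
confident_but_split = [0.05, 0.95]   # samples are confident but contradict each other
print(bald_score(agree_but_unsure))
print(bald_score(confident_but_split))
```

The score is high only when the sampled models disagree, so an active learner using it would request a label for the second region rather than the first.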
In this work we evaluate the performance of generic local structures as template points for secure fingerprint matching. We present a generic template structure called an n-gon that derives from a set of n neighboring minutiae points. We secure templates consisting of sets of n-gons using the fuzzy vault construct to obfuscate the data. We report t...
In order to fulfill the potential of fingerprint templates as the basis for authentication schemes, one needs to design a hash function for fingerprints that achieves acceptable matching accuracy and simultaneously has provable security guarantees, especially for parameter regimes that are needed to match fingerprints in practice. While existing ma...
Writer identification can be seen as a multi-class learning problem in which each writer is a different class. One of the fundamental approaches to solving a multi-class problem is to break it into binary classification tasks. In this work we propose a generic approach for multi-class classification using an ensemble of binary classifiers....
Writer identification is the process of determining the author of a handwritten specimen by utilizing characteristics inherent in the sample. In this work, we apply the concept of accents in handwriting to introduce a novel perspective for writer identification. Analogous to speech, accents in handwriting can be defined as distinctive writing quirk...
We propose a statistical script independent line based word spotting framework for offline handwritten documents based on Hidden Markov Models. We propose and compare an exhaustive study of filler models and background models for better representation of background or non-keyword text. The candidate keywords are pruned in a two stage spotting frame...
Modelling data as being sampled from a union of independent or disjoint
subspaces has been widely applied to a number of real world applications.
Recently, high dimensional data has come into focus because of advancements in
computational power and storage capacity. However, a number of algorithms that
assume the aforementioned data model have high...
Handprinted texts are produced when the writer tries to emulate some standard printed representation of the characters with the goal of making the written text legible. Postal address blocks, fillable forms, and other documents are among the examples of handprinting. This chapter reviews the main techniques in recognizing handprinted...
Modeling data as being sampled from a union of independent or disjoint subspaces has been widely applied to a number of real world applications. Recently, high dimensional data has come into focus because of advancements in computational power and storage capacity. However, a number of algorithms that assume the aforementioned data model have high...
The matching system can be defined as a type of classifier which calculates the confidence score for each class separately from other classes. Biometric systems are one example of the matching systems. In this chapter we discuss the score fusion methods which are suitable for such systems. In particular, we describe the complexity types of combinat...
High-level, or holistic, scene understanding involves reasoning about objects, regions, the 3D relationships between them, etc. Scene labeling underlies many of these problems in computer vision. Reasoning about scene images requires the decomposition into semantically meaningful regions over which a graphical model can be imposed. Typically, repre...
We propose a Bayesian framework for keyword spotting in handwritten documents. This work extends our previous work, in which we proposed a dynamic background model (DBM) for keyword spotting that takes into account the local character-level scores and global word-level scores to learn a logistic regression classifier to separate keywords from n...
We present an end-to-end framework for outdoor scene region decomposition, learned on a small set of randomly selected images that generalizes well to multiple data sets containing images from around the world. We discuss the different aspects of the framework especially a generalized variational inference method with better approximations to the t...
Writer identification is a complex task, as the handwriting of an individual encapsulates a lot of information pertaining to the text and personality of a writer. To learn a model to distinguish one writer from the other, it is important to capture every nuance of the handwriting of an individual. Learning such a model poses two challenges. First, discrimin...
With the explosive growth of the tablet form factor and greater availability of pen-based direct input, online writer identification is increasingly becoming critical for person identification, digital forensics as well as downstream applications such as intelligent and adaptive user environments, search, indexing and retrieval of handwritten docum...
Matching score fusion is a commonly used technique for improving the performance of biometric systems. In this paper we investigate the methods for fusing the scores obtained from matching individual video frames to a stored face template. Traditional fusion rules like sum and product do not account for the diversity of information contained in c...
We propose a segmentation based online word recognition approach which uses a Conditional Random Field (CRF) driven beam search strategy. An efficient trie-lexicon directed, breadth-first beam search algorithm is employed in a combined segmentation-and-recognition framework to accomplish real-time recognition of online handwritten cursive English w...
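A trie-lexicon directed, breadth-first beam search of the kind mentioned in the abstract above can be sketched in a simplified form. This sketch assumes the segmentation is already fixed (one score table per character segment), whereas the paper combines segmentation and recognition. The per-character log-scores stand in for CRF outputs and are hypothetical, as are the names `build_trie` and `beam_search`.

```python
import math

def build_trie(words):
    """Build a nested-dict trie; the key '$' marks the end of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = True
    return root

def beam_search(char_scores, lexicon, beam_width=3):
    """Return (log_score, word) for the best lexicon word, or None.

    char_scores: one dict per segment mapping character -> log-score.
    Only prefixes that exist in the lexicon trie are ever expanded.
    """
    beam = [(0.0, "", build_trie(lexicon))]
    for scores in char_scores:
        candidates = []
        for total, prefix, node in beam:
            for ch, child in node.items():
                if ch == "$" or ch not in scores:
                    continue
                candidates.append((total + scores[ch], prefix + ch, child))
        # Breadth-first step: keep only the best `beam_width` prefixes.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    complete = [(total, word) for total, word, node in beam if "$" in node]
    return max(complete) if complete else None

# Hypothetical recognizer output for three segments.
scores = [
    {"c": math.log(0.9), "o": math.log(0.1)},
    {"a": math.log(0.6), "u": math.log(0.4)},
    {"t": math.log(0.3), "r": math.log(0.7)},
]
print(beam_search(scores, ["cat", "car", "cut"]))
```

Restricting expansion to trie children keeps the search from scoring character sequences that cannot form a lexicon word, which is what makes a narrow beam workable.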
This paper offers an overview of the current approaches to research in the field of off-line multilingual OCR. Typically, off-line OCR systems are designed for a particular script or language. However, the ideal approach to multilingual OCR would likely be to develop a system that can, with the use of language-specific training data, be re-targeted...
In this paper we present a new dual mode, twin-folio structured English handwriting dataset IBM_UB_1. IBM_UB_1 is our first major release from a large multilingual handwriting corpus. Containing over 6000 pages of handwritten matter, this dataset can not only be used for unconstrained handwriting recognition, more importantly, the dataset's unique...
This paper describes an online handwritten English cursive word recognition method using a segmentation-free Markov random field (MRF) model in combination with an offline recognition method which uses pseudo 2D bi-moment normalization (P2DBMN) and modified quadratic discriminant function (MQDF). It extracts feature points along the pen-tip trace f...
This paper describes a model based framework for detection and extraction of the contents of table cells from degraded handwritten document images that contain tables. Given the very poor quality of the target documents, the table cell detection problem is formulated conceptually as a two-step process. The first step is to identify the location of...
Accent in speech is defined as a distinctive mode of pronunciation that is unique to a geographical region. In a similar way, we define accent in handwriting as distinctive writing characteristics that are unique to a group of people sharing a common native script. In other words, we postulate that a group of people with a common native script will...
Cameras are becoming ubiquitous. Applications including video-based surveillance and emergency response exploit camera networks to detect anomalies in real time and reduce collateral damage. A well-known technique for detecting anomalies is spatio-temporal analysis -- an inferencing technique employed by domain experts (e.g., vision researchers) to...
A ‘smart space’ is one that automatically identifies and tracks its occupants using unobtrusive biometric modalities such as face, gait, and voice in an unconstrained fashion. Information retrieval in a smart space is concerned with the location and movement of people over time. Towards this end, we abstract a smart space by a probabilistic state t...
In this paper we investigate the question of combining multi-sample matching results obtained during repeated attempts of fingerprint based authentication. In order to utilize the information corresponding to multiple input templates in a most efficient way, we propose a minutiae-based matching state model which uses relationship between test templ...
This work, for the first time, combines fingerprint matching, security, and indexing in one system. We use the fuzzy vault construct with minutia path information to achieve this. Since we only store translation and rotation invariant information from the paths, we achieve pre-alignment for “free,” which is a requirement of the fuzzy vault. We intr...
We present language-motivated approaches to detecting, localizing and classifying activities and gestures in videos. In order to obtain statistical insight into the underlying patterns of motions in activities, we develop a dynamic, hierarchical Bayesian model which connects low-level visual features in videos with poses, motion patterns and classe...
In this work we place some of the traditional biometrics work on fingerprint verification via the fuzzy vault scheme within a cryptographic framework. We show that the breaking of a fuzzy vault leads to decoding of Reed-Solomon codes from random errors, which has been proposed as a hard problem in the cryptography community. We provide a security p...
We propose a segmentation free word spotting framework using Dynamic
Background Model. The proposed approach is an extension to our previous
work where dynamic background model was introduced and integrated with a
segmentation based recognizer for keyword spotting. The dynamic
background model uses the local character matching scores and global
wor...
A ‘smart space’ is one that automatically identifies and tracks its occupants using unobtrusive biometric modalities such as face, gait, and voice in an unconstrained fashion. Information retrieval in a smart space is concerned with information about the location of people at various points in time. Towards this end, we abstract a smart space by a p...
This paper describes a novel method for detection and extraction of contents of table cells from handwritten document images. Given a model of the table and a document image containing a table, the hand-drawn or pre-printed table is detected and the contents of the table cells are extracted automatically. The algorithms described are designed to ha...
In this work, we propose a novel multilingual word spotting framework based on Hidden Markov Models that works on a corpus of multilingual handwritten documents and documents that contain more than one handwritten script. The system deals with large multilingual vocabularies without the need for word or character segmentation. A keyword is represented by...
The offline Arabic handwritten text recognition task exhibits high variations in observed variables such as size, loops, slant and continuity. A learning algorithm tries to capture the statistical dependence between these variables but often fails to learn the complete distribution because of their large degrees of freedom. However, it is possible to outp...
An important task in Keyword Spotting in handwritten documents is to separate Keywords from Non Keywords. Very often this is achieved by learning a filler or background model. A common method of building a background model is to allow all possible sequences or transitions of characters. However, due to large variation in handwriting styles, allowin...
Keyword spotting aims to retrieve all instances of a given keyword from a document in any language. In this paper, we propose a novel script independent line based word spotting framework for offline handwritten documents based on Hidden Markov Models. The methodology simulates the keywords in model space as a sequence of character models and uses...
Availability of sufficient labeled data is key to the performance of any learning algorithm. However, in document analysis, obtaining a large amount of labeled data is difficult. Scarcity of labeled samples is often a main bottleneck in the performance of algorithms for document analysis. However, unlabeled data samples are present in abundance. W...
With the explosive growth of the tablet form factor and greater availability of pen-based direct input, writer identification in online environments is increasingly becoming critical for a variety of downstream applications such as intelligent and adaptive user environments, search, retrieval, indexing and digital forensics. Extant research has app...
We present a framework to detect and localize activities in unconstrained real-life video sequences. This is a more challenging problem as it subsumes the activity classification problem and also requires us to work with unconstrained videos. To obtain real-life data, we have focused on using the Human Motion Database (HMDB), a collection of reali...
We discuss smart environments that identify and track their occupants using unobtrusive recognition modalities such as face, gait, and voice. In order to alleviate the inherent limitations of recognition, we propose spatio-temporal reasoning techniques based upon an analysis of the occupant tracks. The key idea underlying our approach is to determi...
In biometric systems, people may be asked to provide multiple scans for redundancy and quality control. In the case of fingerprint matching systems, repeat fingerprint probes of the same physical finger can be available and data from such multiple samples can be fused for reliable authentication of individuals. Since multiple samples are from the s...
A boosted tree classifier is proposed to segment machine printed, handwritten and overlapping text from documents with handwritten annotations. Each node of the tree-structured classifier is a binary weak learner. Unlike a standard decision tree (DT) which only considers a subset of training data at each node and is susceptible to over-fitting, we...
Linear relation is valuable in rule discovery of stocks, such as "if stock X goes up 1, stock Y will go down 3", etc. The traditional linear regression models the linear relation of two sequences perfectly. However, if user asks "please cluster the stocks in the NASDAQ market into groups where sequences have strong linear relationship with each oth...
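The rule quoted in the abstract above ("if stock X goes up 1, stock Y will go down 3") corresponds to a fitted slope of -3 between two sequences. The sketch below fits an ordinary least-squares line and reports how well it explains the data. It covers only the two-sequence regression that the abstract calls traditional, not the clustering method the paper goes on to ask about; the price series are made up and both sequences are assumed to be non-constant.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept.

    Returns (slope, intercept, r_squared). Assumes x and y have equal
    length and that neither sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical prices: every time X rises by 1, Y falls by 3.
stock_x = [10, 11, 12, 13, 14]
stock_y = [50, 47, 44, 41, 38]
slope, intercept, r_squared = fit_line(stock_x, stock_y)
print(slope, intercept, r_squared)
```

An `r_squared` close to 1 indicates a strong linear relation between the two sequences, which is the kind of pairwise measure a grouping of stocks by linear relationship would need.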
Accent in handwriting can be defined as the influence of a writer's native script on his/her writing style in another script. In this paper, we approach the problem of detecting the existence of accents in handwriting. We approach this problem using two sets of writers, those who can write only in English, and the other set being multilingual write...
The techniques and performance of text recognition systems and software have shown great improvement in recent years. OCR systems can now read any machine-printed document with good accuracy. However, the advancements are primarily for Latin scripts, and even for such scripts performance is limited in the case of handwritten documents. Little work has been done for...
The human face forms an important interface to convey nonverbal emotional information. Facial expressions reflect an individual's reactions to personal thoughts or external stimuli. These can act as valuable supplementary biometric information to automated person identification systems. In this study, video segments of individuals were FACS coded t...
Handwriting styles are constantly changing over time. We approach the
novel problem of estimating the approximate age of Historical
Handwritten Documents using Handwriting styles. This system will have
many applications in handwritten document processing engines where
specialized processing techniques can be applied based on the estimated
age of th...
State-of-the-art techniques for writer identification have been centered
primarily on enhancing the performance of the system for writer
identification. Machine learning algorithms have been used extensively
to improve the accuracy of such systems, assuming a sufficient amount of
data is available for training. Little attention has been paid to the
pro...
Pre-processing of scanned documents is a necessary first step in the process cycle of any document processing application. While pre-processing methods are generally language independent, the effectiveness of downstream OCR processes can often be improved by language/script specific adaptations, particularly in the case of non-Latin scripts such as...
We discuss smart environments that identify and track their occupants using unobtrusive recognition modalities such as face, gait, and voice. In order to alleviate the inherent limitations of recognition, we propose spatio-temporal reasoning techniques based upon an analysis of the occupant tracks. The key technical idea underlying our approach is...
We propose a novel generative learning framework for activity categorization. In order to obtain statistical insight into the underlying patterns of motions in activities, we propose a supervised dynamic, hierarchical Bayesian model which connects low-level visual features in videos with poses, motion patterns and classes of activity. Our proposed gen...
A method for determining the delivery point codes (DPCs) for handwritten addresses is described. Determining the DPC requires locating and recognizing address components (e.g., ZIP Code, street number, P.O. box number) and using multiple information sources to assign a five, nine or eleven digit barcode (i.e., the DPC) to an address. Our method use...
Biometric authentication systems are gaining importance in a world prone to security threats in every field. Hand geometry verification systems use geometric measurements of the hand to verify individuals. It is believed that the combination of different features of the hand is unique for a particular person. Different hand geometry...
This paper presents a novel set of image enhancement algorithms for binary images of poorly scanned real world page documents. Problems that are targeted by the methods described include large blobs or clutter noise, salt-and-pepper noise and detection and removal of non-text objects such as form lines or rule-lines. The algorithms described are sh...
We present a method for figure-panel (subfigure) label detection and recognition in multi-panel figures extracted from biomedical articles. Figures in biomedical articles often comprise several subfigures that are identified by superimposed panel labels ('A', 'B', ...) which are referenced in the figure caption and discussion in the article body. S...
The ability to identify people and answer questions about their whereabouts in a cyberphysical space is critical to many applications. Integrating recognition with spatiotemporal reasoning enhances the overall performance of information retrieval.
In some cases, the test person might be asked to provide another authentication attempt besides the first one so that combination of the two input templates might give the system more confidence if the person is genuine or impostor. Instead of simply combining the matching scores which are associated with a single person compared to the two input t...
If a biometric matching attempt does not succeed, the person might be asked to repeat the authentication attempt a second time. In such situations, the biometric system has acquired two test templates, and could construct a combined matching score, for example, by averaging two scores from the matches between these templates and the enrolled template. Th...
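The averaging baseline mentioned in the abstract above can be written in a few lines. The scores, the threshold, and the function names below are hypothetical; the sketch shows only the simple average, not the combination method the paper proposes.

```python
def fuse_average(scores):
    """Combine matching scores from repeated attempts by simple averaging."""
    return sum(scores) / len(scores)

def accept(scores, threshold):
    """Accept the claimed identity if the fused score reaches the threshold."""
    return fuse_average(scores) >= threshold

# Hypothetical similarity scores against one enrolled template.
first_attempt = 0.42   # below the threshold, so the user is asked to retry
second_attempt = 0.66
threshold = 0.5

print(accept([first_attempt], threshold))
print(accept([first_attempt, second_attempt], threshold))
```

With these made-up numbers the first attempt alone is rejected, while the average of both attempts clears the threshold.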
Instead of using matching scores between single enrolled and single test (or user) template, the matching scores related to all test templates or all enrolled ones can be considered to enhance the performance of biometric systems. The user-specific methods take into account the dependencies of matching scores assigned to different enrollees being m...
With the ever-increasing growth of the World Wide Web, there is an urgent need for an efficient information retrieval system that can search and retrieve handwritten documents when presented with user queries. However, unconstrained handwriting recognition remains a challenging task with inadequate performance thus proving to be a major hurdle in p...
Gesture sequences typically have a common set of distinct internal sub-structures which can be shared across the gestures.
In this paper, we propose a method using a generative model to learn these common actions which we refer to as sub-gestures,
and in turn perform recognition. Our proposed model learns sub-gestures by sharing parameters between...
Inspired by the behavioral scientific discoveries of Dr. Paul Ekman in relation to deceit detection, along with the television drama series Lie to Me, also based on Dr. Ekman's work, we use machine learning techniques to study the underlying phenomena expressed when a person tells a lie. We build an automated framework which detects deceit by m...
The convenience of search, both on the personal computer hard disk as well as on the web, is still limited mainly to machine printed text documents and images because of the poor accuracy of handwriting recognizers. The focus of research in this paper is the segmentation of handwritten text and machine printed text from annotated documents sometime...
Biomedical images are often referenced for clinical decision support (CDS), educational purposes, and research. They appear in specialized databases or in biomedical publications and are not meaningfully retrievable using primarily text-based retrieval systems. The task of automatically finding the images in an article that are most useful for the p...
Document binarization is one of the initial and critical steps for many document analysis systems. Nowadays, with the success and popularity of hand-held devices, large efforts are motivated to convert documents into digital format by using hand-held cameras. In this paper, we propose a Bayesian based maximum a posteriori (MAP) estimation algorithm...