Gianfranco Doretto

West Virginia University | WVU · Department of Computer Science & Electrical Engineering

PhD


75
Publications
16,986
Reads
4,355
Citations
Additional affiliations
August 2010 - present
West Virginia University
Position
  • Professor (Assistant)
Education
July 2000 - March 2005
University of California, Los Angeles
Field of study
  • Computer Science

Publications (75)
Article
Introduction: Shock index (SI) and delta shock index (∆SI) predict mortality and blood transfusion in trauma patients. This study aimed to evaluate the predictive ability of SI and ∆SI in a rural environment with prolonged transport times and transfers from critical access hospitals or level IV trauma centers. Methods: We completed a retrospecti...
Preprint
Full-text available
The study of signatures of aging in terms of genomic biomarkers can be uniquely helpful in understanding the mechanisms of aging and developing models to accurately predict the age. Prior studies have employed gene expression and DNA methylation data aiming at accurate prediction of age. In this line, we propose a new framework for human age estima...
Preprint
Full-text available
Plant species identification in the wild is a difficult problem in part due to the high variability of the input data, but also because of complications induced by the long-tail effects of the datasets distribution. Inspired by the most recent fine-grained visual classification approaches which are based on attention to mitigate the effects of data...
Article
Background The United States, and especially West Virginia, have a tremendous burden of coronary artery disease (CAD). Undiagnosed familial hypercholesterolemia (FH) is an important factor for CAD in the U.S. Identification of a CAD phenotype is an initial step to find families with FH. Objective We hypothesized that a CAD phenotype detection algor...
Article
Modern machine learning techniques (such as deep learning) offer immense opportunities in the field of human biological aging research. Aging is a complex process, experienced by all living organisms. While traditional machine learning and data mining approaches are still popular in aging research, they typically need feature engineering or feature...
Preprint
Full-text available
Autoencoder networks are unsupervised approaches aiming at combining generative and representational properties by learning simultaneously an encoder-generator map. Although studied extensively, the issues of whether they have the same generative power of GANs, or learn disentangled representations, have not been fully addressed. We introduce an au...
Chapter
Full-text available
Deep hashing approaches are widely applied to approximate nearest neighbor search for large-scale image retrieval. We propose Spherical Deep Supervised Hashing (SDSH), a new supervised deep hashing approach to learn compact binary codes. The goal of SDSH is to go beyond learning similarity preserving codes, by encouraging them to also be balanced a...
Conference Paper
Full-text available
Deep hashing approaches are widely applied to approximate nearest neighbor search for large-scale image retrieval. We propose Spherical Deep Supervised Hashing (SDSH), a new supervised deep hashing approach to learn compact binary codes. The goal of SDSH is to go beyond learning similarity preserving codes, by encouraging them to also be balanced a...
Preprint
Full-text available
Novelty detection is the problem of identifying whether a new data point is considered to be an inlier or an outlier. We assume that training data is available to describe only the inlier distribution. Recent approaches primarily leverage deep encoder-decoder network architectures to compute a reconstruction error that is used to either compute a n...
Article
Full-text available
The quest for deeper understanding of biological systems has driven the acquisition of increasingly larger multidimensional image datasets. Inspecting and manipulating data of this complexity is very challenging in traditional visualization systems. We developed syGlass, a software package capable of visualizing large scale volumetric data with ine...
Article
Full-text available
This work provides a framework for addressing the problem of supervised domain adaptation with deep models. The main idea is to exploit adversarial learning to learn an embedded subspace that simultaneously maximizes the confusion between two domains while semantically aligning their embedding. The supervised setting becomes attractive especially w...
Conference Paper
Full-text available
This work provides a unified framework for addressing the problem of visual supervised domain adaptation and generalization with deep models. The main idea is to exploit the Siamese architecture to learn an embedding subspace that is discriminative, and where mapped visual domains are semantically aligned and yet maximally separated. The supervised...
Article
Full-text available
This work provides a unified framework for addressing the problem of visual supervised domain adaptation and generalization with deep models. The main idea is to exploit the Siamese architecture to learn an embedding subspace that is discriminative, and where mapped visual domains are semantically aligned and yet maximally separated. The supervised...
Article
Full-text available
We present a virtual reality (VR) framework for the analysis of whole human body surface area. Usual methods for determining the whole body surface area (WBSA) are based on well known formulae, characterized by large errors when the subject is obese, or belongs to certain subgroups. For these situations, we believe that a computer vision approach c...
Data
Relationship between WBSA and VBSA. (Left) Using Virtual Random dataset at θ = 0°, ϕ = 0°. (Right) Using Virtual NHANES dataset at θ = 0°, ϕ = 0°. (TIF)
Data
Residuals analysis. (a) QQ plot of residual. (b) Residuals. (c) Residuals histogram. (d) Residuals scale plot. (TIF)
Data
Supporting Information to the manuscript. Complete results of the Virtual Environment and linear regression. (PDF)
Conference Paper
Full-text available
We address the unsupervised domain adaptation problem for visual recognition when an auxiliary data view is available during training. This is important because it allows improving the training of visual classifiers on a new target visual domain when paired additional source data is cheaply available. This is the case when we learn from a source of...
Article
We address the problem of detecting and recognizing online the occurrence of human interactions as seen by a network of multiple cameras. We represent interactions by forming temporal trajectories, coupling together the body motion of each individual and their proximity relationships with others, and also sound, whenever available. Such trajectorie...
Conference Paper
Full-text available
We explore the visual recognition problem from a main data view when an auxiliary data view is available during training. This is important because it allows improving the training of visual classifiers when paired additional data is cheaply available, and it improves the recognition from multi-view data when there is a missing view at testing time...
Chapter
Full-text available
Connectomics—the study of how neurons wire together in the brain—is at the forefront of modern neuroscience research. However, many connectomics studies are limited by the time and precision needed to correctly segment large volumes of electron microscopy (EM) image data. We present here a semi-automated segmentation pipeline using freely available...
Article
Full-text available
The high-frequency region of vowel signals (above the third formant or F3) has received little research attention. Recent evidence, however, has documented the perceptual utility of high-frequency information in the speech signal above the traditional frequency bandwidth known to contain important cues for speech and speaker recognition. The purpos...
Article
Full-text available
Gait analysis for therapy regimen prescription and monitoring requires patients to physically access clinics with specialized equipment. The timely availability of such infrastructure at the right frequency is especially important for small children. Besides being very costly, this is a challenge for many children living in rural areas. This is why...
Article
Full-text available
A mobile sensor based on fringe projection techniques is developed with the goal of acquiring face 3D and color with a smartphone device. The system consists of a portable pico-projector and an Android-based smartphone. The data acquisition, pattern generation, and reconstruction of the final 3D point cloud are all driven by the smartphone. We pres...
Article
This chapter describes how the class of linear dynamical system (LDS) models can be used for representing and analyzing video signals. Given a sequence of images, it is shown under what conditions it can be modeled with an LDS, and how it is possible to estimate the model parameters (system identification). Video data become LDSs, and establishing...
Conference Paper
Full-text available
We address the problem of online temporal segmentation and recognition of human interactions in video sequences. The complexity of the high-dimensional data variability representing interactions is handled by combining kernel methods with linear models, giving rise to kernel regression and kernel state space models. By exploiting the geometry of li...
Conference Paper
Full-text available
The high degree of complexity in cellular and circuit structure of the brain poses challenges for understanding tissue organization, extrapolated from large serial sections electron microscopy (ssEM) image data. We advocate the use of 3D immersive virtual reality (IVR) to facilitate the human analysis of such data. We have developed and evaluated t...
Conference Paper
Full-text available
In this paper we model binary people interactions by forming temporal interaction trajectories, under the form of a time series, coupling together the body motion of each individual as well as their proximity relationships. Such trajectories are modeled with a non-linear dynamical system (NLDS). We develop a framework that entails the use of so-ca...
Patent
Full-text available
A device and method for processing an image to create appearance and shape labeled images of a person or object captured within the image. The appearance and shape labeled images are unique properties of the person or object and can be used to re-identify the person or object in subsequent images. The appearance labeled image is an aggregate of pre...
Conference Paper
Full-text available
This paper describes the design and the performance of a virtual simulation environment to evaluate a machine vision based pose estimation system used for the general problem of satellite servicing. The vision system features a wide angle monocular camera to track the interface ring of a non-cooperative satellite using ellipse extraction. The effec...
Patent
Full-text available
A device and method for processing an image to create appearance and shape labeled images of a person or object captured within the image. The appearance and shape labeled images are unique properties of the person or object and can be used to re-identify the person or object in subsequent images. The appearance labeled image is an aggregate of pre...
Patent
Aspects of the disclosure provide a method for crowd segmentation that can globally optimize crowd segmentation of an input image based on local information of the input image. The method can include receiving an input image of a site, initializing a plurality of hypotheses based on the input image, dividing the input image into a plurality of patc...
Conference Paper
Full-text available
Recent successes in the use of sparse coding for many computer vision applications have triggered the attention towards the problem of how an over-complete dictionary should be learned from data. This is because the quality of a dictionary greatly affects performance in many respects, including computational. While so far the focus has been on lear...
Article
In this work we present M-VIVIE, a system for video sequence indexing based on the identity of appearing subjects, which exploits a multithread architecture for fast processing. The system is composed of several conceptual component modules, each performing a specific kind of processing. Each module can possibly be substituted with a different one per...
Article
Full-text available
We present a novel method for jointly performing recognition of complex events and linking fragmented tracks into coherent, long-duration tracks. Many event recognition methods require highly accurate tracking, and may fail when tracks corresponding to event actors are fragmented or partially missing. However, these conditions occur frequently from...
Conference Paper
Full-text available
Recognizing the presence of object classes in an image, or image classification, has become an increasingly important topic of interest. Equally important, however, is also the capability to locate these object classes in the image. We consider in this paper an approach to these two related problems with the primary goal of minimizing the training...
Article
Full-text available
Recent advances in visual tracking methods allow following a given object or individual in the presence of significant clutter or partial occlusions in a single or a set of overlapping camera views. The question of when person detections in different views or at different time instants can be linked to the same individual is of fundamental importance t...
Chapter
Full-text available
For troop and military installation protection, modern computer vision methods must be harnessed to enable a comprehensive approach to contextual awareness. In this chapter we present a collection of intelligent video technologies currently under development at the General Electric Global Research Center, which can be applied to this challenging pr...
Conference Paper
Full-text available
Transfer learning allows leveraging the knowledge of source domains, available a priori, to help training a classifier for a target domain, where the available data is scarce. The effectiveness of the transfer is affected by the relationship between source and target. Rather than improving the learning, brute force leveraging of a source poorly rel...
Conference Paper
Full-text available
This paper presents region moments, a class of appearance descriptors based on image moments applied to a pool of image features. A careful design of the moments and the image features makes the descriptors scale and rotation invariant, and therefore suitable for vehicle detection from aerial video, where targets appear at different scales and ori...
Conference Paper
Full-text available
Intelligent video in urban settings can be challenging due to the presence of crowds, clutter, poor camera placement and continuously changing light conditions. The surveillance of sports venues is particularly difficult, because thousands of people can enter or exit a venue in short periods of time. This paper presents a case study of successfully mo...
Conference Paper
Full-text available
In this work we propose a dynamic scene model to provide information about the presence of salient motion in the scene, and that could be used for focusing the attention of a pan/tilt/zoom camera, or for background modeling purposes. Rather than proposing a set of saliency detectors, we define what we mean by salient motion, and propose a precise m...
Conference Paper
Face alignment seeks to deform a face model to match it with the features of the image of a face by optimizing an appropriate cost function. We propose a new face model that is aligned by maximizing a score function, which we learn from training data, and that we impose to be concave. We show that this problem can be reduced to learning a classifie...
Conference Paper
Full-text available
This paper presents a unified approach to crowd segmentation. A global solution is generated using an Expectation Maximization framework. Initially, a head and shoulder detector is used to nominate an exhaustive set of person locations and these form the person hypotheses. The image is then partitioned into a grid of small patches which are each as...
Article
This paper presents an overview of Intelligent Video work currently under development at the GE Global Research Center and other research institutes. The image formation process is discussed in terms of illumination, methods for automatic camera calibration and lessons learned from machine vision. A variety of approaches for person detection are pr...
Article
Full-text available
In modeling complex visual phenomena one can employ rich models that characterize the global statistics of images, or choose simple classes of models to represent the local statistics of a spatiotemporal segment, together with the partition of the data into such segments. Each segment could be characterized by certain statistical regularity proper...
Conference Paper
Full-text available
In this work we develop appearance models for computing the similarity between image regions containing deformable objects of a given class in realtime. We introduce the concept of shape and appearance context. The main idea is to model the spatial distribution of the appearance relative to each of the object parts. Estimating the model entails...
Article
Full-text available
We propose a model of the joint variation of shape and appearance of portions of an image sequence. The model is conditionally linear, and can be thought of as an extension of active appearance models to exploit the temporal correlation of adjacent image frames. Inference of the model parameters can be performed efficiently using established numeri...
Conference Paper
Full-text available
Complete and accurate video tracking is very difficult to achieve in practice due to long occlusions, traffic clutter, shadows and appearance changes. In this paper, we study the feasibility of event recognition when object tracks are fragmented. By changing the lock score threshold controlling track termination, different levels of track fragmenta...
Conference Paper
Full-text available
We present a novel method for jointly performing recognition of complex events and linking fragmented tracks into coherent, long-duration tracks. Many event recognition methods require highly accurate tracking, and may fail when tracks corresponding to event actors are fragmented or partially missing. However, these conditions occur frequently from...
Article
Full-text available
We present a novel approach to moving object detection in video taken from a translating, rotating and zooming sensor, with a focus on detecting very small objects in as few frames as possible. The primary innovation is to incorporate automatically computed scene understanding of the video directly into the motion segmentation process. Scene unde...
Article
Full-text available
Dynamic scenes with arbitrary radiometry and geometry present a challenge in that a physical model of their motion, shape, and reflectance cannot be inferred. Therefore, the issue of representation becomes crucial, and while there is no right or wrong representation, the task at hand should guide the modeling process. For instance, if the task is t...
Conference Paper
Full-text available
In this work, we propose a model for video scenes that contains temporal variability in shape and appearance. We propose a conditionally linear model akin to a dynamic extension of active appearance models. We formulate the problem variationally, and propose a framework where a model complexity cost dictates the "modeling responsibility" of each of...
Conference Paper
Full-text available
We address the problem of modeling the spatial and temporal second-order statistics of video sequences that exhibit both spatial and temporal regularity, intended in a statistical sense. We model such sequences as dynamic multiscale autoregressive models, and introduce an efficient algorithm to learn the model parameters. We then show how the model...
Article
Full-text available
We address the problem of segmenting a sequence of images of natural scenes into disjoint regions that are characterized by constant spatio-temporal statistics. We model the spatio-temporal dynamics in each region by Gauss-Markov models, and infer the model parameters as well as the boundary of the regions in a variational optimization framework. N...
Conference Paper
Full-text available
We present a simple and efficient algorithm for modifying the temporal behavior of "dynamic textures," i.e. sequences of images that exhibit some form of temporal regularity, such as flowing water, steam, smoke, flames, foliage of trees in wind. The main goal is to design algorithms for synthesizing and editing realistic sequences of images of dyna...
Article
Full-text available
We present a simple and efficient algorithm for modifying the temporal behavior of "dynamic textures," i.e. sequences of images that exhibit some form of temporal regularity, such as flowing water, steam, smoke, flames, foliage of trees in wind.
Article
Full-text available
Dynamic textures are sequences of images of moving scenes that exhibit certain stationarity properties in time; these include sea-waves, smoke, foliage, whirlwind etc. We present a characterization of dynamic textures that poses the problems of modeling, learning, recognizing and synthesizing dynamic textures on a firm analytical footing. We borrow...
Article
Dynamic textures are sequences of images of moving scenes that exhibit certain stationarity properties in time; these include sea-waves, smoke, foliage, whirlwind etc. We present a characterization of dynamic textures that poses the problems of modeling, learning, recognizing and synthesizing dynamic textures on a firm analytical footing. We borrow...