Edgardo Molina's research while affiliated with City College of New York and other places

Publications (13)

Chapter
Registration of video images shares many steps with registration of still images. These steps include feature selection/detection, feature correspondence/matching, and image alignment/registration. A typical video registration method has the following three components: motion modeling, image alignment, and image composition. This chapter first dis...
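The alignment component mentioned in this abstract is commonly implemented by estimating a planar homography from matched feature points. As a hedged illustration (my own minimal NumPy sketch of the standard Direct Linear Transform, not code from the chapter):

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform: 3x3 H mapping src points onto dst points.

    src, dst: (N, 2) arrays of matched feature locations, N >= 4.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the DLT system A h = 0.
        A.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        A.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    # H (up to scale) is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(A))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map (N, 2) points through H with perspective division."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]
```

In a full pipeline the correspondences would come from a feature detector/matcher and the estimate would be wrapped in RANSAC; this sketch shows only the alignment step on clean matches.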
Article
Capturing aerial imagery at high resolutions often leads to very low frame rate video streams, well under Full Motion Video standards, due to bandwidth, storage, and cost constraints. Low frame rates make registration difficult when an aircraft is moving at high speeds or when GPS data contains large errors or is lost entirely. We present a method that takes ad...
Article
Purpose ‐ The purpose of this paper is to propose a local orientation and navigation framework based on visual features that provide location recognition, context augmentation, and viewer localization information to a blind or low-vision user. Design/methodology/approach ‐ The authors consider three types of "visual noun" features: signage, visual-...
Conference Paper
We propose a local orientation and navigation framework based on visual features that provide location recognition, context augmentation, and viewer localization information to a human user. Mosaics are used to map local areas to ease user navigation through streets and hallways, by providing a wider field of view (FOV) and the inclusion of more de...
Article
In both military and civilian applications, abundant data from diverse sources captured on airborne platforms are often available for a region attracting interest. Since the data often includes motion imagery streams collected from multiple platforms flying at different altitudes, with sensors of different field of views (FOVs), resolutions, frame...
Conference Paper
In this paper we propose a fast method for constructing multi-view stereo panoramas using a layering approach. Constructing panoramas requires accurate camera pose estimation and often an image blending or interpolation method to generate seamless results. We use a registration error correction method that provides globally corrected a...
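The blending step this abstract refers to is often realized with feathered (weighted) averaging in the overlap region. A toy NumPy sketch of linear feathering for two horizontally adjacent strips (my illustration under simplified assumptions, not the paper's layering method):

```python
import numpy as np

def feather_blend(left, right, overlap):
    """Blend two horizontally adjacent image strips sharing `overlap` columns.

    left, right: (H, W) float arrays; the last `overlap` columns of `left`
    image the same scene as the first `overlap` columns of `right`.
    Returns the composite of width 2*W - overlap.
    """
    h, w = left.shape
    out = np.zeros((h, 2 * w - overlap))
    out[:, :w - overlap] = left[:, :w - overlap]   # left-only region
    out[:, w:] = right[:, overlap:]                # right-only region
    # Linear ramp: weight slides from 1 (pure left) to 0 (pure right),
    # hiding the seam that a hard cut would leave.
    alpha = np.linspace(1.0, 0.0, overlap)
    out[:, w - overlap:w] = alpha * left[:, w - overlap:] + (1 - alpha) * right[:, :overlap]
    return out
```

Real panorama pipelines blend along registered (warped) geometry and often use multi-band rather than linear weights; the ramp above is the simplest seamless-composition primitive.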
Conference Paper
Full-text available
Both panoramic and multimodal imaging are becoming increasingly desirable in applications such as wide-area surveillance, robotics, mapping, and entertainment. In this paper, we build a precise rotating platform to generate co-registered multimodal and multiview 3D panoramic images. The platform consists of a pair of color and thermal cameras that...
Article
Circular aerial video provides a persistent view over a scene and generates a large amount of imagery, much of which is redundant. The interesting features of the scene are the 3D structural data, moving objects, and scenery changes. Mosaic-based scene representations work well in detecting and modeling these features while greatly reducing the amo...
Article
3D models of large-scale scenes available on the Internet today are largely manually created. Thus it takes a long time to create them for cities and to update them as already-modeled cities continue to change. Multiple parallel-perspective mosaics can be generated from video automatically and more efficiently and can be used to reconst...
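The parallel-perspective mosaics mentioned here can be illustrated with a pushbroom-style construction: as the camera translates, the same narrow column strip is taken from each frame and the strips are stacked side by side. This toy sketch (my own, assuming an idealized one-pixel horizontal motion per frame, not the paper's method) shows the idea:

```python
import numpy as np

def strip_mosaic(frames, strip_col, strip_width=1):
    """Stack the same column strip from successive frames side by side.

    frames: iterable of (H, W) arrays from a camera translating one pixel
    per frame. The result is perspective across rows but approximately
    parallel projection along the direction of motion, which is what makes
    such mosaics convenient for reconstruction.
    """
    strips = [f[:, strip_col:strip_col + strip_width] for f in frames]
    return np.hstack(strips)
```

With two strip columns offset left and right of center, the same construction yields a stereo pair of parallel-perspective mosaics.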
Conference Paper
Full-text available
Recent improvements in laser vibrometry and day/night infrared (IR) and electro-optical (EO) imaging technology have created the opportunity to build a long-range multimodal surveillance system. This multimodal capability would greatly improve security force performance through clandestine listening of targets that are probing or penetrating a per...

Citations

... Mosaics can be used to map local areas to ease user navigation through streets and hallways, by providing a wider field of view (FOV) and the inclusion of more decisive features. Besides audio support, visual support can be provided as signage, visual-text, and visual-icons for augmenting environments (Molina et al., 2012). ...
... These technological advancements have led many researchers to investigate the possibility of creating a novel approach to SLAM that combines the scale information of 3D depth sensing with the strengths of visual features to create dense 3D environment representations (Endres et al., 2012). RGB-D sensors are also used to detect "visual noun" features: signage, visual text, and visual icons that are proposed as a low-cost method for augmenting environments or for navigation assistance (Molina et al., 2013). On the other hand, the use of RGB-D cameras increases the computational load, so more computationally efficient algorithms have to be introduced in the SLAM pipeline (Lee and Medioni, 2015). ...
... As demonstrated by the huge amount of research work on the topic, image mosaicing is an established approach for autonomous or remote mapping and real-time visualization. The technique has indeed been utilized as an aid for robot path planning, navigation, and mapping on land (Kelly, 2000; Lucas et al., 2010; Wang et al., 2019) and underwater (Eustice, 2005; Gracias et al., 2003); for environmental monitoring through georeferenced video registration without a digital elevation model (DEM) from unmanned aerial vehicles (Zhu et al., 2005); with video acquired from large-format aerial vehicles (Molina and Zhu, 2014); for surveillance (Yang et al., 2015) and tracking of moving objects (Linger and Goshtasby, 2014); for constructing an overview of a target area with different sensors (RGB and/or thermal cameras), with and without metadata from the GPS and inertial navigation system (INS), from small-scale UAVs (Yahyanejad, 2013); and for supporting image interpretation and navigation in medical applications with microscopes (Loewke et al., 2010). In this work we describe a novel optimized approach to image mosaicing primarily tailored for the observation and real-time mapping of the seabed. ...
... (5) C_i = 2 if the region is a static patch with reliable plane parameters (see (6)); C_i = 1 if the region is a moving target (and therefore has motion parameters m_i, see (7)); C_i = 0 otherwise (unreliable, possibly occluded regions). Therefore the total data amount (without counting C_i) is N_color + N_boundary + N_neighbor + N_structure + N_motion = 3N + (8N + 3G/8) + 4JN + 4·4N + 4M·N_m = (27 + 4J)N + 3G/8 + 4MN_m bytes (24), where each of the motion and structure parameters needs 4 bytes. In the above equation, N_m is the number of moving regions (which is much smaller than the total region number N). ...
... The multimodal sensor-based system is an example of the hardware-based method. It detects and identifies the objects of interest by using pyroelectric infrared (PIR) sensors, audio sensors, or Doppler sensors [4][5][6]. Most of the software-based methods adopt digital image processing and computer vision [1,, which have the advantages of cheaper camera systems and more detailed object information than the hardware-based ones. ...
... A common technique used to develop a stereo panorama is to position two standard cameras either side by side or one above the other (Gledhill, 2003). Examples of such systems are those developed by Qu et al. (2010), Lin et al. (2008), Varshosaz and Amini (2007), Jiang et al. (2006), and Huang and Hung (1997). In such systems, two single panoramas are produced and combined. ...