Figure 2 - uploaded by Gerhard Roth
A sample of a single resonant feature, C_a^r, composed of 2 hypercomplex features (C_a^h and C_b^h) and 6 complex features (C_a^{c0} ... C_a^{c2} and C_b^{c0} ... C_b^{c2}, where C_a^{c2} = C_b^{c2}). The grid represents a simple cell field arranged in cortical columns, on top of which complex cell pooling occurs. Red areas represent active hypercomplex cells, and orange areas represent active simple cells that have been pooled by complex cells (represented by lines). The primary complex cells, C_a^{c0} and C_b^{c0}, are associated with the largest angles in their respective hypercomplex cells. The cortical column is used to arrange complex cells in a clockwise manner for rotation invariance and to normalize complex cell length for scale invariance. The grid is represented as a texture on the GPU.
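The pooling described in the caption can be sketched in a few lines. This is a hypothetical illustration only: the grid values, the cell positions, and the max-pooling rule are assumptions for exposition, not the paper's implementation.

```python
# Hypothetical sketch: a complex cell pools the responses of the active
# simple cells it covers on the simple-cell field (max pooling assumed).

def pool_complex_cell(field, cells):
    """Pool simple-cell responses at the given (row, col) positions."""
    return max(field[r][c] for (r, c) in cells)

# 4x4 simple-cell field with illustrative responses in [0, 1]; one
# complex cell pools a short run of cells along its preferred orientation.
field = [
    [0.1, 0.2, 0.0, 0.0],
    [0.0, 0.7, 0.9, 0.1],
    [0.0, 0.1, 0.8, 0.0],
    [0.0, 0.0, 0.1, 0.2],
]
print(pool_complex_cell(field, [(1, 1), (1, 2), (2, 2)]))  # 0.9
```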
Source publication
We present a biologically motivated classifier and feature descriptors that are designed for execution on single instruction, multiple data (SIMD) hardware and are applied to high-speed multiclass object recognition. Our feature extractor uses a cellular tuning approach to select the optimal Gabor filters to process a given input, followed by the computation...
Contexts in source publication
Context 1
... orientation of the traversal path is α_i and reflects the orientation of the underlying object that is activating the complex cells. The orientation difference between any two complex cell features is α_ij = |α_i − α_j|; examples of this can be seen in Figure 2. Rotation invariance is achieved by determining the largest α_ij value and setting C_a^{ci} as the primary complex cell. ...
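The rotation-invariance step in this context can be sketched as follows. This is an illustrative reading of the excerpt, not the authors' code; the orientation values are made up for the example.

```python
# Illustrative sketch: compute alpha_ij = |alpha_i - alpha_j| for every
# pair of complex-cell path orientations and return the pair with the
# largest difference; the corresponding cell becomes the primary complex cell.

def largest_orientation_difference(alphas):
    """Return (i, j, alpha_ij) for the pair with the largest |alpha_i - alpha_j|."""
    best = (0, 0, 0.0)
    for i in range(len(alphas)):
        for j in range(i + 1, len(alphas)):
            d = abs(alphas[i] - alphas[j])
            if d > best[2]:
                best = (i, j, d)
    return best

# Orientations (in degrees) of three complex-cell traversal paths.
print(largest_orientation_difference([10.0, 35.0, 80.0]))  # (0, 2, 70.0)
```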
Similar publications
In this paper we propose a supervised object recognition method using new global features. The proposed technique, based on the Fourier transform evaluated on a regular hexagonal grid, allows extracting descriptors which are invariant to geometric transformations (rotations, scale changes, translations...). The obtained descriptors are next used...
Citations
... Parks et al. [19] presented a CUDA implementation of a saliency system for detection and the HMAX model for recognition; both steps run about 10 times faster than the original algorithms. Woodbeck et al. [20] presented a GPU implementation of a bio-inspired model, similar to the HMAX model, using the OpenGL framework that achieves speedups of up to three orders of magnitude. Note that these last three works are based on the HMAX model, a region-based visual feature system, which is similar to our proposed model called the AVC algorithm. ...
The need for highly accurate classification systems capable of working in real-time applications has increased in recent years. Nowadays, several computer vision tasks apply a classification step as part of larger systems, hence requiring classification models that work at a fast pace. This has made real-time object classification an interesting problem for several research communities. In this paper, we propose to accelerate a bio-inspired model for object classification, which has given very good results when compared with other state-of-the-art proposals, using the compute unified device architecture (CUDA) and exploiting the computational capabilities of graphics processing units. The classification model that is used is called the artificial visual cortex, a novel bio-inspired approach for image classification. In this work, we show that through an implementation of this model in the CUDA framework it is possible to achieve real-time functionality. As a result, the proposed system is able to process images on average up to 90 times faster than the original system.
... Many variations of the above underlying ideas have been proposed, including various learning strategies at higher layers [145,147], wavelet-based filters [71], different feature sparsification strategies [73,110,147] and optimizations of filter parameters [107,147]. Yet another body of research advocates that the hierarchical processing (termed Filter → Rectify → Filter) that takes place in the visual cortex deals progressively with higher-order image structures [5,48,108]. It is therefore advocated that the same set of kernels present at the first layer (i.e. ...
This document will review the most prominent proposals using multilayer convolutional architectures. Importantly, the various components of a typical convolutional network will be discussed through a review of different approaches that base their design decisions on biological findings and/or sound theoretical bases. In addition, the different attempts at understanding ConvNets via visualizations and empirical studies will be reviewed. The ultimate goal is to shed light on the role of each layer of processing involved in a ConvNet architecture, distill what we currently understand about ConvNets and highlight critical open problems.
... In the literature, several simulation frameworks for neural architectures on GPUs are presented. The implementation on graphics cards of biologically motivated classifiers and feature descriptors, which model the "What" pathway of the visual cortex, is described in Woodbeck et al. (2008), Brumby et al. (2010) and Nere et al. (2011). These works demonstrate the efficacy of GPUs in simulating such kinds of cortical networks. ...
The intrinsic parallelism of visual neural architectures based on distributed hierarchical layers is well suited to be implemented on the multi-core architectures of modern graphics cards. We propose design strategies that allow us to optimally take advantage of such parallelism, in order to efficiently map the hierarchy of layers and the canonical neural computations onto the GPU. Specifically, the advantages of a cortical map-like representation of the data are exploited. Moreover, a GPU implementation of a novel neural architecture for the computation of binocular disparity from stereo image pairs, based on populations of binocular energy neurons, is presented. The implemented neural model achieves good performance in terms of reliability of the disparity estimates and near real-time execution speed, thus demonstrating the effectiveness of the devised design strategies. The proposed approach is valid in general, since the neural building blocks we implemented are a common basis for the modeling of visual neural functionalities.
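A single unit of the binocular energy population mentioned above can be sketched as follows. This is a generic phase-shift energy model, a common textbook formulation; the 1-D Gabor receptive field, the signal, and all parameters are illustrative assumptions, not the paper's implementation.

```python
import math

def gabor(x, freq, phase, sigma=2.0):
    """1-D Gabor: Gaussian-windowed cosine (illustrative receptive field)."""
    return math.exp(-x * x / (2 * sigma * sigma)) * math.cos(2 * math.pi * freq * x + phase)

def energy_response(left, right, freq, dphase, sigma=2.0):
    """Binocular energy: sum of squared binocular responses of a quadrature
    filter pair; dphase is the interocular phase shift that encodes the
    unit's preferred disparity."""
    n = len(left)
    xs = [i - n // 2 for i in range(n)]
    resp = 0.0
    for phase in (0.0, math.pi / 2):  # even and odd quadrature filters
        l = sum(s * gabor(x, freq, phase, sigma) for x, s in zip(xs, left))
        r = sum(s * gabor(x, freq, phase + dphase, sigma) for x, s in zip(xs, right))
        resp += (l + r) ** 2
    return resp

# Identical left/right patches: the unit with zero interocular phase shift
# responds far more strongly than the anti-phase unit.
sig = [0.0, 0.5, 1.0, 0.5, 0.0]
tuned = energy_response(sig, sig, 0.25, 0.0)
anti = energy_response(sig, sig, 0.25, math.pi)
```

A population of such units, each with a different `dphase`, yields a disparity estimate by reading out the most active unit; on the GPU, each unit maps naturally to an independent thread.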
... There are some other object recognition models that have used the Adaptive Resonance Theory. For example, Woodbeck et al. [23] proposed a biologically plausible hierarchical structure which was an extension of the sparse localized features (SLF) suggested by Mutch et al. [24]. One of their contributions was that, instead of using support vector machines (SVM) for classification, they used Fuzzy ARTMAP as a biologically plausible multiclass classifier [25] which is based on the Adaptive Resonance Theory (ART). ...
The brain mechanism of extracting visual features for recognizing various objects has consistently been a controversial issue in computational models of object recognition. To extract visual features, we introduce a new, biologically motivated model for facial categorization, which is an extension of the Hubel and Wiesel simple-to-complex cell hierarchy. To address the synaptic stability versus plasticity dilemma, we apply the Adaptive Resonance Theory (ART) for extracting informative intermediate level visual features during the learning process, which also makes this model stable against the destruction of previously learned information while learning new information. Such a mechanism has been suggested to be embedded within known laminar microcircuits of the cerebral cortex. To reveal the strength of the proposed visual feature learning mechanism, we show that when we use this mechanism in the training process of a well-known biologically motivated object recognition model (the HMAX model), it performs better than the HMAX model in face/non-face classification tasks. Furthermore, we demonstrate that our proposed mechanism is capable of following similar trends in performance as humans in a psychophysical experiment using a face versus non-face rapid categorization task.
This paper introduces a mathematical formalism for the Spatial Pooler (SP) of Hierarchical Temporal Memory (HTM), with special consideration for its hardware implementation. The performance of an HTM network and its ability to learn and adjust to a problem at hand is governed by a large set of parameters. Most of the parameters are codependent, which makes creating efficient HTM-based solutions challenging: it requires profound knowledge of the settings and their impact on the performance of the system. Consequently, this paper introduces a set of formulas intended to facilitate the design process by enhancing the tedious trial-and-error method with a tool for choosing initial parameters that enable quick learning convergence. This is especially important in hardware implementations, which are constrained by the limited resources of a platform.
In this paper, a distributed approach is developed for achieving large-scale classifier training and image classification. First, a visual concept network is constructed for determining the inter-related learning tasks automatically; e.g., the inter-related classifiers for the visually similar object classes in the same group should be trained in parallel by using multiple machines to enhance their discrimination power. Second, an MPI-based distributed computing approach is constructed using a master–slave mode to address two critical issues, the huge computational cost and the huge storage/memory cost, of large-scale classifier training and image classification. In addition, an indexing-based storage method is developed for reducing the sizes of intermediate SVM models and avoiding the repeated computation of SVs (support vectors) in the test stage for image classification. Our experiments have also provided very positive results on the 2010 ImageNet database for the Large Scale Visual Recognition Challenge.
The spread of graphics processing unit (GPU) computing paved the way to the possibility of reaching high computing performance in the simulation of complex biological systems. In this work, we develop a very efficient GPU-accelerated neural library, which can be employed in real-world contexts. Such a library provides the neural functionalities that are the basis of a wide range of bio-inspired models, and in particular, we show its efficacy in implementing a cortical-like architecture for visual feature coding and estimation. In order to fully exploit the intrinsic parallelism of such neural architectures and to manage the huge amount of data that characterizes the internal representation of distributed neural models, we devise an effective algorithmic solution and an efficient data structure. In particular, we exploit both data parallelism and task parallelism, with the aim of optimally taking advantage of the computational capabilities of modern graphics cards. Moreover, we assess the performance of two different development frameworks, both supplying a wide range of basic signal processing GPU-accelerated functions. A systematic analysis, aiming at comparing different algorithmic solutions, shows the best data structure and parallelization scheme to compute features from a distributed population of neural units.
In this paper, we present real-time, biologically motivated 3D motion classifier cells, integrating the depth information generated from a stereo input, implemented in an active vision system. The proposed approach is able to accurately detect and estimate multiple interfering 3D complex motions in the absence of predefined spatial coherence. Moreover, the system has the ability to examine the response of input 3D motion vector fields to certain 3D motion patterns (3D motion classifier cells), such as motion in the Z direction representing movement towards the system, which is very important for overcoming typical problems in autonomous mobile robotic vision such as collision detection and inhibition of the ego-motion defects of a moving camera head. The output of the algorithm is part of a multi-object segmentation approach implemented in an active vision system.