Figure - available from: Biological Cybernetics
Measures of the second-stage network as it trained with a 120 FPS video of a red car moving right to left through the scene. a A sample $500 \times 500$ video frame at 1.8 s after the beginning of second-stage training. b Time evolution of inhibitory weights from column 1 of the weight matrix $\boldsymbol{T}$ as the network trained, representing inhibition from the “left” neuron to all other neurons. Translucent gray boxes in this panel and the next indicate when the car was entering the frame from approximately 0.1–0.6 s after the start of training, and when the car was leaving the frame at approximately 2.8–3.2 s. Note that most of the changes in connection weights occurred as the car entered and left the scene. c Network outputs over the 3.4 s duration of training with this stimulus. A positive leftward motion output clearly dominates early in training, and becomes the largest negative output as the car leaves. d Final state of the weight matrix $\boldsymbol{T}$ after 3.4 s of training. Brighter colors represent larger values, and darker colors smaller values. The maximum value in this matrix is 0.53. e The weight matrix $\boldsymbol{T}$ after normalization to its maximum value and removal of weights less than 10% of the maximum. Only one column has nonzero weights, representing a single object.
Panel titles: a Sample video frame; b Inhibitory weights from neuron 1 (left); c Network outputs during learning; d Raw weight matrix; e Thresholded weight matrix (color figure online)
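The post-processing described for panel e (normalize the weight matrix to its maximum, then remove weights below 10% of that maximum) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names `threshold_weights` and `count_active_columns` and the toy 4×4 matrix are hypothetical. Only the 10% cutoff and the maximum value of 0.53 come from the caption. Counting nonzero columns as objects follows the caption's reading of a single nonzero column as a single object and is an assumption about the general case.

```python
import numpy as np


def threshold_weights(T, rel_threshold=0.10):
    """Normalize T to its maximum and zero out entries below rel_threshold.

    rel_threshold=0.10 mirrors the 10% cutoff stated in the caption.
    """
    T = np.asarray(T, dtype=float)
    max_val = T.max()
    if max_val <= 0:
        # No positive weights learned yet: nothing survives thresholding.
        return np.zeros_like(T)
    normalized = T / max_val
    normalized[normalized < rel_threshold] = 0.0
    return normalized


def count_active_columns(T_thresholded):
    """Count columns that keep at least one nonzero weight.

    Assumption: each surviving column corresponds to one object, as the
    caption suggests for the single-object case.
    """
    return int(np.count_nonzero(T_thresholded.any(axis=0)))


# Hypothetical toy matrix: column 0 plays the role of the "left" neuron and
# holds the only large weights; its maximum matches the 0.53 in the caption.
T_raw = np.array([
    [0.00, 0.02, 0.01, 0.00],
    [0.53, 0.00, 0.03, 0.01],
    [0.40, 0.01, 0.00, 0.02],
    [0.30, 0.04, 0.02, 0.00],
])

T_thr = threshold_weights(T_raw)
print(T_thr)
print(count_active_columns(T_thr))  # 1 for this toy matrix
```

A relative threshold keeps the cutoff independent of how far training has progressed, since the absolute scale of the weights grows over the 3.4 s run.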
Source publication
Visual binding is the process of associating the responses of visual interneurons in different visual submodalities all of which are responding to the same object in the visual field. Recently identified neuropils in the insect brain termed optic glomeruli reside just downstream of the optic lobes and have an internal organization that could suppor...
Citations
We have developed a neural network model capable of performing visual binding inspired by neuronal circuitry in the optic glomeruli of flies: a brain area that lies just downstream of the optic lobes where early visual processing is performed. This visual binding model is able to detect objects in dynamic image sequences and bind together their respective characteristic visual features—such as color, motion, and orientation—by taking advantage of their common temporal fluctuations. Visual binding is represented in the form of an inhibitory weight matrix which learns over time which features originate from a given visual object. In the present work, we show that information represented implicitly in this weight matrix can be used to explicitly count the number of objects present in the visual image, to enumerate their specific visual characteristics, and even to create an enhanced image in which one particular object is emphasized over others, thus implementing a simple form of visual attention. Further, we present a detailed analysis which reveals the function and theoretical limitations of the visual binding network and in this context describe a novel network learning rule which is optimized for visual binding.