Kuntal Ghosh

Indian Statistical Institute, Baranagore, Bengal, India

Publications (40) · 21.75 Total impact

  • Ashish Bakshi, Kuntal Ghosh
    ABSTRACT: Mach bands are the pronounced light and dark bands visible where a luminance plateau meets a ramp, as in a penumbra. A great deal of effort has been devoted to studying them in order to understand the underlying neural circuitry. A number of theoretical models, linear and non-linear, have consequently been proposed, starting from the seminal studies of Ernst Mach himself. In this work we demonstrate why no linear model of visual perception can explain the Mach band illusion, although many such attempts have been made, from that of Mach to some recent ones. Using the same approach, we also systematically demonstrate why Mach bands are weak or nonexistent at step changes of intensity. A new aspect, viz. the scaling properties of the widths of Mach bands, has been studied to provide a unified approach to both these problems in vision.
    Perception and Machine Intelligence - First Indo-Japan Conference, PerMIn 2012, Kolkata, India, January 12-13, 2012. Proceedings; 01/2012
  • Ashish Bakshi, Kuntal Ghosh
    ABSTRACT: In this paper we present some demonstrations concerning the width of Mach bands and, on that basis, hypothesize certain relations. We show that it is the variation in the width of Mach bands in relation to luminance gradients that is responsible for Mach bands being strong for luminance ramps and weak or vanishing for luminance steps. We present the results of experiments that we carried out using some of these demonstrations to support our claims.
    Perception 01/2012; 41(11):1403-8. DOI:10.1068/p7358 · 1.11 Impact Factor
  • Kuntal Ghosh
    ABSTRACT: It is well known that the perceived brightness of any surface depends on the brightness of the surfaces that surround it. This phenomenon is termed brightness induction. Isotropic arrays of multi-scale DoG (Difference of Gaussians) as well as cortical Oriented DoG (ODOG) functions and extensions thereof, like the Frequency-specific Locally Normalized ODOG (FLODOG) functions, have been employed to predict the direction of brightness induction in many brightness perception effects. But the neural basis of such spatial filters is seldom obvious. For instance, the visual information from retinal ganglion cells to such spatial filters, which have generally been speculated to appear at the early stage of cortical processing, is fed by at least three parallel channels, viz. Parvocellular (P), Magnocellular (M) and Koniocellular (K), in the subcortical pathway, but the role of such pathways in brightness induction is generally not implicit. In this work, three different spatial filters based on an extended classical receptive field (ECRF) model of retinal ganglion cells have been approximately related to the spatial contrast sensitivity functions of these three parallel channels. Based on our analysis involving different brightness perception effects, we propose that the M channel, with maximum conduction velocity, may have a special role in an initial sensorial perception. As a result, brightness assimilation may be the consequence of vision at a glance through the M pathway; the contrast effect may be the consequence of a subsequent vision with scrutiny through the P channel; and the K pathway response may represent an intermediate situation resulting in ambiguity in brightness perception. The present work attempts to correlate this phenomenon of pathway selection with the complementary nature of these channels in terms of spatial frequency as well as contrast.
    Seeing and perceiving 01/2012; 25(2):179-212. DOI:10.1163/187847612X629946 · 1.14 Impact Factor
  • Kuntal Ghosh, Anirban Roy
    ABSTRACT: An extra-classical receptive field (ECRF) based mid-level model of visual processing, which we refer to as the neuro-visually inspired model of figure-ground segregation (NFGS), is proposed in this work. It is inspired by the non-linear interaction of the classical receptive field (CRF) and its non-classical extended surround, comprising the non-linear mean increasing and decreasing sub-units. A simple digital kernel has been derived that performs efficient mid-level block representation.
  • Apurba Das, Kuntal Ghosh
    ABSTRACT: Computerized human face recognition is a complex task of deformable pattern recognition. The principal source of complexity lies in the significant inter-class overlapping of faces due to the variations caused by different poses, illuminations, and expressions (PIE). Popular computerized face recognition algorithms like PCA, EBGM etc. are fairly reliable in determining facial attributes from an image. However, in most cases the features are extracted in terms of gray textures. When the database size grows to millions, huge processing time is required, as each pixel must be represented using at least eight bits. In the present paper, our objective is to minimize the processing time by reducing the number of bits used to represent each pixel. We have done this by combining two methods. The first is a neuro-visually inspired method of figure-ground segregation (NFGS), which can efficiently convert the entire face image into a binary 2D array. The second is the scale invariant feature transform (SIFT), which extracts the scale-invariant and rotation-invariant features from the binarized face image and thereafter matches the features. The proposed algorithm is found to be successful in actually enhancing the performance of face matching. Psycho-visual experiments also corroborate this.
  • Apurba Das, Anirban Roy, Kuntal Ghosh
    ABSTRACT: In the central visual pathway originating from the eye, a bridge is required between two hierarchical tasks: pixel-based information recording by the visual pathway at low level on the one hand, and object recognition at high level on the other. Such a bridge, which may be designated as a mid-level block-grained integration, has here been modeled by a multi-layer flexible cellular neural network (F-CNN). The proposed CNN architecture is validated by different intermediate-level tasks involving rigid and deformable pattern recognition. It has been shown that execution of such tasks by the proposed architecture is capable of generating valid and significant inputs for the WHERE (dorsal) and WHAT (ventral) pathways in the brain. The model includes the proposal of a feedback (also by CNN architecture) to the lower mid-level from the higher mid-level dorsal and ventral pathways for flexible cell (physiological receptive field) size adjustment in the primary visual cortex towards successful 'where' and 'what' identifications for high-level vision.
    Swarm, Evolutionary, and Memetic Computing - Second International Conference, SEMCCO 2011, Visakhapatnam, Andhra Pradesh, India, December 19-21, 2011, Proceedings, Part I; 01/2011
  • Chandrani Saha, Kuntal Ghosh
    ABSTRACT: This paper presents a method for estimating facial expression intensity from a sequence of binary facial images obtained from video. The binarization has been done using a neuro-visual model of figure-ground segregation. The Local Binary Pattern (LBP) is taken as the characteristic feature of a face with expression. This pattern evolves in the temporal domain over the sequence. The dynamics of the pattern, starting from a neutral face, are characterised by the Hausdorff distance. Back Propagation (BP) Neural Networks are trained to estimate the expression intensity level of the basic expressions.
  • Kuntal Ghosh, Sankar K. Pal
    ABSTRACT: The excitatory-inhibitory visual receptive-field model may be looked upon as a classical structuralist approach to vision that relies upon brightness-contrast information of the image as a preliminary step toward visual representation. The corresponding mathematical operator (Laplacian) was first proposed by the empiricist Ernst Mach on the basis of the Mach band illusion. Helmholtz's constructivist approach, on the other hand, argues that perception is the product of unconscious inference. Propagating a sort of intermediate stance between these two viewpoints led to the emergence of the Gestalt school, for whom perception follows a minimum principle and is at the same time holistic, based on certain coherence criteria. In this paper, we have modeled the extraclassical receptive field through an eigenfunction-based generalization of the Gaussian derivative approach that resulted in a modification of Mach's equation, introducing a higher order isotropic derivative (Bi-Laplacian) of Gaussian and a fourth-moment operator. The proposed computational model draws its inspiration from the structuralist approach, performs figure-ground segregation in the Gestalt sense, and also provides cues toward brightness perception in tune with the constructivist notion of unconscious inference.
    IEEE Transactions on Systems Man and Cybernetics - Part A Systems and Humans 07/2010; 40(4):758-766. DOI:10.1109/TSMCA.2010.2044503 · 2.18 Impact Factor
  • ABSTRACT: Traditionally the intensity discontinuities in an image are detected as zero-crossings of the second derivative with the help of a Laplacian of Gaussian (LOG) operator that models the receptive field of retinal ganglion cells. Such zero-crossings supposedly form a raw primal sketch edge map of the external world in the primary visual cortex of the brain. Based on a new operator, which is a linear combination of the LOG and a Dirac-delta function and models the extra-classical receptive field of the ganglion cells, we find that the zero-crossing points thus generated store, in the presence of noise, not only the edge information but also the shading information of the image, in the form of density variation of these points. We have also shown that an optimal image contrast produces the best mapping of the shading information to such zero-crossing density variation for a given amount of noise contamination. Furthermore, we have observed that an optimal amount of noise contamination reproduces the minimum optimal contrast and hence gives rise to the best representation of the original image. We show that this phenomenon is similar in nature to the stochastic resonance phenomenon observed in psychophysical experiments.
    Biological Cybernetics 05/2009; 100(5):351-9. DOI:10.1007/s00422-009-0306-9 · 1.93 Impact Factor
  • S. Sarkar, S. Karmakar, Kuntal Ghosh, Swapan Sen
    ABSTRACT: The problem with conventional window-based FIR filter design lies in its very limited design flexibility and, more specifically, the lack of control over band edges. We propose an alternative Gaussian window approach for FIR filter design that overcomes these problems of the conventional window method. We show that a sum of mean-shifted Gaussians can be used for flexible filter design. We also derive relations to compute the corresponding impulse response effectively in a non-recursive manner. These relations give precise control over band-edge frequencies. A comparison of precision in control and computational time with other methods is also presented.
    Signal Processing, Communications and Networking, 2008. ICSCN '08. International Conference on; 02/2008
  • S. Karmakar, K. Ghosh, S. Sarkar
    ABSTRACT: A generalized methodology for constructing a Mexican hat wavelet family involving even order Gaussian derivatives has been devised in a Gaussian scale space only. The optimization has been carried out in the Fourier domain, and the kernels in the Gaussian scale space domain are found to be exact replicas of their derivative wavelet counterparts for low as well as high order. Wavelet properties of the lowest order (2) have been discussed, and the results are shown to be better than and different from the well-known LOG-DOG equivalence of Marr-Hildreth. Such filters, simple to implement in Gaussian scale space, are likely to be important in vision, analysis of seismic signals, cosmic microwave background (CMB) maps and possibly in the general cases of signals from Gaussian point sources.
    Signal Processing, Communications and Networking, 2008. ICSCN '08. International Conference on; 02/2008
  • ABSTRACT: The present work is aimed at understanding and explaining some aspects of visual signal processing at the retinal level while exploiting the same towards the development of some simple techniques in the domain of digital image processing. Classical studies on retinal physiology revealed the nature of contrast sensitivity of the receptive field of bipolar or ganglion cells, which lie in the outer and inner plexiform layers of the retina. To explain these observations, a difference of Gaussian (DOG) filter was suggested, which was subsequently modified to a Laplacian of Gaussian (LOG) filter for computational ease in handling two-dimensional retinal inputs. To date, almost all image processing algorithms used in various branches of science and engineering have followed LOG or one of its variants. Recent observations in retinal physiology, however, indicate that the retinal ganglion cells receive input from a larger area than the classical receptive fields. We have proposed an isotropic model for the non-classical receptive field of the retinal ganglion cells, corroborated by these recent observations, by introducing higher order derivatives of Gaussian expressed as a linear combination of Gaussians only. In digital image processing, this provides a new mechanism of edge detection on the one hand and image half-toning on the other. It has also been found that living systems may sometimes prefer to "perceive" the external scenario by adding noise to the received signals at the pre-processing level to arrive at better information on light and shade in the edge map. The proposed model also provides an explanation for many brightness-contrast illusions hitherto unexplained not only by the classical isotropic model but also by some other Gestalt and Constructivist models or by non-isotropic multi-scale models. The proposed model is easy to implement in both the analog and digital domains. A scheme for implementation in the analog domain generates a new silicon retina model implemented on a hardware development platform.
    Progress in brain research 02/2008; 168:175-91. DOI:10.1016/S0079-6123(07)68015-7 · 5.10 Impact Factor
  • ABSTRACT: The objective of this study is to establish that subjective evaluation of fatty as well as normal ultrasound human liver images based on echotexture (spatial pattern of echoes) and echogenicity by visual inspection can be corroborated by Haralick's statistical texture analysis. Seventy-six ultrasound scan images of normal human livers and twenty-four ultrasound images of fatty livers, as identified by the radiologist on the basis of echotexture and echogenicity, have been collected from a hospital for this study. An unsupervised neural network learning technique, namely the Self Organising Map (SOM), has been employed to generate profile plots. Using a Student's t-like statistic for each feature as a measure of distinction between normal and fatty livers, the two most appropriate features, namely maximum probability (Maxp) and uniformity (Uni), are selected from these profile plots. These two features are found to form clusters with little overlap for normal and fatty livers. Thus statistical texture analysis of the ultrasound human images using "Maxp" and "Uni" presented the best results for corroborating the classification made by the radiologist through visual inspection. This work may be a humble beginning towards modeling the radiologists' perceptual findings, which may emerge in future as a new tool with respect to 'ultrasonic biopsy'.
    Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on; 01/2008
  • ABSTRACT: Post-processing of medical images often needs interpolation. Taking cues from the human visual system, we propose here an interpolation kernel consisting of a linear combination of Gaussians at different scales. We compare the efficacy of the proposed kernel with other interpolation kernels, particularly in the processing of medical images. The basic algorithm has been implemented on a TI DM642 based hardware platform for real-time filtering and programmed for post-processing of ultrasound video frames (20 frames/s) from the commercially available Siemens medical ultrasound scanner.
    Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on; 01/2008
  • K. Ghosh, S. Sarkar, K. Bhaumik
    01/2008; 17(1-3). DOI:10.1515/JISYS.2008.17.1-3.229
  • ABSTRACT: We propose a biologically inspired multi-scale derivative filter in which the higher order derivatives are expressed as a linear combination of a smoothing function at various scales. One of the functions in the summation has been approximated by a Dirac-delta function to finally yield the new filter. This modification has some support from the point of view of authentic edge detection as well as from neurophysiological and psychophysical experiments at the retinal level. Besides, it improves the quality of the filter in a number of ways. The proposed filter can be optimized at any desired scale. Hence it is very effective in extracting the features from a noisy picture. The filter is rotationally symmetric. The zero-crossing map of any picture filtered with the proposed model gives a half-toning effect to the retrieved image and hence preserves the intensity information in the image even in the edge map.
    Image and Vision Computing 08/2007; 25(8-25):1228-1238. DOI:10.1016/j.imavis.2006.07.022 · 1.58 Impact Factor
  • ABSTRACT: The theory of edge detection and the treatise on low-level vision presented in this chapter in the light of the non-classical receptive field of retinal ganglion cells is a straightforward continuation of the approach of David Marr and his group. The appeal of the present approach lies in its simplicity and easy implementation, although it should be kept in mind that no non-linear model of the extended surround has been proposed here, which could be an interesting direction of future work. Potential applications of the algorithm will include apart from areas of general edge enhancement, designing new robust visual capturing or
    Vision Systems: Segmentation and Pattern Recognition, 06/2007; ISBN: 978-3-902613-05-9
  • 01/2007
  • Kuntal Ghosh, Sankar K. Pal
    01/2007
  • Kuntal Ghosh, Sankar K. Pal
    ABSTRACT: The Spotlight Models of attention, which rely upon a bottom-up approach specifically through the dorsal pathways, can be modeled using multi-scale Gaussian pyramids with excitatory-inhibitory feedforward cellular neural networks (CNN) as feature detectors. Here we propose a modified disinhibitory zero-feedback CNN model, derived from a linear combination of three Gaussians only, that explains many brightness perception based psychophysical phenomena unexplainable with the old model and in the process predicts three different input cloning templates for global smoothing, global enhancement, as well as controlled smoothing and enhancement of retinal images within the focus of attention. The proposed approach provides new clues, based on the psychophysical stimuli, suggestive of a role of top-down attentional control, possibly through the ventral pathways, even at the stage of low-level vision.
    Attention in Cognitive Systems. Theories and Systems from an Interdisciplinary Viewpoint, 4th International Workshop on Attention in Cognitive Systems, WAPCV 2007, Hyderabad, India, January 8, 2007, Revised Selected Papers; 01/2007