
An Isotropic 3x3 Image Gradient Operator

Authors:
  • IS_Consulting (HP Labs Retired - 8Mar13)
History and Definition of the so-called "Sobel Operator",
more appropriately named the
Sobel-Feldman Operator
by Irwin Sobel
February 2, 2014
Updated June 14 2015
The work was never "published" by me... however it was first
described and credited in a footnote "suggested by I. Sobel" in
the book:
- Duda, R. and Hart, P., Pattern Classification and Scene Analysis,
John Wiley and Sons, 1973, pp. 271-272.
The "suggestion" was made in a talk:
Sobel, I., Feldman, G., "A 3x3 Isotropic Gradient Operator for Image
Processing", presented at the Stanford Artificial Intelligence Project
(SAIL) in 1968.
There is also a detailed historical account that I gave to
Prof. Per-Erik Danielsson
Institutionen for Systemteknik
Department of Electrical Engineering
Universitetet i Linkoping
S-581 83 Linkoping, Sweden
email: ped@isy.liu.se
which he kindly published as an appendix to a paper of his:
Danielsson, P.E., Seger, O., "Generalized and Separable Sobel Operators",
in "Machine vision for three-dimensional scenes",
Herbert Freeman (ed), Academic Press (1990).
******************************************************************************
******************************* correspondence follows ***********************
******************************************************************************
[corrected by IES 15Dec94 - G' scale factor is 4 not 2]
February 6, 1989
Prof. Per-Erik Danielsson
Institutionen for Systemteknik
Department of Electrical Engineering
Universitetet i Linkoping
S-581 83 Linkoping
Sweden
Dear Prof. Danielsson,
...
The history of the "Sobel Operator" according to my best
recollection is as follows:
In 1968, while a PhD candidate at the Stanford Artificial
Intelligence Project I gave a talk, together with Gary Feldman
(another graduate student and good friend of mine) on a
relatively isotropic 3x3 gradient operator. This talk was
presented at a time when the major piece of published work
on computer vision was Larry Roberts' PhD Thesis from MIT
wherein he defined a 2x2 gradient estimator then referred to
as the "Roberts Cross" operator. I had previously thought
up the operator via a line of reasoning presented in the
accompanied short document and discussed it with Gary who
enthusiastically proceeded to help me program it and test
it. After doing so and satisfying ourselves that it gave
at least visually desirable results, we presented these in
a seminar at the Stanford Artificial Intelligence Project
where we were employed as Research Assistants.
As this was an event that occurred more than 20 years ago my
memory of it is somewhat foggy. Faculty that I most clearly
remember in attendance at the seminar were Raj Reddy, John
McCarthy and Arthur Samuel. I'm pretty sure that Peter
Hart and/or Dick Duda and possibly Nils Nilsson from SRI
were also there. Lester Earnest, project executive officer
was most likely there. I'm pretty sure Karl Pingle who was
employed as a project programmer and later wrote an edge
follower incorporating the operator was there. Manfred
Hueckel, a graduate student who later wrote a paper on a
more robust and computationally expensive "edge detector",
was I think also there. Lynn Quam, Jay "Marty" Tenenbaum,
Gunnar Grape, Gil Falk, Dave Poole, and Phil Petit were
other graduate students with the project and either were at
the seminar or were working in such close proximity that
they knew of the results.
My synopsis of what ensued was that Raj Reddy, who was then
teaching one of the first courses on Computer Vision, coined
the term "Sobel Operator" in contrast to the "Roberts Cross"
and used it in his course. Subsequently Pingle published a
paper (1969) describing it as part of his edge follower, and
Duda and Hart mentioned it in their book.
In response to your gentle prod I polished up a note I had
started to write several years ago for publication (I knew
not where), giving my rationale for the derivation. I
enclose it herewith with the request that you submit it as a
technical/historical note with your forthcoming paper.
Sincerely,
Irwin Sobel
HPLABS, Measurement and Manufacturing Research Center
***************************** difop note follows ****************************
An Isotropic 3x3 Image Gradient Operator
by Irwin Sobel
We would like to document the derivation of a simple,
computationally efficient gradient operator which we
developed in 1968. This operator has been frequently used
and referenced since that time. The earliest description of
this operator in the computer vision literature is [1],
although it has been more widely popularized by its appearance
in [2]. Horn [3] defines this operator and references 4
numerical analysis texts [4-7] with the statement:
"Numerical analysis [4-7] teaches us that for
certain classes of surfaces an even better
estimate is obtained using a weighted average
of three such central differences ... These
expressions produce excellent estimates for the
components of the gradient of the central point".
The motivation to develop it was to get an efficiently
computable gradient estimate which would be more isotropic than
the then popular "Roberts Cross" operator [8]. The principle
involved is that of estimating the gradient of a digitized
picture at a point by the vector summation of the 4 possible
simple central gradient estimates obtainable in a 3x3
neighborhood. The vector summation operation provides
an averaging over directions-of-measurement of the gradient.
If the density function were truly planar over the neighborhood,
all 4 gradients would have the same value. Any differences
are deviations from local planarity of the function over
the neighborhood. The intent here was to extract the
direction of the "best" plane although no attempt was
made to make this rigorous.
To be more specific, we will refer here to the image function
as a "density" function. (It could just as well be an
"intensity" function - the difference depends on the physical
nature of the image source.) For a 3x3 neighborhood each
simple central gradient estimate is a vector sum of a pair
of orthogonal vectors. Each orthogonal vector is a directional
derivative estimate multiplied by a unit vector specifying
the derivative's direction. The vector sum of these 4 simple
gradient estimates amounts to a vector sum of the 8 directional
derivative vectors.
Thus for a point on a Cartesian grid and its eight neighbors
having density values as shown
____________
| a | b | c |
|___|___|___|
| d | e | f |
|___|___|___|
| g | h | i |
|___|___|___|
we define the magnitude of the directional derivative estimate
vector 'g' for a given neighbor (not to be confused with the
density value g in the grid above) as
|g| = <density difference>/<distance to neighbor>
The direction of 'g' is given by the unit vector to the
appropriate neighbor. Notice that the neighbors group into
antipodal pairs: (a,i) (b,h) (c,g) (f,d). Vector summing
derivative estimates within each pair causes all the "e"
values to cancel leaving the following vector sum for our
gradient estimate:
G = (c-g)/4 * [ 1, 1]
+(a-i)/4 * [-1, 1]
+(b-h)/2 * [ 0, 1]
+(f-d)/2 * [ 1, 0]
the resultant vector being
G = [(c-g-a+i)/4 + (f-d)/2, (c-g+a-i)/4 + (b-h)/2]
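The reduction from the four pair terms to this resultant can be checked numerically. The following Python sketch is illustrative only (it is not part of the original note); it uses a density ramp that increases left to right:

```python
def gradient_from_pairs(a, b, c, d, e, f, g, h, i):
    """Vector-sum the four antipodal-pair contributions; e cancels out."""
    terms = [
        ((c - g) / 4, ( 1, 1)),  # diagonal pair (c, g)
        ((a - i) / 4, (-1, 1)),  # diagonal pair (a, i)
        ((b - h) / 2, ( 0, 1)),  # vertical pair (b, h)
        ((f - d) / 2, ( 1, 0)),  # horizontal pair (f, d)
    ]
    gx = sum(w * ux for w, (ux, uy) in terms)
    gy = sum(w * uy for w, (ux, uy) in terms)
    return gx, gy

def gradient_closed_form(a, b, c, d, e, f, g, h, i):
    """The resultant vector G given above."""
    return ((c - g - a + i) / 4 + (f - d) / 2,
            (c - g + a - i) / 4 + (b - h) / 2)

vals = (0, 1, 2, 0, 1, 2, 0, 1, 2)  # rows a b c / d e f / g h i
assert gradient_from_pairs(*vals) == gradient_closed_form(*vals) == (2.0, 0.0)
```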
Notice that the square root fortuitously drops out of the
formula. To be metrically correct we should divide the
result by 4 to get the average gradient. However, since
these operations are typically done in fixed point on
small integers, and division loses low-order significant
bits, it is convenient instead to scale the vector by 4,
thereby replacing the "divide by 4" (doubleshift right) with
a "multiply by 4" (doubleshift left), which preserves the
low-order bits. This leaves us with an estimate which is 16
times as large as the average gradient. The resultant formula is:
G' = 4*G = [c-g-a+i + 2*(f-d), c-g+a-i + 2*(b-h)]
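In code, G' needs only integer additions, subtractions, and shifts. This is a minimal Python sketch of the formula, not the original 1968 implementation:

```python
def sobel_gradient(a, b, c, d, e, f, g, h, i):
    """Scaled gradient estimate G' = 4*G for the neighborhood
    [a b c; d e f; g h i]; note the center value e does not appear."""
    gx = c - g - a + i + 2 * (f - d)  # x-component
    gy = c - g + a - i + 2 * (b - h)  # y-component
    return gx, gy

# On a unit ramp increasing left to right this prints (8, 0):
print(sobel_gradient(0, 1, 2, 0, 1, 2, 0, 1, 2))
```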
It is useful to express this as weighted density summations
using the following weighting functions for x and y components:
________________ ________________
| -1 | 0 | 1 | | 1 | 2 | 1 |
|____|____|____| |____|____|____|
| -2 | 0 | 2 | | 0 | 0 | 0 |
|____|____|____| |____|____|____|
| -1 | 0 | 1 | | -1 | -2 | -1 |
|____|____|____| |____|____|____|
x-component y-component
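The same components follow from summing the weighted densities directly, as this small illustrative Python sketch (again, not from the original note) shows:

```python
KX = [[-1, 0, 1],
      [-2, 0, 2],
      [-1, 0, 1]]

KY = [[ 1,  2,  1],
      [ 0,  0,  0],
      [-1, -2, -1]]

def weighted_sum(kernel, nbhd):
    """Sum of elementwise products of a 3x3 kernel with a 3x3 neighborhood."""
    return sum(k * v
               for krow, nrow in zip(kernel, nbhd)
               for k, v in zip(krow, nrow))

nbhd = [[0, 1, 2],  # rows a b c / d e f / g h i of a unit ramp
        [0, 1, 2],
        [0, 1, 2]]
gx, gy = weighted_sum(KX, nbhd), weighted_sum(KY, nbhd)
print(gx, gy)  # 8 0, matching G' from the formula above
```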
This algorithm was used as an edgepoint detector in the 1968
vision system [2] at the Stanford Artificial Intelligence
Laboratory wherein a point was considered an edgepoint if
and only if
|G'|**2 > T
where T was a previously chosen threshold. For this purpose
it proved an economical alternative to the more robust, but
computationally expensive "Hueckel operator" [9].
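The edgepoint test can be sketched as follows; the threshold value 50 is an arbitrary illustration, not a value from the original system:

```python
def is_edgepoint(gx, gy, T):
    """Edgepoint iff |G'|**2 = gx**2 + gy**2 exceeds the threshold T.
    Comparing squared magnitudes avoids a square root, in keeping with
    the fixed-point spirit of the original test."""
    return gx * gx + gy * gy > T

# With G' = (8, 0) from a unit ramp, |G'|**2 = 64:
print(is_edgepoint(8, 0, 50))   # True
print(is_edgepoint(8, 0, 100))  # False
```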
References:
[1] Pingle, K.K., "Visual Perception by a Computer", in
Automatic Interpretation and Classification of Images,
A. Grasselli (Ed.), Academic Press, New York, 1969, pp.
277-284.
[2] Duda, R.O. and Hart, P.E., pp. 271-272 in Pattern
Classification and Scene Analysis, John Wiley & Sons,
New York, 1973
[3] Horn, B.K.P., "Hill Shading and the Reflectance Map",
originally in Proc. DARPA Workshop in Image Understanding,
Apr 24-25, 1979, p. 85, Science Applications Inc.
Report SAI-80-895-WA; later in Geo-Processing 2 (1982),
p. 74, Elsevier Scientific Publishing, Amsterdam.
[4] Conte, S.D. and de Boor, C., Elementary Numerical Analysis,
McGraw-Hill, New York, 1972.
[5] Hamming, R.W., Numerical Methods for Scientists and
Engineers, McGraw-Hill, New York, 1962.
[6] Richtmyer, R.D. and Morton, K.W., Difference Methods
for Initial-Value Problems, John Wiley & Sons, New York,
pp. 136-143.
[7] Hildebrand, F.B., Introduction to Numerical Analysis,
McGraw-Hill, New York, 1956 (2nd ed., 1974).
[8] Roberts, L. G., "Machine Perception of Three-Dimensional
Solids," in Optical and Electro-Optical Information
Processing, pp. 159-197, J. T. Tippett, et al., (Ed.'s),
MIT Press, Cambridge, Mass., 1965.
[9] Hueckel, M.H., "An Operator which Locates Edges in Digitized
Pictures", Journal of the Association for Computing
Machinery, Vol. 18, No. 1, January 1971, pp. 113-125.
******************************************************************************
******************************* end of correspondence ***********************
******************************************************************************