
An Isotropic 3x3 Image Gradient Operator

History and Definition of the so-called "Sobel Operator",
more appropriately named the
Sobel-Feldman Operator
by Irwin Sobel
February 2, 2014
Updated June 14, 2015
The work was never "published" by me... however it was first
described and credited in a footnote "suggested by I. Sobel" in
the book:
- Duda, R. and Hart, P., Pattern Classification and Scene Analysis,
John Wiley and Sons, 1973, pp. 271-272
The "suggestion" was made in a talk:
Sobel, I., Feldman, G., "A 3x3 Isotropic Gradient Operator for Image
Processing", presented at the Stanford Artificial Intelligence Project
(SAIL) in 1968.
There is also a detailed historical account that I gave to
Prof. Per-Erik Danielsson
Institutionen for Systemteknik
Department of Electrical Engineering
Universitetet i Linkoping
S-581 83 Linkoping, Sweden
which he kindly published as an appendix to a paper of his:
Danielsson, P.E., Seger, O., "Generalized and Separable Sobel Operators",
in "Machine vision for three-dimensional scenes",
Herbert Freeman (ed), Academic Press (1990).
******************************* correspondence follows ***********************
[corrected by IES 15Dec94 - G' scale factor is 4 not 2]
February 6, 1989
Prof. Per-Erik Danielsson
Institutionen for Systemteknik
Department of Electrical Engineering
Universitetet i Linkoping
S-581 83 Linkoping
Dear Prof. Danielsson,
The history of the "Sobel Operator" according to my best
recollection is as follows:
In 1968, while a PhD candidate at the Stanford Artificial
Intelligence Project I gave a talk, together with Gary Feldman
(another graduate student and good friend of mine) on a
relatively isotropic 3x3 gradient operator. This talk was
presented at a time when the major piece of published work
on computer vision was Larry Roberts' PhD Thesis from MIT
wherein he defined a 2x2 gradient estimator then referred to
as the "Roberts Cross" operator. I had previously thought
up the operator via a line of reasoning presented in the
accompanying short document and discussed it with Gary, who
enthusiastically proceeded to help me program it and test
it. After doing so and satisfying ourselves that it gave
at least visually desirable results, we presented these in
a seminar at the Stanford Artificial Intelligence Project
where we were employed as Research Assistants.
As this was an event that occurred more than 20 years ago, my
memory of it is somewhat foggy. Faculty that I most clearly
remember in attendance at the seminar were Raj Reddy, John
McCarthy, and Arthur Samuel. I'm pretty sure that Peter
Hart and/or Dick Duda and possibly Nils Nilsson from SRI
were also there. Lester Earnest, project executive officer
was most likely there. I'm pretty sure Karl Pingle who was
employed as a project programmer and later wrote an edge
follower incorporating the operator was there. Manfred
Hueckel, a graduate student who later wrote a paper on a
more robust and computationally expensive "edge detector",
was I think also there. Lynn Quam, Jay "Marty" Tenenbaum,
Gunnar Grape, Gil Falk, Dave Poole, and Phil Petit were
other graduate students with the project and either were at
the seminar or were working in such close proximity that
they knew of the results.
My synopsis of what ensued was that Raj Reddy, who was then
teaching one of the first courses on Computer Vision, coined
the term "Sobel Operator" in contrast to the "Roberts Cross"
and used it in his course. Subsequently Pingle published a
paper (1969) describing it as part of his edge follower, and
Duda and Hart mentioned it in their book.
In response to your gentle prod I polished up a note I had
started to write several years ago for publication (I knew
not where), giving my rationale for the derivation. I
enclose it herewith with the request that you submit it as a
technical/historical note with your forthcoming paper.
Irwin Sobel
HPLABS, Measurement and Manufacturing Research Center
***************************** difop note follows ****************************
An Isotropic 3x3 Image Gradient Operator
by Irwin Sobel
We would like to document the derivation of a simple,
computationally efficient, gradient operator which we
developed in 1968. This operator has been frequently used
and referenced since that time. The earliest description of
this operator in the computer vision literature is [1],
although it has been more widely popularized by its appearance
in [2]. Horn [3] defines this operator and references 4
numerical analysis texts [4-7] with the statement:
"Numerical analysis [4-7] teaches us that for
certain classes of surfaces an even better
estimate is obtained using a weighted average
of three such central differences ... These
expressions produce excellent estimates for the
components of the gradient of the central point".
The motivation to develop it was to get an efficiently
computable gradient estimate which would be more isotropic than
the then popular "Roberts Cross" operator [8]. The principle
involved is that of estimating the gradient of a digitized
picture at a point by the vector summation of the 4 possible
simple central gradient estimates obtainable in a 3x3
neighborhood. The vector summation operation provides
an averaging over directions-of-measurement of the gradient.
If the density function were truly planar over the neighborhood,
all 4 gradients would have the same value. Any differences
are deviations from local planarity of the function over
the neighborhood. The intent here was to extract the
direction of the "best" plane although no attempt was
made to make this rigorous.
To be more specific, we will refer here to the image function
as a "density" function. (It could just as well be an
"intensity" function - the difference depends on the physical
nature of the image source.) For a 3x3 neighborhood each
simple central gradient estimate is a vector sum of a pair
of orthogonal vectors. Each orthogonal vector is a directional
derivative estimate multiplied by a unit vector specifying
the derivative's direction. The vector sum of these 4 simple
gradient estimates amounts to a vector sum of the 8 directional
derivative vectors.
Thus for a point on a Cartesian grid and its eight neighbors
having density values as shown
| a | b | c |
| d | e | f |
| g | h | i |
we define the magnitude of the directional derivative estimate
vector 'g' for a given neighbor as
|g| = <density difference>/<distance to neighbor>
The direction of 'g' will be given by the unit vector to the
appropriate neighbor. Notice that the neighbors group into
antipodal pairs: (a,i) (b,h) (c,g) (f,d). Vector summing
derivative estimates within each pair causes all the "e"
values to cancel leaving the following vector sum for our
gradient estimate:
G = (c-g)/4 * [ 1, 1]
+(a-i)/4 * [-1, 1]
+(b-h)/2 * [ 0, 1]
+(f-d)/2 * [ 1, 0]
the resultant vector being
G = [(c-g-a+i)/4 + (f-d)/2, (c-g+a-i)/4 + (b-h)/2]
Notice that the square root fortuitously drops out of the
formula. If this were to be metrically correct we should
divide the result by 4 to get the average gradient. However,
since these operations are typically done in fixed point
on small integers and division loses low order significant
bits, it is convenient rather to scale the vector by 4,
thereby replacing the "divide by 4" (doubleshift right) with
a "multiply by 4" (doubleshift left) which will preserve low
order bits. This leaves us with an estimate which is 16 times
as large as the average gradient. The resultant formula is:
G' = 4*G = [c-g-a+i + 2*(f-d), c-g+a-i + 2*(b-h)]
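For concreteness, G' can be computed directly from the nine density
values. The sketch below is mine, not part of the original note; the
function name and the sample ramp values are illustrative only.

```python
def sobel_gradient(a, b, c, d, e, f, g, h, i):
    """Scaled gradient estimate G' = 4*G for the 3x3 neighborhood
    | a | b | c |
    | d | e | f |
    | g | h | i |
    The center value e cancels out of the vector sum and is unused."""
    gx = (c - g - a + i) + 2 * (f - d)  # x-component of G'
    gy = (c - g + a - i) + 2 * (b - h)  # y-component of G'
    return gx, gy

# A density ramp increasing left to right: each row is (0, 1, 2).
gx, gy = sobel_gradient(0, 1, 2, 0, 1, 2, 0, 1, 2)
# gx = (2 - 0 - 0 + 2) + 2*(2 - 0) = 8; gy = 0
```

On this ramp the four central estimates all point in +x, so the
diagonal terms contribute (c-g-a+i)/4 = 1 and the horizontal term
(f-d)/2 = 1, giving G = [2, 0] and G' = [8, 0].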
It is useful to express this as weighted density summations
using the following weighting functions for x and y components:
________________ ________________
| -1 | 0 | 1 | | 1 | 2 | 1 |
|____|____|____| |____|____|____|
| -2 | 0 | 2 | | 0 | 0 | 0 |
|____|____|____| |____|____|____|
| -1 | 0 | 1 | | -1 | -2 | -1 |
|____|____|____| |____|____|____|
x-component y-component
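The same components fall out of the weighted density summations above.
A minimal NumPy sketch (the array library and the helper name
`gradient_at` are my choices, not the note's); note that the top row
of each mask multiplies the (a, b, c) row of the neighborhood:

```python
import numpy as np

# Weighting masks from the note; row 0 is the top row (a, b, c).
KX = np.array([[-1, 0, 1],
               [-2, 0, 2],
               [-1, 0, 1]])
KY = np.array([[ 1,  2,  1],
               [ 0,  0,  0],
               [-1, -2, -1]])

def gradient_at(img, r, col):
    """G' components at interior pixel (r, col), as weighted density sums."""
    win = img[r-1:r+2, col-1:col+2]          # the 3x3 neighborhood
    return int((KX * win).sum()), int((KY * win).sum())
```

Applied to the left-to-right unit ramp, this reproduces G' = [8, 0]
at every interior pixel, matching the closed-form expression above.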
This algorithm was used as an edgepoint detector in the 1968
vision system [1] at the Stanford Artificial Intelligence
Laboratory wherein a point was considered an edgepoint if
and only if
|G'|**2 > T
where T was a previously chosen threshold. For this purpose
it proved an economical alternative to the more robust, but
computationally expensive "Hueckel operator" [9].
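The edgepoint test can be sketched as follows; the threshold value,
the helper name `edge_points`, and the step-edge test image are my
illustrative assumptions, not taken from the original system.

```python
import numpy as np

def edge_points(img, T):
    """Mark interior pixels where |G'|**2 exceeds threshold T."""
    KX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    KY = np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]])
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=bool)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            win = img[r-1:r+2, c-1:c+2]
            gx = (KX * win).sum()
            gy = (KY * win).sum()
            # Squared magnitude avoids the square root, as in the note.
            out[r, c] = gx * gx + gy * gy > T
    return out

# A vertical step edge between columns 1 and 2.
img = np.array([[0, 0, 9, 9]] * 4)
mask = edge_points(img, T=100)
```

Both interior columns adjacent to the step get |G'| = 36, so the
squared test flags them while the flat regions stay below threshold.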
[1] Pingle, K.K., "Visual Perception by a Computer", in
Automatic Interpretation and Classification of Images,
A. Grasselli (Ed.), Academic Press, New York, 1969, pp.
[2] Duda, R.O. and Hart, P.E., pp. 271-272 in Pattern
Classification and Scene Analysis, John Wiley & Sons,
New York, 1973
[3] Horn, B.K.P., "Hill Shading and the Reflectance Map",
originally in Proc. DARPA Workshop on Image Understanding,
Apr. 24-25, 1979, p. 85, Science Applications Inc.
Report SAI-80-895-WA; later in Geo-Processing 2 (1982),
p. 74, Elsevier Scientific Publishing, Amsterdam.
[4] Conte, S.D. and de Boor, C., Elementary Numerical Analysis,
1972, New York: McGraw Hill.
[5] Hamming, R.W., Numerical Methods for Scientists and
Engineers, 1962, New York: McGraw Hill.
[6] Richtmyer, R.D. and Morton, K.W., Difference Methods
for Initial-Value Problems, 1967, New York: John Wiley.
[7] Hildebrand, F.B., Introduction to Numerical Analysis,
1956, 1974 New York: McGraw Hill.
[8] Roberts, L. G., "Machine Perception of Three-Dimensional
Solids," in Optical and Electro-Optical Information
Processing, pp. 159-197, J. T. Tippett, et al., (Ed.'s),
MIT Press, Cambridge, Mass., 1965.
[9] Hueckel, M.H., "An Operator which Locates Edges in Digitized
Pictures", Journal of the Association for Computing
Machinery, Vol. 18, No. 1, January 1971, pp. 113-125.
******************************* end of correspondence ***********************