Conference Paper

Ballot mark detection

Abstract and Figures

Optical mark sensing, i.e., detecting whether a "bubble" has been filled in, may seem straightforward. However, on US election ballots the shape, intensity, size and position of the marks, while specified, are highly variable due to a diverse electorate. The ballots may be produced and scanned by poorly maintained equipment. Yet near-perfect results are required. To improve the current technology, which has been subject to criticism, components of a process for identifying marks on an optical sense ballot are evaluated. When marked synthetic ballots are compared to an unmarked ballot, the absolute difference of adaptively thresholded images gives the best detection rates for all darknesses of marks, but at the cost of an increased false alarm rate. Simple absolute differencing can give good detection results with lower false alarm rates.
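The two comparisons named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the local-mean thresholding rule, the window radius, the offset, and the toy pixel values are all assumptions chosen only to show why a faint mark can survive adaptive thresholding while falling below a fixed threshold on the raw difference.

```python
def abs_diff(a, b):
    """Pixelwise absolute difference of two equal-sized grayscale images."""
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

def adaptive_threshold(img, radius=1, offset=5):
    """Mark a pixel as ink (1) if it is darker than its local mean minus offset.
    The local-mean rule is an assumption, one common form of adaptive thresholding."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if img[y][x] < sum(window) / len(window) - offset:
                out[y][x] = 1
    return out

def detect_simple(blank, marked, thresh):
    """Simple absolute differencing followed by a fixed threshold."""
    return [[1 if d > thresh else 0 for d in row] for row in abs_diff(blank, marked)]

def detect_adaptive(blank, marked, radius=1, offset=5):
    """Absolute difference of the two adaptively thresholded (binary) images."""
    return abs_diff(adaptive_threshold(blank, radius, offset),
                    adaptive_threshold(marked, radius, offset))

# Toy data (hypothetical): a white 5x5 page and a copy with one faint mark pixel.
blank = [[255] * 5 for _ in range(5)]
marked = [row[:] for row in blank]
marked[2][2] = 230

simple_hits = sum(map(sum, detect_simple(blank, marked, thresh=60)))
adaptive_hits = sum(map(sum, detect_adaptive(blank, marked)))
```

With these toy values the fixed threshold misses the faint pixel while the adaptive variant flags it. On real scans the same sensitivity would also respond to noise, which is consistent with the higher false alarm rate the abstract reports.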
... In [1] five image preprocessing methods were explored. Those methods are summarized in Table 2. The first four consist of various combinations of absolute difference between the target and the registered reference image, with or without smoothing before or after the difference. ...
... Exploration of ballot mark detection was previously done on synthetic ballots that had not been printed or scanned so they were noise-free [1]. The differencing methods summarized in Table 1 showed their potential given a range of mark sizes, shades and shape, as well as looking at the effect of when the mark is in the clear or when it overlaps some preprinted feature of the ballot. ...
... In [1] it was found that, when thresholding the difference images to separate marks from non-marks, higher thresholds detect fewer marks and lower thresholds detect more marks. It was found that the adaptive threshold was necessary to detect the lightest of the marks, at a relatively high threshold level, but that it introduced other artifacts which caused a higher false alarm rate. ...
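The trade-off described in this excerpt can be made concrete with a small threshold sweep. The sketch below uses invented difference values (both lists are hypothetical, not data from [1]) and simply counts how many mark locations and how many unmarked locations exceed each threshold.

```python
def rates(mark_diffs, background_diffs, threshold):
    """Return (detection rate, false alarm rate) for one threshold.
    A location counts as 'detected' when its difference value exceeds the threshold."""
    detected = sum(d > threshold for d in mark_diffs)
    false_alarms = sum(d > threshold for d in background_diffs)
    return detected / len(mark_diffs), false_alarms / len(background_diffs)

# Hypothetical difference values: marks from dark to light, and noise at unmarked targets.
mark_diffs = [200, 120, 60, 25]
background_diffs = [2, 5, 12, 30]

sweep = {t: rates(mark_diffs, background_diffs, t) for t in (10, 40, 100)}
# sweep[10]  -> (1.0, 0.5): every mark found, but half the noise passes too
# sweep[40]  -> (0.75, 0.0): the lightest mark is lost, no false alarms
# sweep[100] -> (0.5, 0.0): only the two darkest marks remain
```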
Conference Paper
Analyzing paper-based election ballots requires finding all marks added to the base ballot. The position, size, shape, rotation and shade of these marks are not known a priori. Scanned ballot images have additional differences from the base ballot due to scanner noise. Different image processing techniques are evaluated to see under what conditions they are able to detect what sorts of marks. Basing mark detection on the difference of raw images was found to be much more sensitive to the mark darkness. Converting the raw images to foreground and background and then removing the form produced better results.
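One way to read "converting the raw images to foreground and background and then removing the form" is sketched below. The fixed binarization threshold and the 3x3 dilation of the reference (added here to tolerate small misregistration) are assumptions for illustration; the abstract does not specify how either step was done.

```python
def binarize(img, thresh=128):
    """Foreground (1) = pixels darker than thresh; a fixed threshold is assumed here."""
    return [[1 if p < thresh else 0 for p in row] for row in img]

def dilate(mask):
    """3x3 dilation, so the form still covers itself under a one-pixel misalignment."""
    h, w = len(mask), len(mask[0])
    return [[1 if any(mask[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))) else 0
             for x in range(w)] for y in range(h)]

def remove_form(target, reference, thresh=128):
    """Keep target foreground pixels that are not explained by the (dilated) blank form."""
    t = binarize(target, thresh)
    r = dilate(binarize(reference, thresh))
    return [[1 if tp and not rp else 0 for tp, rp in zip(trow, rrow)]
            for trow, rrow in zip(t, r)]

# Toy data (hypothetical): a printed vertical rule in column 0, plus one gray mark pixel.
reference = [[0] + [255] * 6 for _ in range(5)]
target = [row[:] for row in reference]
target[2][4] = 100

marks = remove_form(target, reference)
mark_count = sum(map(sum, marks))
```

Because the decision is made on binary foreground, a gray mark at value 100 and a black mark at value 0 give the same result, which matches the abstract's point that this route is less sensitive to mark darkness than differencing raw images.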
... In the end, the cropped regions are passed to the algorithm that is responsible for extracting information from the OMR document. In the United States, balloting was done with the help of OMR systems during their elections [21]. Another two-phase system was introduced, based on the training phase and the recognition phase. ...
Article
Optical Mark Recognition (OMR) systems have been studied since the 1970s. Owing to its simplicity of use and efficiency in bulk operations, OMR technology has been gaining popularity over time. It is used as an automated data input technique for surveys and multiple-choice question papers in educational institutions, for automatic evaluation and grading of student inputs. Conventional OMR systems require specialized OMR machines or optical scanners with automatic document feeding capability. These machines and scanners are fixed-location devices and cannot be moved easily. Their energy requirements are high, and they also require human effort to operate. These machines are expensive and hence pose budget constraints for small educational institutions. Because they are mechanical, their maintenance and operating costs are high. Smartphone cameras are an alternative that overcomes these limitations, but although handy, they cannot scan documents in a controlled environment. An uncontrolled environment produces inputs that existing OMR algorithms largely fail to recognize, so accuracy and precision remain undesirably low. Because of this shortcoming, the use of smartphone cameras is still not feasible. In this experimental study, we propose an OMR algorithm specifically for inputs taken from smartphones equipped with decent cameras and running the Android or iOS operating systems. In effect, we have ported OMR technology to smartphones, offering more flexibility, ease, and mobility of use in daily life. The key issue that emerged in our experiments is poor illumination under different lighting conditions. Our results are very promising and comparable to those obtained with optical scanners.
... Some issues were discussed [39] that can come up at the stage of scanning of OMR sheets. OMR was used in the balloting process of elections in the United States [40]. OMR sheet evaluation by using two phase system (training phase and recognition phase) along with Modified Multi Connect Architecture is introduced [41]. ...
Article
Optical Mark Recognition (OMR) systems have been studied since 1970. It is widely accepted as a data entry technique. OMR technology is used for surveys and multiple-choice questionnaires. Due to its ease of use, OMR technology has grown in popularity over the past two decades and is widely used in universities and colleges to automatically grade student responses to questionnaires. The accuracy of OMR systems is very important due to the environment in which they are used. The OMR algorithm relies on pixel projection or Hough transform to determine the exact answer in the document. These techniques rely on majority voting to approximate a predetermined shape. The performance of these systems depends on precise input from dedicated hardware. Printing and scanning OMR tables introduces artifacts that make table processing error-prone. This observation is a fundamental limitation of traditional pixel projection and Hough transform techniques. Depending on the type of artifact introduced, accuracy is affected differently. We classified the types of errors and their frequency according to the artifacts in the OMR system. As a major contribution, we propose an improved algorithm that fixes errors due to skewness. Our proposal is based on the Hough transform for improving the accuracy of bias correction mechanisms in OMR documents. As a minor contribution, our proposal also improves the accuracy of detecting markers in OMR documents. The results show an improvement in accuracy over existing algorithms in each of the identified problems. This improvement increases confidence in OMR document processing and increases efficiency when using automated OMR document processing.
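The "majority voting" behind Hough-style skew estimation can be shown with a generic sketch. This is not the improved algorithm the abstract proposes, whose details are not given here. The candidate angle range, the one-degree step, and the synthetic line are assumptions. Each foreground pixel votes for the offset of the line through it at each candidate angle, and the angle whose best accumulator bin collects the most votes is taken as the skew.

```python
import math
from collections import Counter

def estimate_skew_degrees(points, candidates=range(-5, 6)):
    """Return the candidate angle (degrees) whose strongest Hough bin has the most votes.
    For a line at angle theta, the offset rho = -x*sin(theta) + y*cos(theta) is the
    same for every pixel on the line, so all of its pixels land in one bin."""
    best_angle, best_votes = None, -1
    for deg in candidates:
        theta = math.radians(deg)
        s, c = math.sin(theta), math.cos(theta)
        votes = Counter(round(-x * s + y * c) for x, y in points)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = deg, peak
    return best_angle

# Synthetic data (hypothetical): pixels of a printed rule skewed by 2 degrees.
points = [(x, round(x * math.tan(math.radians(2))) + 10) for x in range(200)]
skew = estimate_skew_degrees(points)
```

Rotating the page by the negative of the estimated angle would then deskew it. Scanner artifacts such as broken or smeared rules spread the votes over more bins, which is one way the error-proneness described above can arise.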
... A ballot contains election identification information (electoral district, election date, ballot number, page number), instructions (to select a candidate or to cast a vote), the list of candidates (the candidates' names and party affiliations), and an array of targets to be marked for each contest [1]. ...
Article
In this study, an image processing based system that performs vote detection and counting has been developed. The system consists of two parts, hardware and software. The camera used to acquire images and the Raspberry Pi3 used to process them make up the hardware part. In the software part, features in the images are matched using the Oriented FAST and Rotated BRIEF (ORB) method, one of the image processing methods, together with Brute-Force matching. The software is written in the Python programming language and makes use of the OpenCV library. For vote detection and counting, EVET (yes) and HAYIR (no) ballots written in 72-point Calibri font were used. In the system, the image of the ballot is captured by a high-resolution camera and transferred to the image processing software. The working logic of the software is based on identifying distinctive features such as corners and turning points in the stored image and matching these features with the images captured by the camera. To determine for whom the vote was cast, the position of the vote on the horizontal plane is examined. The system increments the yes and no vote counts according to the matched image. In this study, vote detection and counting were achieved with a 100% success rate.
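The brute-force matching step can be sketched without OpenCV. ORB produces binary descriptors, and brute-force matching compares every query descriptor with every stored descriptor by Hamming distance. The sketch below uses 8-bit toy descriptors (real ORB descriptors are 256 bits) and a hypothetical `max_distance` cut-off. It illustrates the matching idea only, not the study's code.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")

def brute_force_match(query, train, max_distance=2):
    """For each query descriptor, find the nearest train descriptor by Hamming distance.
    Matches farther than max_distance (an assumed cut-off) are discarded."""
    matches = []
    for qi, q in enumerate(query):
        ti, dist = min(((i, hamming(q, t)) for i, t in enumerate(train)),
                       key=lambda m: m[1])
        if dist <= max_distance:
            matches.append((qi, ti, dist))
    return matches

# Toy descriptors (hypothetical): 'train' from the stored image, 'query' from the camera.
train = [0b10110010, 0b01001101, 0b11110000]
query = [0b10110011, 0b00001111, 0b00000000]

matches = brute_force_match(query, train)
# Each tuple is (query index, train index, Hamming distance).
```

The third query descriptor differs from every stored descriptor by four bits, so it is rejected. In a system like the one described, the positions of the accepted matches would then indicate where the recognized text lies on the ballot.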
... This difference will usually have an amount of noise due to small misalignments, dust or different scanning conditions. To obtain a mark detector that is less sensitive to noise, several approaches are discussed in [25,27], like using a distance transform to detect safe and unsafe zones, depending on their distance to black pixels, using Gaussian filters to smooth the images before performing the subtraction or using morphological filters. Some authors try to detect a grid for possible positions of marks by analyzing the geometry of the ballot [26]. ...
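The "safe and unsafe zones" idea in this excerpt can be sketched as follows. The city-block metric, the `safe_distance` value, and the choice to zero out differences near printed ink are assumptions for illustration. The cited works [25,27] may define the zones differently.

```python
from collections import deque

def distance_to_ink(ink):
    """City-block distance from each pixel to the nearest ink (1) pixel, by BFS.
    Pixels stay None when the image contains no ink at all."""
    h, w = len(ink), len(ink[0])
    dist = [[None] * w for _ in range(h)]
    queue = deque()
    for y in range(h):
        for x in range(w):
            if ink[y][x]:
                dist[y][x] = 0
                queue.append((y, x))
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and dist[ny][nx] is None:
                dist[ny][nx] = dist[y][x] + 1
                queue.append((ny, nx))
    return dist

def mask_unsafe(diff, reference_ink, safe_distance=2):
    """Zero the difference image wherever it lies closer than safe_distance to printed ink,
    since small misalignments make differences there unreliable."""
    dist = distance_to_ink(reference_ink)
    return [[d if (r is None or r >= safe_distance) else 0
             for d, r in zip(drow, rrow)]
            for drow, rrow in zip(diff, dist)]

# Toy data (hypothetical): one row, printed ink at x=0, differences at x=0, 1 and 4.
reference_ink = [[1, 0, 0, 0, 0, 0]]
diff = [[90, 80, 0, 0, 70, 0]]
cleaned = mask_unsafe(diff, reference_ink)
```

Only the difference at x=4 survives, because it lies in a safe zone. The responses at x=0 and x=1 sit on or next to printed ink and are treated as misalignment noise.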
Conference Paper
In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
... The PERFECT project at Lehigh University pioneered the application of document analysis techniques to automating the interpretation of voter marks on optical scan ballots [16,14,9], though they do not look at detecting write-ins. ...
Conference Paper
Optical scan ballot systems are widely used in elections today. However, deployed optical scan systems may not always interpret write-in votes correctly. For instance, if a voter writes in a name but forgets to shade in the corresponding voting target, an optical scanner may not detect the write-in, which could lead to a lost vote. In this paper, we study methods for automatic recognition of write-in marks. We then apply these methods to ballots from an election in Leon County, Florida and study the kinds of write-in marks that are seen in practice. Our results from this election show that voters frequently (about 49% of the time) do not fill in the write-in bubble when entering a write-in vote. Consequently, votes may be lost in current voting systems.
Article
Optical scan voting systems are ubiquitous. Unfortunately, optical scan technology is vulnerable to failures that can result in miscounted votes and lost confidence. While manual counts may be able to detect these failures, counting all the ballots by hand is in many situations impractical and prohibitively expensive. In this paper, we present a novel approach for examining a large set of ballot images to verify that they were properly interpreted by the opscan system. Our system allows the user to simultaneously inspect and verify many ballot images at once. In this way, our scheme is significantly more efficient than manually recounting or inspecting ballots one at a time, providing the accuracy associated with human inspection at reduced cost. We evaluate our approach on approximately 30,000 ballots cast in the June 2008 Humboldt County Primary Election and demonstrate that our approach improves the efficiency of human verification of ballot images by an order of magnitude.
Article
As in other document imaging applications, cameras may replace scanners in op-scan election systems. Current commercial op-scan devices based on optical scanners have intrinsic limitations because they must incorporate a paper-transport or an optical-assembly translation mechanism. Recent improvements in consumer-grade camera technology allow document-size imaging with a spatial sampling rate, point-spread function, and geometric fidelity sufficient for extracting hand-printed marks from ballots. Expected advantages over current election technology include higher reliability, greater flexibility with respect to ballot formats, and lower cost and power consumption.
Conference Paper
As a result of well-publicized security concerns with direct recording electronic (DRE) voting, there is a growing call for systems that employ some form of paper artifact to provide a verifiable physical record of a voter's choices. In this paper, we present a system we are developing to support a multi-institution, cross-disciplinary research project examining issues that arise when paper ballots are used in elections. We survey the motivating factors behind our work, discuss the special constraints raised in processing ballots as opposed to more general document images, and describe the current status of our system.
Conference Paper
A new technique is presented for quickly identifying global affine transformations applied to tabular document images, and to correct for those transformations. This technique, based on the Fourier-Mellin transform, is used to register (align) a set of tabular documents to each other. Each component of the affine transform is handled separately, which dramatically reduces the total parameter space of the problem. This method is robust, and deals with all components of the affine transform in a uniform way. The Fourier-Mellin transform is also extended to handle shear, which can approximate a small amount of perspective distortion, and to not need Blackman windowing for document images. In order to limit registration to foreground pixels only, and to eliminate Fourier "edge effects", a novel, locally adaptive foreground-background segmentation algorithm is introduced, based on the median filter. An original method is also presented for automatically obtaining blank document templates from a set of registered document images. Finally, image registration is demonstrated as a tool for compression of document images which share the same template.
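A locally adaptive, median-based foreground-background split can be sketched as below. The abstract does not describe its algorithm, so this is a generic stand-in: the window radius, the `margin`, and the rule "foreground if darker than the local median by at least margin" are assumptions.

```python
from statistics import median

def median_segment(img, radius=1, margin=30):
    """Label a pixel foreground (1) if it is darker than the median of its
    neighbourhood by more than margin. The median tracks the local paper
    brightness, so uneven illumination is not mistaken for ink."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            if img[y][x] < median(window) - margin:
                out[y][x] = 1
    return out

# Toy data (hypothetical): an illumination gradient from 100 to 220 with one ink pixel.
page = [[100, 120, 140, 160, 180, 200, 220] for _ in range(3)]
page[1][3] = 60

segmented = median_segment(page)
foreground_count = sum(map(sum, segmented))
```

A single global threshold of 128 would label the two darkest columns of this toy page as foreground. The local median does not, and only the ink pixel is kept.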
Article
In many applications of pattern recognition, patterns appear together in groups (fields) that have a common origin. For example, a printed word is usually a field of character patterns printed in the same font. A common origin induces consistency of style in features measured on patterns. The features of patterns co-occurring in a field are statistically dependent because they share the same, albeit unknown, style. Style constrained classifiers achieve higher classification accuracy by modeling such dependence among patterns in a field. Effects of style consistency on the distributions of field-features (concatenation of pattern features) can be modeled by hierarchical mixtures. Each field derives from a mixture of styles, while, within a field, a pattern derives from a class-style conditional mixture of Gaussians. Based on this model, an optimal style constrained classifier processes entire fields of patterns rendered in a consistent but unknown style. In a laboratory experiment, style constrained classification reduced errors on fields of printed digits by nearly 25 percent over singlet classifiers. Longer fields favor our classification method because they furnish more information about the underlying style.
Article
We formalize the notion of style context, which accounts for the increased accuracy of the field classifiers reported in this journal recently. We argue that style context forms the basis of all order-independent field classification schemes. We distinguish between intraclass style, which underlies most adaptive classifiers, and interclass style, which is a manifestation of interpattern dependence between the features of the patterns of a field. We show how style-constrained classifiers can be optimized either for field error (useful for short fields like zip codes) or for singlet error (for long fields, like business letters). We derive bounds on the reduction of error rate with field length and show that the error rate of the optimal style-constrained field classifier converges asymptotically to the error rate of a style-aware Bayesian singlet classifier.
Conference Paper
The widespread use of printed forms for data acquisition makes the ability to automatically read and analyze their contents desirable. The components of a forms analysis system include conversion from paper to an image through scanning, image enhancement, document identification, data extraction, and data interpretation. This paper describes techniques for manipulating form electronic images in preparation for data interpretation. A combination feature extraction/model-based approach is used for forms registration and field extraction. The system is demonstrated on United States Internal Revenue Service forms.
Article
This correspondence discusses an extension of the well-known phase correlation technique to cover translation, rotation, and scaling. Fourier scaling properties and Fourier rotational properties are used to find scale and rotational movement. The phase correlation technique determines the translational movement. This method shows excellent robustness against random noise.
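The relations behind this approach can be written out in standard textbook form. The notation below ($f_1$, $f_2$, $F_1$, $F_2$, $M_1$, $M_2$, $x_0$, $y_0$, $a$, $\theta_0$) is chosen here for illustration and is not taken from the correspondence itself.

```latex
% Translation: if f_2 is f_1 shifted by (x_0, y_0), the Fourier shift theorem gives
f_2(x, y) = f_1(x - x_0,\, y - y_0)
\quad\Longrightarrow\quad
F_2(u, v) = F_1(u, v)\, e^{-j 2\pi (u x_0 + v y_0)} .

% The normalized cross-power spectrum keeps only the phase difference,
\frac{F_2(u, v)\, F_1^{*}(u, v)}{\left| F_2(u, v)\, F_1^{*}(u, v) \right|}
  = e^{-j 2\pi (u x_0 + v y_0)} ,
% whose inverse Fourier transform is an impulse \delta(x - x_0,\, y - y_0);
% the location of the peak is the translation.

% Rotation by \theta_0 and uniform scaling by a act on the magnitude spectra
% M_i = |F_i|, independently of any translation. In polar frequency
% coordinates (\rho, \theta):
M_2(\rho, \theta) = a^{-2}\, M_1\!\left( \rho / a,\; \theta - \theta_0 \right) ,
% so on a logarithmic radial axis both effects become shifts,
M_2(\log \rho,\, \theta) = a^{-2}\, M_1\!\left( \log \rho - \log a,\; \theta - \theta_0 \right) ,
% and (\log a, \theta_0) can be recovered by phase correlation in log-polar space.
```

In practice the rotation and scale are estimated first from the log-polar magnitude spectra, one image is corrected accordingly, and ordinary phase correlation then yields the remaining translation.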
Article
Recent advances in intelligent character recognition are enabling us to address many challenging problems in document image analysis. One of them is intelligent form analysis. This paper describes a generic system for form dropout when the filled-in characters or symbols are either touching or crossing the form frames. We propose a method to separate these characters from form frames whose locations are unknown. Since some of the character strokes are either touching or crossing the form frames, we need to address the following three issues: 1) localization of form frames; 2) separation of characters and form frames; and 3) reconstruction of broken strokes introduced during separation. The form frame is automatically located by finding long straight lines based on the block adjacency graph. Form frame separation and character reconstruction are implemented by means of this graph. The proposed system includes form structure learning and form dropout. First, a form structure-based template is automatically generated from a blank form which includes form frames, preprinted data areas and skew angle. With this form template, our system can then extract both handwritten and machine-typed filled-in data. Experimental results on three different types of forms show the performance of our system. Further, the proposed method is robust to noise and skew that is introduced during scanning.
The PERFECT Project. RPI Synthetic Ballots. http