Conference Paper

Mixed handwritten and printed digit recognition in Sudoku with Convolutional Deep Belief Network

... In 2015, Kamal et al. published a comparative analysis of Sudoku image processing and puzzle solving using backtracking, genetic algorithms, etc.; they used a camera-based OCR technique (Kamal et al. 2015). Also in 2015, Baptiste Wicht and Jean Hennebert proposed a method for handwritten and printed digit recognition using a convolutional deep belief network (Wicht and Hennebert 2015), which extends the same authors' earlier work on deep belief networks. It is handy for detecting the grid along with the cell numbers (Wicht and Hennebert 2014). ...
Book
This book targets an audience with a basic understanding of deep learning, its architectures, and its application in the multimedia domain. A background in machine learning is helpful in exploring various aspects of deep learning. Deep learning models have had a major impact on multimedia research and have raised the performance bar substantially in many of the standard evaluations. Moreover, new multi-modal challenges are tackled, which older systems would not have been able to handle. However, it is very difficult to comprehend, let alone guide, the process of learning in deep neural networks; there is an air of uncertainty about exactly what and how these networks learn. By the end of the book, readers will have an understanding of different deep learning approaches, models, and pre-trained models, and familiarity with the implementation of various deep learning algorithms using various frameworks and libraries.
Chapter
Intelligent vehicle systems (IVS) are being designed to improve the safety, convenience, and lifestyle of society. At the same time, they aim to improve driving behavior in order to minimize traffic-related issues. Artificial intelligence assists such autonomous systems; it is no longer restricted to software data, and its functionality is used for decision making in various phases of the IVS in dynamic road environments. One such phase, lane detection, plays a significant role in IVS, especially through various sensors. Here, a vision-based sensor mechanism is employed, which detects lane markings on structured roads. For this purpose, traditional image processing techniques are applied to keep the computation less complex, and the public KITTI dataset is used. The proposed scheme effectively identifies various lane markings on the road under normal driving conditions.
... this work, only a limited dataset is classified with normal classification. Wicht et al. [8] proposed a method for recognizing Sudoku puzzles that contain both printed and handwritten digits, in which the puzzles are located in the images using various image processing techniques, including the Hough transform and contour detection; to extract higher-level features from raw pixels, they used a convolutional deep belief network. The system was tested on a dataset of 200 Sudoku images. ...
Article
In the proposed work, Malayalam handwritten characters are classified using 80 class labels with 1000 instances for each class. Achieving high recognition accuracy on handwritten text is a challenging and inexhaustible research problem. The factors that pose challenges in handwritten character recognition include the high degree of variability in writing, especially in Malayalam handwritten script, and the complex, curved nature of the script and document type. For classification, a modified CNN architecture is proposed, which achieves an accuracy of 99.55%.
... If the centroid lies within the x and y coordinates of a square, it takes the row and column number of that square. The numbers present in the image are then recognized and classified using OCR [10] [11]. OCR yields highly accurate results provided that the noise around the characters in the image is minimal. ...
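The centroid-to-cell assignment described in this excerpt can be sketched as follows. This is only an illustration, not the cited authors' code: it assumes an axis-aligned, square grid whose top-left corner and side length are already known, and the function name `cell_of_centroid` and its parameters are hypothetical.

```python
def cell_of_centroid(cx, cy, grid_x, grid_y, grid_size, n=9):
    """Map a digit centroid (cx, cy) to the (row, col) of the grid cell containing it.

    Assumes (hypothetically) an axis-aligned square grid with top-left corner
    (grid_x, grid_y) and side length grid_size, split into n x n equal cells.
    Returns None when the centroid falls outside the grid.
    """
    cell = grid_size / n                   # side length of one cell
    col = int((cx - grid_x) // cell)       # x offset selects the column
    row = int((cy - grid_y) // cell)       # y offset selects the row
    if 0 <= row < n and 0 <= col < n:
        return row, col
    return None
```

A real image would first need perspective correction so that the axis-aligned assumption holds.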
... Specifically, this idea provides a better approach to (pre)train each layer in turn, initially using a local unsupervised criterion [36] with the aim of learning to produce useful higher-level representations from lower-level-representation output of the previous layer, which leads to much better solutions in terms of generalization performance. Due to such characteristics, DBNs and SDAs were successfully implemented in many nonlinear systems like dimensionality reduction [37][38][39], time-series forecasting [40][41][42], acoustic modeling [43][44][45], and digit recognition [46][47][48]. Therefore, we think the above-mentioned algorithms also have the potential to be applied in urban-sprawl simulations. ...
Article
An effective simulation of the urban sprawl in an urban agglomeration is conducive to making regional policies. Previous studies verified the effectiveness of the cellular-automata (CA) model in simulating urban sprawl, and emphasized that the definition of transition rules is the key to the construction of the CA model. However, existing simulation models based on CA are limited in defining complex transition rules. The aim of this study was to investigate the capability of two unsupervised deep-learning algorithms, deep-belief networks (DBN) and stacked denoising autoencoders (SDA), to define transition rules in order to obtain more accurate simulated results. Choosing the Beijing–Tianjin–Tangshan urban agglomeration as the study area, two proposed models (DBN–CA and SDA–CA) were implemented in this area for simulating its urban sprawl during 2000–2010. Additionally, two traditional machine-learning-based CA models were built for comparative experiments. The implementation results demonstrated that integrating CA with unsupervised deep-learning algorithms is more suitable and accurate than traditional machine-learning algorithms on both the cell level and pattern level. Meanwhile, compared with the DBN–CA, the SDA–CA model had better accuracy in both aspects. Therefore, the unsupervised deep-learning-based CA model, especially SDA–CA, is a novel approach for simulating urban sprawl and also potentially for other complex geographical phenomena.
... Deep Learning Library (DLL) is a Machine Learning library originally focused on RBM and CRBM support. It was developed and used in the context of several research works [29][30][31][32]. It also has support for various neural network layers and backpropagation techniques. ...
Chapter
Deep Learning Library (DLL) is a library for machine learning with deep neural networks that focuses on speed. It supports feed-forward neural networks such as fully-connected Artificial Neural Networks (ANNs) and Convolutional Neural Networks (CNNs). Our main motivation for this work was to propose and evaluate novel software engineering strategies with potential to accelerate runtime for training and inference. Such strategies are mostly independent of the underlying deep learning algorithms. On three different datasets and for four different neural network models, we compared DLL to five popular deep learning libraries. Experimentally, it is shown that the proposed library is systematically and significantly faster on CPU and GPU. In terms of classification performance, similar accuracies as the other libraries are reported.
Chapter
The motivation behind the paper is to give a single-shot solution to Sudoku puzzles using computer vision. This study's purpose is twofold: first, to recognise the puzzle using a deep belief network, which is very useful for extracting high-level features; and second, to solve the puzzle using a parallel rule-based technique and an efficient ant colony optimization method. Each of the two methods can solve this NP-complete puzzle, but individually they lack efficiency, so we serialised the two techniques to solve any puzzle efficiently, with less time and fewer iterations.
Chapter
Many people try solving Sudoku puzzles every day. These puzzles are usually found in newspapers, magazines and so on. Whenever a person is unable to solve a puzzle or is running short on time, it would be very convenient to show the solved puzzle as augmented reality. Objectives: This paper proposes an optimal way of recognizing a Sudoku puzzle using computer vision and deep learning, and of solving the puzzle using constraint programming and a backtracking algorithm, in order to display the solved puzzle as augmented reality. A comparative performance analysis against previous work is also provided at the end of the paper. Methods: To implement augmented reality on the Sudoku puzzle, image classification alone is not sufficient, as the solved puzzle has to be shown on top of the area of the unsolved puzzle in the original image. Puzzle detection therefore has to be performed, for which the proposed work uses CNN and object localization algorithms. After detection, the values detected in each cell of the 9 × 9 grid are stored, a constraint programming and backtracking algorithm is run to solve the puzzle, and finally the detected empty cells are filled with the correct values from the solved puzzle. Applications/Improvements: The Sudoku puzzles found in newspapers and magazines are usually surrounded by a lot of noise, such as text (characters) irrelevant to the puzzle and newspaper borders that can resemble the structure of a Sudoku puzzle. This paper emphasizes how to handle such disturbances and improve performance.
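The final step of this abstract, filling only the cells detected as empty, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: grids are assumed to be lists of lists with 0 marking an empty cell, and the function name `cells_to_fill` is hypothetical.

```python
def cells_to_fill(detected, solved):
    """Return (row, col, digit) for every cell that was empty (0) in the detected grid.

    Both grids are lists of rows (an assumed representation). A ValueError is
    raised if a recognized digit disagrees with the solved grid, since that
    would mean the solution does not belong to this puzzle.
    """
    to_draw = []
    for r, row in enumerate(detected):
        for c, value in enumerate(row):
            if value == 0:
                to_draw.append((r, c, solved[r][c]))   # digit to overlay here
            elif value != solved[r][c]:
                raise ValueError(f"given digit at ({r}, {c}) contradicts the solution")
    return to_draw
```

An augmented-reality renderer would then draw each returned digit at the image position of its cell, leaving the printed digits untouched.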
Book
This book highlights recent advances in Cybernetics, Machine Learning and Cognitive Science applied to Communications Engineering and Technologies, and presents high-quality research conducted by experts in this area. It provides a valuable reference guide for students, researchers and industry practitioners who want to keep abreast of the latest developments in this dynamic, exciting and interesting research field of communication engineering, driven by next-generation IT-enabled techniques. The book will also benefit practitioners whose work involves the development of communication systems using advanced cybernetics, data processing, swarm intelligence and cyber-physical systems; applied mathematicians; and developers of embedded and real-time systems. Moreover, it shares insights into applying concepts from Machine Learning, Cognitive Science, Cybernetics and other areas of artificial intelligence to wireless and mobile systems, control systems and biomedical engineering.
Article
In this paper, we propose a method to detect and recognize a Sudoku puzzle on images taken from a mobile camera. The lines of the grid are detected with a Hough transform. The grid is then recomposed from the lines. The digit positions are extracted from the grid and finally, each character is recognized using a Deep Belief Network (DBN). To test our implementation, we collected and made public a dataset of Sudoku images coming from cell phones. Our method proved successful on our dataset, achieving 87.5% of correct detection on the testing set. Only 0.37% of the cells were incorrectly guessed. The algorithm is capable of handling some alterations of the images, often present on phone-based images, such as distortion, perspective, shadows, illumination gradients or scaling. On average, our solution is able to produce a result from a Sudoku in less than 100ms.
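The Hough transform mentioned in this abstract lets each edge pixel vote for every line (rho, theta) that could pass through it; grid lines then show up as peaks in the vote table. The following is a minimal pure-Python sketch of that voting idea, not the authors' implementation. The function name `hough_peaks`, the one-pixel rho bins, and the one-degree angle step are assumptions chosen for illustration.

```python
import math
from collections import Counter


def hough_peaks(points, n_theta=180, top_k=2):
    """Vote in (rho, theta) space and return the top_k strongest lines.

    A line is parameterized as rho = x*cos(theta) + y*sin(theta).
    rho is binned to the nearest integer and theta takes n_theta evenly
    spaced values in [0, pi). Returns (rho_bin, theta_radians, votes) tuples.
    """
    votes = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho), t)] += 1    # one vote per candidate line
    return [
        (rho_bin, math.pi * t / n_theta, count)
        for (rho_bin, t), count in votes.most_common(top_k)
    ]


# Usage: edge points from one horizontal and one vertical line.
points = {(x, 20) for x in range(50)} | {(30, y) for y in range(50)}
for rho, theta, count in hough_peaks(points):
    print(rho, round(math.degrees(theta)), count)
```

In practice the input points would come from an edge detector, and nearby peaks would need to be merged before the grid is rebuilt from the lines.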
Article
We describe how to train a two-layer convolutional Deep Belief Network (DBN) on the 1.6 million tiny images dataset. When training a convolutional DBN, one must decide what to do with the edge pixels of the images. As the pixels near the edge of an image contribute to the fewest convolutional filter outputs, the model may
Conference Paper
In this paper we propose a method of detecting and recognizing the elements of a Sudoku Puzzle and providing a digital copy of the solution for it using MATLAB. The method involves a vision-based sudoku solver. The solver is capable of solving a sudoku directly from an image captured from any digital camera. After applying appropriate pre-processing to the acquired image we use efficient area calculation techniques to recognize the enclosing box of the puzzle. A virtual grid is then created to identify the digit positions. Template matching is used as a method for digit recognition. The actual solution is computed using a backtracking algorithm. Experiments conducted on various types of sudoku questions demonstrate the efficiency and robustness of our proposed approaches in real-world scenarios. The algorithm is found to be capable of handling cases of translation, perspective, illumination gradient, scaling, and background clutter.
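The backtracking step named in this abstract can be sketched in a few lines. The paper's solver is written in MATLAB and its details are not given here, so the Python below is only a generic illustration of backtracking; the names `is_valid`, `solve`, and `EXAMPLE`, and the use of 0 for empty cells, are assumptions.

```python
def is_valid(grid, r, c, d):
    """Check whether digit d can be placed at (r, c) without a row, column, or box clash."""
    if any(grid[r][j] == d for j in range(9)):
        return False
    if any(grid[i][c] == d for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)    # top-left corner of the 3x3 box
    return all(grid[i][j] != d
               for i in range(br, br + 3)
               for j in range(bc, bc + 3))


def solve(grid):
    """Fill the 9x9 grid in place (0 = empty). Returns True if a solution was found."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if is_valid(grid, r, c, d):
                        grid[r][c] = d          # tentative placement
                        if solve(grid):
                            return True
                        grid[r][c] = 0          # undo and try the next digit
                return False                    # no digit fits: backtrack
    return True                                 # no empty cell left


# A sample puzzle as it might come out of the recognition stage.
EXAMPLE = [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    [0, 9, 8, 0, 0, 0, 0, 6, 0],
    [8, 0, 0, 0, 6, 0, 0, 0, 3],
    [4, 0, 0, 8, 0, 3, 0, 0, 1],
    [7, 0, 0, 0, 2, 0, 0, 0, 6],
    [0, 6, 0, 0, 0, 0, 2, 8, 0],
    [0, 0, 0, 4, 1, 9, 0, 0, 5],
    [0, 0, 0, 0, 8, 0, 0, 7, 9],
]
```

Calling `solve` on a copy of `EXAMPLE` fills the copy in place. If a digit was misrecognized so that the givens contradict each other, `solve` returns False, which gives a simple check on the recognition stage.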
Article
In order to highlight the interesting problems and actual results on the state of the art in optical character recognition (OCR), this paper describes and compares preprocessing, feature extraction and postprocessing techniques for commercial reading machines. Problems related to handwritten and printed character recognition are pointed out, and the functions and operations of the major components of an OCR system are described. Historical background on the development of character recognition is briefly given and the working of an optical scanner is explained. The specifications of several recognition systems that are commercially available are reported and compared.
Article
LIBSVM is a library for support vector machines (SVM). Its goal is to help users to easily use SVM as a tool. In this document, we present all its implementation details. For the use of LIBSVM, the README file included in the package and the LIBSVM FAQ provide the information.
Article
The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning for document image acquisition. Digital cameras attached to cellular phones, PDAs, or wearable computers, and standalone image or video devices are highly mobile and easy to use; they can capture images of thick books, historical manuscripts too fragile to touch, and text in scenes, making them much more versatile than desktop scanners. Should robust solutions to the analysis of documents captured with such devices become available, there will clearly be a demand in many domains. Traditional scanner-based document analysis techniques provide us with a good reference and starting point, but they cannot be used directly on camera-captured images. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction of the content and background. In this paper we present a survey of application domains, technical challenges, and solutions for the analysis of documents captured by digital cameras. We begin by describing typical imaging devices and the imaging process. We discuss document analysis from a single camera-captured image as well as multiple frames and highlight some sample applications under development and feasible ideas for future development.