Article

Automation of complex text CAPTCHA recognition using conditional generative adversarial networks

Authors:
  • St. Petersburg Federal Research Center of the Russian Academy of Sciences


Chapter
Full-text available
This article addresses the problem of predicting the expression of a user's personality traits, as measured by the Life Style Index questionnaire, by analyzing the graphical content published in the user's social media account. The proposed approach identifies faces in photos from users' accounts and uses them to assess the expression of types of psychological defense. The solution tests the hypothesis that the presence of a face in an image and its position can carry the features sought. The method consists of transfer learning: a related deep neural network model for emotion recognition is retrained on pairs of faces contained in a digital image and the expression of indicators of psychological defense mechanisms. The accuracy of the predictions obtained with the new model is more than twice that of the baseline. The theoretical and practical significance is twofold: a new approach is formed that differs from known ones in the data involved in the analysis, and a neural network model is built that estimates the severity of a user's psychological defenses from the images the user publishes in social media. In the future, this will indirectly make it possible to estimate the user's protection from social engineering attacks.
Keywords: Artificial intelligence, Machine learning, Transfer learning, Image processing, Social engineering attacks, Information security, Personality traits, Social media
Article
Full-text available
Deep Residual Networks have recently been shown to significantly improve the performance of neural networks trained on ImageNet, with results beating all previous methods on this dataset by large margins in the image classification task. However, the meaning of these impressive numbers and their implications for future research are not fully understood yet. In this survey, we will try to explain what Deep Residual Networks are, how they achieve their excellent results, and why their successful implementation in practice represents a significant advance over existing techniques. We also discuss some open questions related to residual learning as well as possible applications of Deep Residual Networks beyond ImageNet. Finally, we discuss some issues that still need to be resolved before deep residual learning can be applied on more complex problems.
Article
Full-text available
A Completely Automated Public Turing Test to tell Computers and Humans Apart (CAPTCHA) is used in web systems to secure authentication; it can be broken using Optical Character Recognition (OCR)-type methods. CAPTCHA breakers make web systems highly insecure. However, techniques for breaking CAPTCHAs also show CAPTCHA designers where their designs need improvement to prevent computer vision-based malicious attacks. Research so far has primarily used deep learning methods to break state-of-the-art CAPTCHA codes; however, the validation schemes and conventional Convolutional Neural Network (CNN) designs still need more reliable validation and feature schemes that cover multiple aspects. Several public datasets of text-based CAPTCHAs are available, including on Kaggle and other dataset repositories, as are tools for self-generating CAPTCHA datasets. Previous studies are dataset-specific and do not perform well on other CAPTCHAs. Therefore, the proposed study uses two publicly available datasets of 4- and 5-character text-based CAPTCHA images to build a CAPTCHA solver based on a skip-connection CNN model. The study applied 5-fold validation, which yields 10 different CNN models across the two datasets, with promising results compared to other studies.
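The 5-fold scheme mentioned in this abstract (five folds on each of two datasets, hence ten trained models) can be sketched as follows. This is a minimal illustration, not the study's code: the helper name `k_fold_indices` is hypothetical, and it assumes contiguous, unshuffled folds, whereas a real pipeline would usually shuffle first.

```python
def k_fold_indices(num_samples, k=5):
    """Yield (train_indices, validation_indices) for each of k folds.

    Assumption: folds are contiguous and unshuffled; any remainder is
    spread over the first folds so sizes differ by at most one.
    """
    indices = list(range(num_samples))
    fold_sizes = [num_samples // k + (1 if i < num_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# One model per fold and per dataset: 5 folds x 2 datasets = 10 models.
num_models = sum(1 for _dataset in range(2) for _ in k_fold_indices(100, k=5))
```

Each sample lands in the validation set exactly once, so every model is evaluated on data it never saw during training.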
Article
Full-text available
CAPTCHA technology, which covers both CAPTCHA design and CAPTCHA recognition, can be used to ensure the security of big multimedia data. Because existing methods struggle to achieve high breaking accuracy on complex handwritten text CAPTCHAs, a handwritten CAPTCHA recognizer is proposed: a text CAPTCHA breaking method based on a style transfer network. First, in contrast to the traditional view in this field that the font structure and font style of characters are inseparable, a new idea of separating font structure from font style is proposed, and it is pointed out that character recognition depends mainly on font structure rather than font style. Second, based on this idea, a style transfer network for text CAPTCHAs is constructed to convert complex and variable handwritten CAPTCHAs into easy-to-recognize printed CAPTCHAs. Finally, a text CAPTCHA recognition network based on a deep convolutional neural network is constructed to identify the converted printed CAPTCHAs. On CAPTCHAs from three real websites (eBay, Google, and reCAPTCHA), experimental results show that the recognizer achieves higher breaking accuracy for handwritten CAPTCHAs than the methods proposed at NDSS'16, at CCS'18, and in "Science" in 2017.
Article
Full-text available
This paper suggests improving the previously existing method of identifying user profiles in different online social networks by adding face recognition results to the model. It is assumed that the method will become more stable for identifying people with the same name, city and age. It will help to find more user profiles in different online social networks, which will improve the estimation of their personal characteristics. Evaluating user personality traits is one of the key tasks in protecting employees of enterprises and companies from social engineering attacks.
Article
Full-text available
Manual data annotation is a time-consuming activity. This paper presents a novel strategy for automatically training a CAPTCHA breaking system with no manual dataset creation. We demonstrate the feasibility of an attack against a text-based CAPTCHA scheme that uses network infrastructure similar to that used for Denial of Service attacks. The main goal of our research is to present a possible vulnerability in CAPTCHA systems when a brute-force attack is combined with transfer learning. The classification step uses a simple convolutional neural network with 15 layers. The training stage uses an automatically prepared dataset, created without any human intervention, and transfer learning to fine-tune the deep neural network classifier. The designed system for breaking text-based CAPTCHAs achieved 80% classification accuracy after 6 fine-tuning steps for a 5-digit text-based CAPTCHA system. The results presented in this paper suggest that even a simple attack with a large number of attacking computers can be an effective alternative to current CAPTCHA breaking systems.
Article
Full-text available
To distinguish between computers and humans, CAPTCHAs are widely used in steps such as website login and registration. Traditional CAPTCHA recognition methods have poor recognition ability and poor robustness across different types of verification codes. For this reason, the paper proposes a CAPTCHA recognition method based on a convolutional neural network with the focal loss function. The method improves the traditional VGG network structure and introduces the focal loss function to produce a new CAPTCHA recognition model. First, we perform preprocessing such as grayscale conversion, binarization, denoising, segmentation, and annotation, and then use the Keras library to build a simple neural network model. In addition, we build an end-to-end neural network model for recognizing complex CAPTCHAs with heavily touching characters and more interfering pixels. Tests on the CNKI CAPTCHA, the Zhengfang CAPTCHA, and randomly generated CAPTCHAs show that the proposed method has better recognition performance and robustness on the three datasets and has certain advantages over traditional deep learning methods. The recognition rates are 99%, 98.5%, and 97.84%, respectively.
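For readers unfamiliar with the focal loss named in this abstract, the standard single-sample form is FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The sketch below is an illustration only: the function names and the default values `gamma=2.0` and `alpha=1.0` are assumptions, since the abstract does not state which settings the paper used.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Standard cross-entropy for one sample, given the probability
    the model assigns to the true class."""
    return -math.log(max(p_true, 1e-12))


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Focal loss for one sample (gamma and alpha defaults are assumed,
    not taken from the paper).

    The factor (1 - p)^gamma shrinks the loss of well-classified samples,
    so training concentrates on hard ones. gamma = 0 recovers plain
    (alpha-weighted) cross-entropy.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


# Relative weight of an easy (p = 0.9) and a hard (p = 0.1) sample
# compared with cross-entropy, for gamma = 2.
easy_ratio = focal_loss(0.9) / cross_entropy(0.9)  # about 0.01
hard_ratio = focal_loss(0.1) / cross_entropy(0.1)  # about 0.81
```

With gamma = 2, an easy sample contributes roughly one hundredth of its cross-entropy loss, while a hard sample keeps most of it.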
Article
Full-text available
Deep Convolutional Neural Network (CNN) is a special type of Neural Networks, which has shown exemplary performance on several competitions related to Computer Vision and Image Processing. Some of the exciting application areas of CNN include Image Classification and Segmentation, Object Detection, Video Processing, Natural Language Processing, and Speech Recognition. The powerful learning ability of deep CNN is primarily due to the use of multiple feature extraction stages that can automatically learn representations from the data. The availability of a large amount of data and improvement in the hardware technology has accelerated the research in CNNs, and recently interesting deep CNN architectures have been reported. Several inspiring ideas to bring advancements in CNNs have been explored, such as the use of different activation and loss functions, parameter optimization, regularization, and architectural innovations. However, the significant improvement in the representational capacity of the deep CNN is achieved through architectural innovations. Notably, the ideas of exploiting spatial and channel information, depth and width of architecture, and multi-path information processing have gained substantial attention. Similarly, the idea of using a block of layers as a structural unit is also gaining popularity. This survey thus focuses on the intrinsic taxonomy present in the recently reported deep CNN architectures and, consequently, classifies the recent innovations in CNN architectures into seven different categories. These seven categories are based on spatial exploitation, depth, multi-path, width, feature-map exploitation, channel boosting, and attention. Additionally, the elementary understanding of CNN components, current challenges, and applications of CNN are also provided.
Conference Paper
Full-text available
Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at https://github.com/liuzhuang13/DenseNet.
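The connection counts quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the `growth_rate` parameter (the number of feature maps each layer adds) comes from the DenseNet paper rather than from the abstract itself.

```python
def plain_connections(num_layers: int) -> int:
    """A traditional feed-forward network: one connection per layer."""
    return num_layers


def dense_connections(num_layers: int) -> int:
    """A dense network: layer i receives inputs from all i earlier
    sources (network input plus preceding layers), i.e. 1 + 2 + ... + L
    = L(L+1)/2 direct connections."""
    return num_layers * (num_layers + 1) // 2


def dense_block_input_channels(in_channels: int, growth_rate: int,
                               num_layers: int) -> list:
    """Input channel count seen by each layer when every layer
    concatenates the feature maps of all preceding layers.
    Assumes each layer emits exactly `growth_rate` feature maps."""
    return [in_channels + i * growth_rate for i in range(num_layers)]


# Example: a 5-layer network has 5 plain connections but 15 dense ones.
counts = (plain_connections(5), dense_connections(5))
channels = dense_block_input_channels(in_channels=16, growth_rate=12,
                                      num_layers=4)
```

The linear growth of input channels is why each layer can stay narrow, which is consistent with the abstract's claim of a reduced parameter count.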
Article
Full-text available
We propose a new framework for estimating generative models via an adversarial process, in which we simultaneously train two models: a generative model G that captures the data distribution, and a discriminative model D that estimates the probability that a sample came from the training data rather than G. The training procedure for G is to maximize the probability of D making a mistake. This framework corresponds to a minimax two-player game. In the space of arbitrary functions G and D, a unique solution exists, with G recovering the training data distribution and D equal to 1/2 everywhere. In the case where G and D are defined by multilayer perceptrons, the entire system can be trained with backpropagation. There is no need for any Markov chains or unrolled approximate inference networks during either training or generation of samples. Experiments demonstrate the potential of the framework through qualitative and quantitative evaluation of the generated samples.
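The minimax game described in this abstract is usually written as the value function below. The notation is an assumption borrowed from the standard presentation of the framework, not from the abstract: p_data is the data distribution, p_z the noise prior fed to G, and p_g the distribution of generated samples.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_g(x)}

% so when G recovers the data distribution (p_g = p_{\mathrm{data}}),
% D^{*}(x) = 1/2 everywhere, matching the unique solution stated above.
```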
Article
This article is devoted to web scraping (web crawling) technologies for Node.js, used in the task of aggregating information and estimating the parameters of cargo routes by extracting data from open sources. The challenge of web scraping arises in many different contexts, both scientific and industrial. Web scraping tasks have both wide practical applications and a significant educational aspect. However, the existing material on web scraping is scattered and unstructured. In this paper, using the example of the scientific and technical problem of aggregating information and evaluating the parameters of cargo routes by extracting data from open sources, we present an overview of web scraping technologies for Node.js, describe a classification of sites by complexity, systematize the site features that are obstacles to scraping, and give possible ways to bypass them. The didactic goal of this article, to systematize the material on parsing websites, is thereby achieved.
Article
Text-based captchas are still widely used by many websites such as Wikipedia and Microsoft despite the emergence of many alternative captchas. Recently, the design of text-based captchas has become more and more complex to resist attacks from automatic cracking programs. However, most of the existing captcha solving methods have certain shortcomings, such as insufficient accuracy, poor generalization performance, and the need for a large number of labeled samples. This study proposes a fast captcha solver that can effectively break text-based captchas with complex security features using a small amount of labeled data. The solver was achieved by constructing a captcha transformation model based on generative adversarial networks to simplify the captcha images before character segmentation and recognition. Results showed that the proposed captcha solver achieved a high success rate of over 96% character accuracy and 74% captcha accuracy for all evaluated schemes. Moreover, the average time to process a single captcha image using a laptop GPU was only 4–8 ms. The effectiveness of this work may encourage captcha designers to reconsider a more secure human–machine distinction mechanism.
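The gap between the two figures reported in this abstract (over 96% character accuracy but 74% captcha accuracy) follows from how the metrics are defined: a captcha counts as solved only if every character is right. The sketch below illustrates the two metrics; the function names and the toy data are hypothetical and are not taken from the study.

```python
def character_accuracy(predictions, labels):
    """Fraction of individual characters predicted correctly.
    Characters missing from a too-short prediction count as wrong."""
    correct = 0
    total = 0
    for pred, label in zip(predictions, labels):
        total += len(label)
        correct += sum(p == t for p, t in zip(pred, label))
    return correct / total


def captcha_accuracy(predictions, labels):
    """Fraction of captchas where the whole string matches exactly."""
    matches = sum(pred == label for pred, label in zip(predictions, labels))
    return matches / len(labels)


# Toy data: one wrong character out of 13 spoils one captcha out of 3.
preds = ["ab3d", "x9zq", "hello"]
truth = ["ab3d", "x9zq", "hallo"]
char_acc = character_accuracy(preds, truth)  # 12/13
capt_acc = captcha_accuracy(preds, truth)    # 2/3
```

As a rough guide, if character errors were independent, a 5-character captcha with 96% per-character accuracy would be solved about 0.96^5, or roughly 82%, of the time; real errors are not independent, so measured captcha accuracy can differ.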
Chapter
CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are automated Turing tests that identify whether the user is a human. There are several types of CAPTCHAs; most of them are based on characters. These consist of distorted character images and symbols that most humans can identify but that are difficult for a computer to recognize. CAPTCHAs are used for security purposes in various sectors. Existing alphanumeric CAPTCHAs can still be difficult even for humans to recognize. Compared with traditional English CAPTCHAs, CAPTCHAs in Indic languages such as Hindi and Bengali are harder to recognize. For this purpose, we use deep learning models such as convolutional neural networks to recognize CAPTCHAs automatically. In this paper, we use three Indic languages as Indic CAPTCHAs: two northern languages, Hindi and Bengali, and one southern language, Tamil. We generated the dataset for all three languages with the help of the Pillow library. After experiments, we observed that the model achieves reasonable accuracy for Bengali, Hindi, and Tamil.
Keywords: CAPTCHA, Recognition, Image processing, Convolution neural networks
Article
As a widely deployed security scheme, text-based completely automated public Turing tests to tell computers and humans apart (CAPTCHAs) have become increasingly unable to resist machine learning-based attacks. So far, many researchers have conducted studies on approaches for attacking text-based CAPTCHAs deployed by different companies, such as Microsoft, Amazon, and Apple, and achieved specific results. However, most of these attacks have shortcomings, such as the poor portability of attack methods, which require a series of data preprocessing steps and rely on large amounts of labeled CAPTCHAs. In this study, we propose an efficient and simple end-to-end attack method based on cycle-consistent generative adversarial networks (Cycle-GANs). Compared to previous studies, our approach significantly reduces the cost of data labeling. Additionally, this method has high portability. It can attack ordinary text-based CAPTCHA schemes only by modifying a few configuration parameters, which makes the attack easier to execute. First, we train CAPTCHA synthesizers based on the Cycle-GAN to generate some fake samples. Basic recognizers based on a convolutional recurrent neural network are trained using the fake data. Subsequently, an active transfer learning method is employed to optimize the basic recognizer utilizing tiny amounts of labeled real-world CAPTCHA samples. Our approach efficiently cracked the CAPTCHA schemes deployed by 10 popular websites, indicating that our attack method may be universal. Additionally, we analyzed the current most popular anti-recognition mechanisms. The results show that the combination of more anti-recognition mechanisms can improve the security of CAPTCHAs. However, the improvement is limited. Conversely, generating more complex CAPTCHAs may cost more resources and reduce the usability of CAPTCHAs.
Article
Text-based CAPTCHAs are the most widely used CAPTCHA scheme. Most text-based CAPTCHAs have been cracked. However, previous works have mostly relied on a series of preprocessing steps to attack text CAPTCHAs, which was complicated and inefficient. In this paper, we introduce a simple, generic and effective end-to-end attack on text CAPTCHAs without any preprocessing. Through a convolutional neural network and an attention-based recurrent neural network, our attack broke a wide range of real-world text CAPTCHAs that are deployed by the top 50 most popular websites ranked by Alexa.com. Additionally, this paper comprehensively analyzed the security of most resistance mechanisms of text-based CAPTCHAs through experiments. The experimental results prove that the anti-segmentation principle can be completely broken under deep learning attacks without any segmentation or preprocessing steps, in contrast to commonly held beliefs.
Technical Report
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively.
Conference Paper
There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net .
Article
We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0% which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
Node.js parsing technologies in the task of aggregating information and estimating the parameters of cargo routes by extracting data from open sources // Computer Tools in Education. 2021. N 3. P. 41-56
  • A A Korepanova
  • F V Bushmelev
  • A A Sabrekov
Korepanova A.A., Bushmelev F.V., Sabrekov A.A. Node.js parsing technologies in the task of aggregating information and estimating the parameters of cargo routes by extracting data from open sources // Computer Tools in Education. 2021. N 3. P. 41-56. (in Russian). https://doi.org/10.32603/2071-2340-2021-3-41-56
Deep-CAPTCHA: A deep learning based CAPTCHA solver for vulnerability assessment // ERN: Neural Networks & Related Topics (Topic)
  • Z Noury
  • M Rezaei
Noury Z., Rezaei M. Deep-CAPTCHA: A deep learning based CAPTCHA solver for vulnerability assessment // ERN: Neural Networks & Related Topics (Topic). 2020. https://doi.org/10.2139/ssrn.3633354
  • K Simonyan
  • A Zisserman
Simonyan K., Zisserman A. Very deep convolutional networks for large-scale image recognition. arXiv, 2014, arXiv:1409.1556. https://doi.org/10.48550/arXiv.1409.1556
A survey of the recent architectures of deep convolutional neural networks // Artificial Intelligence Review
  • A Khan
  • A Sohail
  • U Zahoora
  • A S Qureshi
Khan A., Sohail A., Zahoora U., Qureshi A.S. A survey of the recent architectures of deep convolutional neural networks // Artificial Intelligence Review. 2020. V. 53. N 8. P. 5455-5516. https://doi.org/10.1007/s10462-020-09825-6
  • M Mirza
  • S Osindero
Mirza M., Osindero S. Conditional generative adversarial nets. arXiv, 2014, arXiv:1411.1784. https://doi.org/10.48550/arXiv.1411.1784
Automation of consistency checking of ideals of conjuncts with truth probability estimates. Information Security of Russian Regions (ISRR-2021)
  • A Vyatkin
  • A Tulupyev
Vyatkin A., Tulupyev A. Automation of consistency checking of ideals of conjuncts with truth probability estimates. Information Security of Russian Regions (ISRR-2021). Proc. of the XII St. Petersburg Interregional Conference, 2021, pp. 330-332. (in Russian).
Application of tertiary structure of algebraic Bayesian network in the problem of a posteriori inference
  • A Vyatkin
  • M Abramov
  • N Kharitonov
  • A Tulupyev
Vyatkin A., Abramov M., Kharitonov N., Tulupyev A. Application of tertiary structure of algebraic Bayesian network in the problem of a posteriori inference. Bulletin of the South Ural State University. Series "Computational Mathematics and Computer Science", 2023, vol. 12, no. 1, pp. 61-88. (in Russian). https://doi.org/10.14529/cmse230104
Application of algebraic Bayesian networks in handwritten character recognition. Regional Informatics and Information Security
  • A Vyatkin
  • N Kharitonov
  • A Tulupyev
Vyatkin A., Kharitonov N., Tulupyev A. Application of algebraic Bayesian networks in handwritten character recognition. Regional Informatics and Information Security. Proc. of the Anniversary XVIII St. Petersburg International Conference, 2022, pp. 538-542. (in Russian).