# Alexey Potapov

SingularityNET · AGI Research Lab

PhD, DrSci

## About

- Publications: 90
- Reads: 14,819

- Citations: 237

Introduction

My main interest is Artificial General Intelligence, but I have ~20 years of experience in computer vision and pattern recognition (both purely scientific research and commercial R&D projects). Now, I'm trying to bridge the gap between efficient narrow AI methods and universal algorithmic intelligence models using probabilistic programming, metacomputations, and algorithmic information theory.

Additional affiliations

February 2018 - present

**SingularityNET**

Position

- Head of Department

January 2013 - June 2018

## Publications

Efficient pragmatic methods in artificial intelligence can be treated as results of specialization of models of universal intelligence with respect to a certain task or class of environments. Thus, specialization can help to create efficient AGI preserving its universality. This idea is promising, but has not yet been applied to concrete models. He...

Problems of decision criteria in tasks of image analysis and pattern recognition are considered. Overlearning as a practical consequence of fundamental paradoxes in inductive inference is illustrated by examples. Theoretical (based on algorithmic complexity) and practical formulations of the minimum description length (MDL) principle are given. A d...

Rational agents are usually built to maximize rewards. However, AGI agents
can find undesirable ways of maximizing any prior reward function. Therefore
value learning is crucial for safe AGI. We assume that generalized states of
the world are valuable - not rewards themselves, and propose an extension of
AIXI, in which rewards are used only to boot...

Solomonoff induction is known to be universal, but incomputable. Its
approximations, namely, the Minimum Description (or Message) Length (MDL)
principles, are adopted in practice in an efficient but non-universal form.
Recent attempts to bridge this gap led to the development of the
Representational MDL principle that originates from formal decomp...

Universal induction is a crucial issue in AGI. Its practical applicability
can be achieved by the choice of the reference machine or representation of
algorithms agreed with the environment. This machine should be updatable for
solving subsequent tasks more efficiently. We study this problem on an example
of combinatory logic as the very simple Tur...

We introduce a formal meta-language for probabilistic programming, capable of expressing both programs and the type systems in which they are embedded. We are motivated here by the desire to allow an AGI to learn not only relevant knowledge (programs/proofs), but also appropriate ways of reasoning (logics/type systems). We draw on the frameworks of...

We consider homotopy type theory (HoTT) as a possible basis for Artificial General Intelligence (AGI) and study how it will frame the traditional problems of symbolic Artificial Intelligence (AI), which are not avoided, but can be addressed in a constructive way. We conclude that HoTT is suitable for building a language of a cognitive architecture,...

This book constitutes the refereed proceedings of the 14th International Conference on Artificial General Intelligence, AGI 2021, held as a hybrid event in San Francisco, CA, USA, in October 2021.
The 36 full papers presented in this book were carefully reviewed and selected from 50 submissions. The papers cover topics from foundations of AGI, to A...

Many benchmarks and challenges for AI and AGI exist, which help to reveal both short- and long-term topics and directions of research. We analyze elementary school Olympiad math tasks as a possible benchmark for AGI that can occupy a certain free niche capturing some limitations of the existing neural and symbolic systems better than other existing...

This paper introduces a new algorithm for unsupervised learning of keypoint detectors and descriptors, which demonstrates fast convergence and good performance across different datasets. The training procedure uses homographic transformation of images. The proposed model learns to detect points and generate descriptors on pairs of transformed image...

This book constitutes the refereed proceedings of the 13th International Conference on Artificial General Intelligence, AGI 2020, held in St. Petersburg, Russia, in September 2020.
The 30 full papers and 8 short papers presented in this book were carefully reviewed and selected from 60 submissions. The papers cover topics such as AGI architectures,...

The necessity for neural-symbolic integration becomes evident as more complex problems like visual question answering are beginning to be addressed, which go beyond such limited-domain tasks as classification. Many existing state-of-the-art models are designed for a particular task or even benchmark, while general-purpose approaches are rarely appl...

Probabilistic logic reasoning is a central component of such cognitive architectures as OpenCog. However, as an integrative architecture, OpenCog facilitates cognitive synergy via hybridization of different inference methods. In this paper, we introduce a differentiable version of Probabilistic Logic Networks, whose rules operate over tensor truth...

Cross-dataset transfer learning is an important problem in person re-identification (Re-ID). Unfortunately, not too many deep transfer Re-ID models exist for realistic settings of practical Re-ID systems. We propose a purely deep transfer Re-ID model consisting of a deep convolutional neural network and an autoencoder. The latent code is divided in...

In this paper we propose a conceptual framework for higher-order artificial neural networks. The idea of higher-order networks arises naturally when a model is required to learn some group of transformations, every element of which is well-approximated by a traditional feedforward network. Thus the group as a whole can be represented as a hyper net...

Image and video retrieval by their semantic content has been an important and challenging task for years, because it ultimately requires bridging the symbolic/subsymbolic gap. Recent successes in deep learning enabled detection of objects belonging to many classes greatly outperforming traditional computer vision techniques. However, deep learning...

What frameworks and architectures are necessary to create a vision system for AGI? In this paper, we propose a formal model that states the task of perception within AGI. We show the role of discriminative and generative models in achieving efficient and general solution of this task, thus specifying the task in more detail. We discuss some existin...

Person re-identification (Re-ID) is the task of matching humans across cameras with non-overlapping views that has important applications in visual surveillance. Like other computer vision tasks, this task has gained much with the utilization of deep learning methods. However, existing solutions based on deep learning are usually trained and tested...

The concept of the technological singularity is frequently reified. Futurist forecasts inferred from this imprecise reification are then criticized, and the reified ideas are incorporated in the core concept. In this paper, I try to disentangle the facts related to the technological singularity from more speculative beliefs about the possibility of...

Universal induction relies on some general search procedure that is doomed to be inefficient. One possibility to achieve both generality and efficiency is to specialize this procedure w.r.t. any given narrow task. However, complete specialization that implies direct mapping from the task parameters to solutions (discriminative models) without searc...

The problem of representing and learning complex visual stimuli in the context of modeling the process of conditional reflex formation is considered. The generative probabilistic framework is chosen which has been recently successfully applied to cognitive modeling. A model capable of learning different visual stimuli is developed in the form of a...

This study presents an analysis of the causes of insufficient efficiency of the nearest neighbor method, compared with deep learning networks. The primary cause is the incorrect use of Euclidean distance to the nearest neighbor for estimating the distance from the analyzed pattern to a region occupied by a class. To overcome this problem, it is nec...

The possibility of solving the problem of planning and plan recovery for robots using probabilistic programming with optimization queries, which is being developed as a framework for AGI and cognitive architectures, is considered. Planning can be done directly by introducing a generative model for plans and optimizing an objective function calculated via...

Subject of Research. The paper deals with the process of visual concept building based on two unlabeled sources of information (visual and textual). Method. Visual concept-based learning is carried out through the simultaneous conjunction of image patterns and lexical elements. Concept-based learning consists of two basic stages: early learning acquisition (p...

Probabilistic programming is considered as a framework in which basic components of cognitive architectures can be represented in a unified and elegant fashion. At the same time, the necessity of adopting some components of cognitive architectures to extend the capabilities of probabilistic programming languages is pointed out. In particular, implicit sp...

The problem of training autoencoders (with logistic regression as the classification layer) on sets of small sizes is considered on the example of image classification and scene categorization tasks. Conventional autoencoders with uniform priors usually fail to learn useful features from few samples. A possibility to overcome this difficulty is con...

Subject of Study. The subject of research is the information structure of internal representations of objects and operations over them, used by humans to solve the problem of mental rotation of figures. To analyze this informational structure, we considered not only classical dependencies of the correct answers on the angle of rotation, but also the other...

This paper gives an analysis of the role of generative models in image processing and computer vision. Directed and undirected graphical models (Bayesian and Markov networks) are considered, along with the possibilities of using them in image processing, in particular, to solve problems of noise filtering, segmentation, and stereo vision. Probabili...

The clustering problem is solved using probabilistic programming languages belonging to two families: languages that implement graphical models (Infer.NET) and languages that implement arbitrary computable generative models (Church). A comparison is made of the features and efficiency of the implementations. It is established that the Infer.NET language has higher accura...

Methods of simulated annealing and genetic programming over probabilistic program traces are developed for the first time. These methods combine the expressiveness of Turing-complete probabilistic languages, in which arbitrary generative models can be defined, with the search effectiveness of meta-heuristic methods. To use these methods, one should only specify a ge...

Application of the Minimum Description Length principle to optimization queries in probabilistic programming was investigated on the example of the C++ probabilistic programming library under development. It was shown that incorporation of this criterion is essential for optimization queries to behave similarly to more common queries performing sam...

This paper studies methods based on genetic programming for solving the problem of integer sequence extrapolation. In order to check the hypothesis about the influence of the expressiveness of the program representation language on prediction effectiveness, the genetic programming method based on several limited languages for recurre...

This book constitutes the refereed proceedings of the 8th International Conference on Artificial General Intelligence, AGI 2015, held in Berlin, Germany in July 2015. The 41 papers were carefully reviewed and selected from 72 submissions. The AGI conference series has played, and continues to play, a significant role in this resurgence of research...

The problem of bridging the gap between efficient but narrow methods of machine learning, and universal but inefficient methods was considered. Our main claim, which is methodologically important to the field of Artificial General Intelligence (AGI), is that neither narrow nor basic universal methods are sufficient for AGI. This claim was illustrat...

Deep learning is a promising approach for extracting useful nonlinear representations of data. However, it is usually applied with large training sets, which are not always available in practical tasks. In this paper, we consider stacked autoencoders with logistic regression as the classification layer and study their usefulness for the task of image cat...

We consider image features based on histograms of oriented gradients (HOG) with the addition of a contour curvature histogram (HOG-CH), and compare them with the results of the well-known scale-invariant feature transform (SIFT) approach as applied to the retrieval of images of smooth 3D objects.

The bidirectional heteroassociative neural network model is developed on the basis of psychophysical experiments studying processes of memorization of the hand movements’ sequence. The model qualitatively reproduces the different characteristics of human errors. The learning processes are simulated with the help of QLBAM algorithm. The revealed dif...

The possibility of practical application of algorithmic probability is analyzed using the example of the image inpainting problem, which precisely corresponds to the prediction problem. Such consideration is fruitful both for the theory of universal prediction and for practical image inpainting methods. Efficient application of algorithmic probability implies that it...

This paper discusses the criterion of algorithmic probability, which offers a general solution to the question of extrapolating symbolic strings. The given criterion extends the theoretical-informational approach based on algorithmic complexity, which is widely used to synthesize image-analysis methods. Methodological recommendations are given conc...

The optimal probabilistic approach to reinforcement learning is computationally
infeasible. Its simplification, which neglects the difference between the true
environment and its model estimated from a limited number of observations, causes
the exploration vs. exploitation problem. Uncertainty can be expressed in terms of a
probability distribution over the...

Solomonoff universal induction based on Algorithmic Probability (ALP) can be considered as the general theoretical basis for machine learning and, in particular, pattern recognition. However, its practical application encounters very difficult problems. One of them is incomputability caused by usage of the Turing-complete solution space. The Minimu...

Text line segmentation is one of the important tasks in automatic text analysis. In this paper, we propose two different approaches to estimating the orientation of a text block. The presented approaches are applicable under several assumptions based on the structure of the text lines.

The problem of formal criteria for image representations is considered.
The inability of existing universal compression algorithms to take into
account the specificity of the subject area is confirmed experimentally. A
method for lossless compression of biomedical images based on
geometric normalization (alignment) of images is suggested and
discussed.

The problem of learning complex visual stimuli in cognitive robotics
is considered. These stimuli should be selected on the basis of rules
supporting arbitrary comparisons of stimulus features with features of
other salient objects (context). A new perceptual knowledge representation
based on predicate logic is implemented to express such rules...

The paper is devoted to facial image analysis and particularly deals
with the problem of automatic evaluation of the attractiveness of human
faces. We propose a new approach for automatic construction of feature
space based on a modified principal component analysis. Input data sets
for the algorithm are the learning data sets of facial images, whi...

This letter discusses the solution of the problem of automatic erythrometry, using a modified Hough transform based on a method developed earlier for distinguishing and counting erythrocytes. The proposed method makes it possible to construct a Price–Jones curve from the images of blood smears.

Kolmogorov complexity and algorithmic probability are compared in the context of the universal algorithmic intelligence. Accuracy of time series prediction based on single best model and on averaging over multiple models is estimated. Connection between inductive behavior and multi-model prediction is established. Uncertainty as a heuristic for red...

This paper discusses a method of compressing three-dimensional biomedical images with losses, based on representing the data in the form of an octree. A modification of the method by geometrical normalization (equalization) of the image is proposed. It is shown that the proposed method has greater efficiency than the compression obtained when one e...

This paper discusses modern problems of video informatics in the area of the formation, transmission, processing, analysis, and visualization of video information. The distinguishing feature of video informatics is that it treats these problems from a unified theoretical viewpoint, and this allows the characteristics of video-information systems to...

This paper analyzes the necessity of using many information representations simultaneously in image-processing and -analysis systems. The difference between Kolmogorov-complexity and algorithmic-probability criteria when solving induction problems and making decisions is investigated. It is shown that making the optimum decisions (for example, in r...

This paper presents the results of the development and the characteristics of an optodigital complex that automatically forms, records, and processes images of biomedical objects for purposes of noninvasive diagnosis based on digital microscopy and endoscopy. The complex provides the collection, preliminary analysis, and compression of video informatio...

The minimum description length (MDL) principle is one of the well-known solutions to the overlearning problem, specifically for artificial neural networks (ANNs). Its extension, called the representational MDL (RMDL) principle, takes into account that models in machine learning are always constructed within some representation. In this paper, the optimiz...

Existing theoretical universal algorithmic intelligence models are not
practically realizable. A more pragmatic approach to artificial general
intelligence is based on cognitive architectures, which are, however,
non-universal in the sense that they can construct and use models of the
environment only from Turing-incomplete model spaces. We believe that...

A model for memorizing sequences of movements (spatial positions) based on the hetero-associative neural network is considered. A criterion for estimating correctness of memorized data is proposed as the number of iterations necessary for transition into some steady state. This criterion is in good agreement with psychophysiological facts and can b...

This paper discusses a method of constructing a depth map of a scene from a set of microscope images of the scene. For each coordinate of the image field, the number of the layer in which the variance over a window of definite size takes its maximum value is determined. It is proposed to preprocess the images before the stage of computing the varia...

A system for estimating the motion of independently moving objects observed by a moving camera is presented. It consists of feature matching and multi-body motion estimation modules. A novel set of invariant features is proposed on the basis of phase spectrum differentiation without information loss. Clustering the feature points and estimating the tran...

Problems of decision criteria in the tasks of image analysis and pattern recognition are considered. Overlearning as a practical consequence of fundamental paradoxes in inductive inference is illustrated with examples. Theoretical (based on algorithmic complexity) and practical formulations of the minimum description length (MDL) principle a...

This paper discusses image-comparison methods used in the modern automatic-navigation systems of mobile robots. Features of the analysis of video data in problems involving the navigation of unmanned aircraft and terrestrial robots that operate indoors and outdoors are analyzed. The prospects of further improving the automatic image-analysis algori...

This paper discusses trends in the use of vision subsystems in modern robotic systems, in particular in household robots that function in an indeterminate medium. The main image-processing tasks associated with tasks of navigating mobile robots are indicated. Using as an example the methods put into practice for comparing images obtained in a clos...

This paper discusses methods of using color information in solving problems of image comparison and the recognition and tracking of objects in a closed space. These problems are characteristic of the computer-vision systems of mobile robots. Methods of distinguishing contours on the basis of the Cumani operator are discussed, as well as of detectin...

This paper proposes a new shift-invariant image representation, based on the operation of differentiation of the phase component of the spectrum, to replace the traditionally used operation, which totally eliminates phase information. The new representation is used to modify the Fourier-Mellin method, which is intended for the comparison of images...

Learning is one of the most crucial components, which increases generality, flexibility, and robustness of computer vision systems. At present, image analysis algorithms adopt particular machine learning methods resulting in rather superficial learning. We present a new paradigm for constructing essentially learnable image analysis algorithms. Lear...

Two approaches to constructing systems of local invariant image indicators are compared: the SURF method and a method based on the incomplete Fourier-Mellin transform. The results for images made indoors show that the two methods are of comparable quality.

An investigation of image representations in Optical Coherence Tomography (OCT) is carried out based on an objective quality criterion introduced with the use of the novel representational minimum description length principle. Several image segmentation algorithms are proposed that recover the layered structure of OCT images and can be used for mo...

Based on the representational-minimum-description-length (RMDL) principle, proposed earlier and intended for the quantitative estimate of the degree of invariance of image representations, a comparative analysis has been carried out of several segmentation algorithms that construct contour descriptions of images, as well as several algorithms for c...

This paper discusses the problem of quantitatively describing the laws of perceptive grouping developed in Gestalt psychology and their use in computer vision. For a unified description of the Gestalt laws, the principle of representational minimum description length is proposed. Psychophysical experiments have been carried out in which it is estab...

This paper discusses the problem of introducing feedback in multilevel machine-vision systems. Based on a theoretical-informational analysis, it is shown that feedback is needed in such systems because the individual components of the quality criterion common to the entire system are optimized at different levels. As a result, the decisions made at...

This paper presents a catalog of algorithms for automatically processing aerospace pictures, including more than fifty items, systematized according to the following topics: structural comparison of images, comparison of images on the basis of Fourier invariants, alignment of images in a single coordinate system, detection of changes of the terrain...

This paper discusses the problem of determining the best spatial transformation over the set of reference points found in the process of comparing two images. It is shown that it is allowable to use the rms-deviation criterion only when choosing among transformations with an identical number of parameters. Otherwise, the preference is given to the...

This paper discusses a theoretic-informational approach to the problem of pattern recognition. Using methods of generalized decision functions, support vectors, and finite mixtures as an example, it is shown that the existing quality criteria of the decision rules can be represented as particular implementations of the principle of minimum descript...

The notion of Gestalt was developed in order to give a unified explanation for various phenomena of human perception and cognition. This aim wasn’t achieved due to insufficient strictness of the approach. For the same reason, Gestalt theory didn’t noticeably influence the field of computer vision. Nevertheless, it seems that the basic Ge...

The investigation presented in this article continues our long-term efforts directed towards the automatic structural matching of aerospace photographs. An efficient target independent hierarchical structural matching tool was described in our previous paper, which, however, was aimed mostly for the analysis of 2D scenes. It applied the same geomet...

The aim of this investigation is to develop a formal image representation within whose framework the most relevant information can be extracted from images. Constructing models of images is considered as a task of inductive inference. The conventional criteria for choosing the best model are based on the Bayesian rule. However there is o...

We present an approach to automatic sub-pixel precise measurement of the positions and local orientations of the holes and edge peculiarities of complex shapes in sheet metal. The sub-pixel precision of measurement is reached by means of model-based image analysis. A correlation-based measure is introduced to obtain the measurement results inva...

In previous years, we reported at the SPIE conferences the results of the development of a hierarchical structural classifier that used contour structural elements as input and was designed for matching aerospace photographs taken in different seasons from different viewpoints, or formed by different kinds of sensors. The aim of this investigation...

This paper discusses how the mutual spatial transformation of local segments of images affects the result of their phase correlation. The equations obtained for the noise of the cross-correlation field caused by scale mismatch of segments of the images or their relative rotation make it possible to estimate the allowable values of the coefficients...

This paper proposes a local-correlation method that makes it possible to bring aerospace images into coincidence with subpixel accuracy after preliminary rough juxtaposition by transforming them relative to each other by uniform projective transformation and by additional mutual local displacements. The basis of the method is to establish a corresp...

We present an information-theoretic approach to the image interpretation problems. In the context of this approach such tasks as contour extracting, constructing the most informative image features and image matching are described as a single unified problem. Our approach is based primarily on the interpretation of the image (or image set) represen...