Autonomous Deep Learning
Charles Davi
August 17, 2019
Abstract
In a previous paper [1],¹ I introduced a new model of Artificial Intelligence
rooted in information theory that can solve essentially any machine
learning or deep learning problem in low-degree polynomial time. In this
paper, I’m going to introduce an application called Prometheus that
consolidates this model into a simple GUI that allows for totally
autonomous machine learning and deep learning. The user simply selects
the training data and testing data, and Prometheus autonomously generates
models and predictions using a single core learning algorithm, with no
further input from the user. I’ll demonstrate the generality of Prometheus
by applying it to basic machine learning using a UCI dataset, and to a deep
learning video classification problem, where Prometheus will distinguish
between someone waving their left hand and their right hand.²
1 Prometheus
1.1 Summary
Prometheus is an Octave command line program that allows users to solve
essentially any machine learning or deep learning problem by simply selecting
the training file and testing file through a GUI. Though my full library of AI
software is extensive, the current non-commercial version of Prometheus allows
for only two types of tasks: (1) Machine Learning and (2) Function Prediction.
Though there are two distinct tasks, there is only one learning algorithm
that accomplishes both.
¹ “A New Model of Artificial Intelligence”, available on my ResearchGate homepage here.
² I retain all rights (copyright and otherwise) to the information, algorithms, and all other
works presented in this paper. In particular, the algorithms are NOT to be used for commercial
purposes without my prior written consent. For the avoidance of doubt, you may NOT modify
or redistribute any of this material, in particular the algorithms, without my express, prior,
written consent.
Machine Learning covers essentially every machine learning or deep learning
problem where the training dataset consists of N-dimensional vectors, with
a hidden classification column. There are no restrictions on the number of
dimensions, and Prometheus can quickly and accurately solve problems for which
N is on the order of 15,000 on an ordinary consumer device. The non-commercial
version offers only limited pre-processing for datasets, and as a result, the
datasets are assumed to be structured as $M \times (N+1)$ comma-separated value
files, where M is the number of input vectors, N is the dimension of the dataset,
and column N + 1 contains the hidden classification data, which is assumed to be
an integer value. There is no pre-processing in this version for missing data,
corrupted entries, or formatting issues. However, there is pre-processing to
normalize the dataset if some dimensions are significantly and consistently larger
than others. Normalization will occur automatically, and a GUI notification
will be given to the user that this is happening.
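The layout described above can be checked before calling Prometheus. The following is a minimal Octave sketch and is not part of the Prometheus library: the function name check_and_normalize is hypothetical, and the min-max rescaling is only an assumed stand-in, since the normalization method Prometheus actually applies is not described in this paper.

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper (not part of Prometheus): checks the M x (N+1)
% layout and rescales each feature column to [0, 1].
% Min-max scaling is an assumption; Prometheus's own method may differ.
function [features, classes] = check_and_normalize(data_matrix)
  cols = size(data_matrix, 2);
  if cols < 2
    error("expected at least one feature column plus a class column");
  end
  if any(isnan(data_matrix(:)))
    error("missing or corrupted entries are not handled");
  end
  features = data_matrix(:, 1:cols-1);   % the N feature dimensions
  classes  = data_matrix(:, cols);       % column N+1: hidden classifier
  if any(classes != round(classes))
    error("class labels in the last column must be integers");
  end
  col_min   = min(features, [], 1);
  col_range = max(features, [], 1) - col_min;
  col_range(col_range == 0) = 1;         % avoid dividing by zero
  features  = (features - col_min) ./ col_range;  % broadcasts over rows
end

% Toy 3 x (2+1) dataset whose second column is much larger than the first.
[X, y] = check_and_normalize([1 100 1; 2 300 2; 3 200 1]);
```

After the call, both feature columns of X lie in [0, 1], so neither dimension dominates the other, and y holds the integer class labels.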
Function Prediction models functions of the form $F : \mathbb{R}^2 \to \mathbb{R}$. The
learning engine is capable of modeling any function $F : \mathbb{R}^M \to \mathbb{R}^N$, but this is
not available in the non-commercial version, which restricts the input and output
dimensions. There is no pre-processing available at all for Function Prediction,
and the datasets are assumed to be structured as $3 \times M$ comma-separated value
files, where M is the number of data points provided. Row (1) is assumed to
contain the x values of the function, row (2) is assumed to contain the y values
of the function, and row (3) is assumed to contain the z = F(x, y) values of the
function.
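As an illustration of the three-row layout, the sketch below builds such a file in Octave. The function F, the sample ranges, and the file name are all made up for the example and do not come from the Prometheus library.

```octave
% Illustrative only: F, the ranges, and the file name are assumptions.
F = @(x, y) x.^2 + y;              % example function F : R^2 -> R
M = 50;                            % number of data points
x = linspace(-1, 1, M);            % row 1: x values
y = linspace(0, 2, M);             % row 2: y values
z = F(x, y);                       % row 3: z = F(x, y)
function_matrix = [x; y; z];       % 3 x M layout
out_file = fullfile(tempdir(), "function_training.csv");
csvwrite(out_file, function_matrix);
```

Each column of the resulting file is one data point, read top to bottom as (x, y, z).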
1.2 Installing Prometheus
The first step is to install Octave, which you can download for free from the
official GNU website.³ The next step is to download my library of AI algorithms,
which is available as a .zip file on my technical blog.⁴ Note that the library must
be saved to a directory path that is recognized by Octave. You should be able to
force Octave to recognize whatever directory path you select by simply opening
one of the library functions in the Octave editor.⁵
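An alternative to opening a file in the editor is to add the folder to Octave's load path explicitly with the built-in addpath function. This is a minimal sketch; the directory shown is a hypothetical placeholder and should be replaced with wherever the .zip file was extracted.

```octave
% Hypothetical location: replace with the folder holding the extracted library.
library_dir = "/path/to/extracted/library";
if exist(library_dir, "dir")
  addpath(library_dir);    % make the folder visible to Octave for this session
else
  printf("directory not found: %s\n", library_dir);
end
```

The path added this way lasts for the current session only.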
³ https://www.gnu.org/software/octave/download.html
⁴ https://www.researchgate.net/project/Information-Theory-SEE-PROJECT-LOG/update/5d5893f6cfe4a7968dc182de
⁵ You may have to install additional packages from the Octave command line, but this is
very simple and can be done by following the instructions that appear in the Octave command
line, prompting you to install any additional packages that are necessary to run my algorithms.
2 Using Prometheus
2.1 Machine Learning and Deep Learning
2.1.1 The Wine Dataset (Machine Learning)
Once that’s done, we can begin by solving a basic machine learning problem
using the Wine Dataset⁶ provided by the UCI Machine Learning Repository.
This dataset consists of 178 data points, each with 13 dimensions of data,
together with a classifier that indicates the type of wine that each data point
represents. There are three classes of wines, represented by the numbers 1, 2,
and 3, respectively. The classification task is to identify the class of each wine
given the 13 dimensions of data.
First, download the actual data file by clicking the following link:
https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data
The Wine Dataset is not divided into a training dataset and a testing dataset,
and the classifier is in the first, not the last, column. As a result, we’ll need
to make some minor adjustments to the dataset before running Prometheus by
entering the following code into the Octave command line:
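% Load the raw UCI file: 178 rows, class label in column 1, then 13 features.
filename = "/Users/charlesdavi/Downloads/wine.data";
data_matrix = csvread(filename);
% Swap columns 1 and 14 so the class label ends up in the last column.
temp = data_matrix(:,1);
data_matrix(:,1) = data_matrix(:,14);
data_matrix(:,14) = temp;
% Pick 155 distinct random rows for training; unique() only sorts them.
training_rows = randperm(178,155);
training_rows = unique(training_rows);
training_matrix = data_matrix(training_rows,:);
% The remaining 23 rows become the testing dataset.
data_matrix(training_rows,:) = [];
testing_matrix = data_matrix;
% Write both datasets to CSV files for Prometheus.
training_file = "/Users/charlesdavi/Desktop/External CSV Files/Wine_Training.csv";
testing_file = "/Users/charlesdavi/Desktop/External CSV Files/Wine_Testing.csv";
csvwrite(training_file,training_matrix);
csvwrite(testing_file,testing_matrix);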
This will move the classifier to the last column, generate a training dataset
and testing dataset, and create two CSV files that contain the training data and
testing data, respectively. As a general matter, this type of pre-processing is
the only work that is necessary prior to running Prometheus, reducing machine
learning and deep learning to an administrative task.
⁶ https://archive.ics.uci.edu/ml/datasets/Wine
You can now call Prometheus from the Octave command line using the
following code:
[output_matrix stats_array] = PrometheusAI_Data_Engine_Lite();
This will cause a warning to pop up, notifying you that this is not a commercial
product, and that you cannot use this version of Prometheus for commercial
purposes.⁷ Click “OK”, and another message box will appear asking you
to select the training dataset. Click “OK” again, and the following browser
window will open, asking you to select the file that contains the training
dataset:
Figure 1: The browser window prompting you to select the training dataset.
Simply select the training file for the Wine Dataset using the path entered
into the Octave command line, and then click “Open”. A second message box
will appear prompting you to select the testing file. Select the testing file in
the same manner, and then the following dialog box will appear asking you to
select a task:
⁷ If you’re interested in purchasing a commercial version of Prometheus, you can contact
me using the email address listed on my SSRN page here.
Figure 2: The dialog box prompting you to select a task.
Select “Machine Learning”. This will cause the core learning engine to run.
In this case, the values in the 1st dimension of the Wine Dataset are significantly
larger than the values in all other dimensions, and as a result, the normalization
algorithm will automatically run, generating the following progress bar:
Figure 3: The progress bar notifying you of data normalization.
Once normalization is complete, the normalized dataset is then used to generate
a prediction model for each class of data in the dataset. In this case, there
are three classes of data, and therefore, Prometheus will generate three models
that will then be applied to the testing dataset. A progress bar will appear
during this process notifying you that model generation is underway. Once the
models have been generated, Prometheus will normalize the testing dataset (if
applicable), and then begin to generate predictions using the models generated
in the previous step. This will cause another progress bar to appear notifying
you that predictions are being generated. Once all of the predictions have been
generated, the actual prediction data is stored in the “output_matrix”
returned by the command line function, and the statistics related to the task are
stored in the “stats_array” returned by the command line function. Both sets
of values can be analyzed using Octave. For Machine Learning, the accuracy is
the first entry in the stats_array, expressed as a real number from 0 to 1. For the
Wine Dataset, because the training and testing datasets are randomly
selected from the entire dataset, accuracy can vary, but you should see an
accuracy between 86% and 100%.
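To make the accuracy figure concrete, the sketch below shows how a value of this kind can be computed from predicted and true class labels. The column layout of the output_matrix is not documented here, so the sketch uses two stand-alone vectors with hypothetical names and made-up values, rather than anything returned by Prometheus.

```octave
% Hypothetical vectors: predicted and true class labels for a testing set.
predicted = [1 1 2 3 3 2 1 3];
actual    = [1 1 2 3 2 2 1 3];
accuracy  = mean(predicted == actual);   % fraction correct, in [0, 1]
printf("accuracy = %.3f (%.1f%%)\n", accuracy, 100 * accuracy);
```

With 155 of the 178 Wine rows used for training, the testing set has 23 rows, so the reported accuracy moves in steps of 1/23 from one random split to the next.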
The runtime for each stage of the process is automatically printed to the
command line. In this case, running on an iMac 3.2 GHz Intel Core i5, the
Data Normalization runtime was 10.5229 seconds; Model Generation runtime
was 4.27431 seconds; and Prediction runtime was 0.35838 seconds, for a total
of 15.156 seconds from start to finish.
2.1.2 Video Classification (Deep Learning)
In this case, the classification task will be to determine what type of motion is
taking place in a dataset of videos. The moving object is me: in one class of
videos, I wave my left hand, and in the other class of videos, I wave my right
hand. It will then be up to Prometheus to classify a testing dataset of videos,
given the training dataset, and determine in a given video, whether I’m waving
my left hand, or my right hand.
Unlike the Wine Dataset, which had a dimension of N = 13, this problem
will generate a dataset with a much larger dimension of N = 1,000. As a
general matter, it would not ordinarily be practical to solve a problem with
such a high dimension on an ordinary consumer device using machine learning
or deep learning models. But not only is it practical to solve these types of
problems running Prometheus on an ordinary consumer device, it’s fast.
Returning to the example at hand, each video consists of 10 frames, and
each of the two classes of videos consists of 12 training videos, for a total of 24
videos, and 240 frames. The testing dataset consists of 10 videos for each class,
for a total of 20 videos and 200 frames.⁸ Since this version of Prometheus needs
CSV files, we’ll need to translate the information in these videos into CSV form.
This is accomplished by running my vectorized boundary detection algorithm,
which can quickly extract shape information from an image.
Figure 4: One “left-hand” frame, and the boundaries identified in the frame.
⁸ The image files for this dataset can be found on Dropbox here. The code necessary to
solve this classification problem can be found here.
Figure 4 shows a frame from one of the videos, together with the boundary
information extracted by the algorithm. The boundary information is then
transformed into a set of two-dimensional points that represents the coordinates
of the boundaries in the image. This process samples a fixed number of points
from the boundary data, in this case 500 points. Each point consists of an x
and a y coordinate, and therefore, we need $N = 2 \times 500 = 1{,}000$ dimensions to
represent the shape of a single frame.
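The step from boundary points to a 1,000-dimensional vector can be sketched as follows. This is not the boundary detection algorithm itself: the circle is a stand-in for real boundary output, and both the evenly spaced sampling and the interleaved x, y ordering are assumptions made for illustration.

```octave
% Stand-in boundary: K points on a circle (a real run would use the
% output of the boundary detection algorithm instead).
K = 2000;
t = linspace(0, 2*pi, K)';
boundary_points = [cos(t), sin(t)];          % K x 2 matrix of (x, y)

num_samples = 500;                           % fixed sample size per frame
idx = round(linspace(1, K, num_samples));    % evenly spaced indices (assumed scheme)
sampled = boundary_points(idx, :);           % 500 x 2
frame_vector = reshape(sampled', 1, []);     % 1 x 1000: [x1 y1 x2 y2 ...]
```

Because every frame is reduced to the same number of sampled points, every frame yields a row of the same length, which is what the CSV format requires.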
The frames are approximately 700 KB each. Extracting the shape information
from a single frame takes about 1.8 seconds (on my iMac). Extracting the
shape information from the training dataset took 453.842 seconds, and extracting
the shape information from the testing dataset took 361.01 seconds, for a
total of 814.85 seconds, which is approximately 13.6 minutes.
Once the shape information has been extracted, we can then write that
shape information to two CSV files, generating a training dataset and a testing
dataset, and then we call Prometheus, selecting the applicable files. Note that at
this step, we’re actually classifying the individual frames, not the videos. This
will generate a sequence of independent predictions for each video, where a
prediction is made for each frame in the video. That is, we begin by classifying
each frame of every video, rather than classifying the video as a whole. For
example, if 1 is the classifier for a left-hand video, and 2 is the classifier for
a right-hand video, then a left-hand video could have a prediction vector of
the form [2,1,1,1,1,2,1,1,1,2,1]. This process will generate a 10-dimensional
vector of predictions for each video, and the 11th entry (the final 1 in the
example) is the hidden classifier for the video. This method gives us more
information about the video
than attempting to classify the entire video at once, since we instead generate
a number of predictions equal to the number of frames in the video.
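The grouping of frame predictions into per-video vectors can be sketched in Octave as below. The prediction values are made up (the first video reuses the example vector above), and the variable names are hypothetical; the sketch assumes the frame predictions are listed in video order.

```octave
frames_per_video = 10;
% Hypothetical frame-level predictions for two videos, in video order.
frame_predictions = [2 1 1 1 1 2 1 1 1 2, ...
                     2 2 2 1 2 2 2 2 1 2];
video_classes = [1; 2];                      % hidden classifier per video
% reshape fills columns first, so transpose to get one row per video.
prediction_vectors = reshape(frame_predictions, frames_per_video, [])';
video_matrix = [prediction_vectors, video_classes];   % 11 columns per video
```

Each row of video_matrix then has the same shape as the example in the text: ten frame predictions followed by the video's hidden classifier.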
Each frame has a classifier, and so even though we’re not done yet, there is
an accuracy associated with this task, which is simply the percentage of correct
image classifications. In this case, the frame classification task had an accuracy
of 82.5%. Data Normalization was not required; Model Generation runtime was
20.5961 seconds; and Prediction runtime was 4.66209 seconds, for a total of
25.258 seconds.
For the next step, we’ll classify the 10-dimensional prediction vectors
generated during the frame classification task. But rather than run Prometheus
twice, in the code available on my technical blog, I simply call the
categorization algorithm that underlies Prometheus, which will generate clusters
of these prediction vectors. We can then measure the accuracy of the entire
process by measuring the percentage of clusters that consist of only one class
of prediction vectors. That is, we run the categorization algorithm on the
prediction vectors, and then test whether the resultant clusters are consistent
with the hidden classifiers. This will ultimately create clusters of videos,
which solves the classification problem, if our clusters are consistent with
the hidden classifier data for the videos. In this case, the clusters had an
accuracy of 100%. The runtime for this last process was 0.380437 seconds. If we
include all of the time that elapsed from shape extraction to final
classification, then the total is 840.49 seconds, or roughly 14 minutes.
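The cluster-level accuracy measure described above can be sketched as follows. This computes only the metric, not the categorization algorithm that produces the clusters; the cluster assignments and class labels are made-up inputs with hypothetical names.

```octave
% Hypothetical inputs: a cluster index and the hidden class for each video.
cluster_ids   = [1 1 1 2 2 3 3 3];
video_classes = [1 1 1 2 2 2 2 1];
clusters = unique(cluster_ids);
pure = 0;
for c = clusters
  members = video_classes(cluster_ids == c);
  if numel(unique(members)) == 1     % cluster holds a single class
    pure = pure + 1;
  end
end
cluster_accuracy = pure / numel(clusters);   % fraction of single-class clusters
```

In this toy input, two of the three clusters contain a single class, while the third mixes both classes and counts against the accuracy.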
So, in summation, using Prometheus, in conjunction with my AI library,
we can quickly and accurately classify high-resolution video files based upon
the motions and shapes of the objects in the videos on an ordinary consumer
device. This is just one example of the type of high-dimensional problems that
Prometheus can be used to solve: a creative data scientist would be able to
leverage these libraries to solve far more sophisticated problems.⁹
⁹ Using Prometheus to do Function Prediction is analogous, and so I don’t address it in
this note. For a technical explanation of how my algorithms achieve function prediction, you
can read the following research note on my personal blog here.