Analyzing Dataset Consistency
Charles Davi
March 8, 2021
Abstract

Many consumer devices can be used to perform parallel computations, and in a series of approximately five hundred research notes,¹ I introduced a new and comprehensive model of artificial intelligence rooted in information theory and parallel computing that allows for classification and prediction in worst-case polynomial time, using what is effectively AutoML, since the user only needs to provide the datasets. This results in run-times that are simply incomparable to any other approach to A.I. of which I'm aware, with classifications at times taking seconds over datasets comprised of tens of millions of vectors, even when run on consumer devices. Below is a series of formal lemmas, corollaries, and proofs that form the theoretical basis of my model of A.I.
1 Local Consistency
A dataset A is locally consistent if there is a value δ_x > 0, for each x ∈ A, such that the set B_x = {y_1, . . . , y_k} of all points y_i ∈ A, distinct from x, for which ||x − y_i|| ≤ δ_x, is non-empty and has a single classifier, equal to the classifier of x. That is, all of the points within B_x are within δ_x of x, and are of a single class. We say that a dataset is locally consistent around a single point x ∈ A if such a non-empty, single-classifier set B_x exists for that particular x.
For reference, the nearest neighbor algorithm simply returns the vector within a dataset that has the smallest Euclidean distance from the input vector. Expressed symbolically, if f(x, A) implements the nearest neighbor algorithm, as applied to x, over A, and y = f(x, A), for some y ≠ x, then ||x − y|| ≤ ||x − z||, for all z ∈ A with z ≠ x. Further, assume that if there is at least one y_i ∈ A, distinct from x and y, such that ||x − y|| = ||x − y_i||, then the nearest neighbor algorithm will return the set of all such tied outputs, {y, y_1, . . . , y_k}.
¹ All of the research notes and applicable code are publicly available on my ResearchGate homepage.
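As a concrete illustration, the following is a minimal Python sketch of this tie-aware nearest neighbor algorithm. The function name and array conventions are my own assumptions, not taken from the code referenced above.

    import numpy as np

    def nearest_neighbors(x, A):
        """Return the indices of all vectors in A, other than x itself,
        at minimum Euclidean distance from x.

        x : 1-D array, the input vector.
        A : 2-D array, one dataset vector per row.
        """
        dists = np.linalg.norm(A - x, axis=1)
        dists[dists == 0] = np.inf  # enforce y != x by ignoring exact matches
        d_min = dists.min()
        return np.flatnonzero(dists == d_min)  # all tied minimizers, as a set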
Lemma 1.1. If a dataset A is locally consistent, then the nearest neighbor algorithm will never generate an error.
Proof. Assume that the nearest neighbor algorithm is applied to some dataset A, and that A is locally consistent. Further, assume that for some input vector x, the output of the nearest neighbor algorithm, over A, contains some vector y, and that y has a classifier that is not equal to the classifier of x, thereby generating an error. Since y is contained in the output of the nearest neighbor algorithm, as applied to x, over A, it follows that ||x − y|| is the minimum over the entire dataset. Because A is locally consistent, there is some non-empty set B_x that contains every vector y_i for which ||x − y_i|| ≤ δ_x. Since ||x − y|| is the minimum, it must be the case that ||x − y|| ≤ ||x − y_i|| ≤ δ_x, for all y_i. But this implies that y ∈ B_x, and every vector in B_x has the same classifier as x, which contradicts our assumption that the classifiers of x and y are not equal. Therefore, the nearest neighbor algorithm cannot generate an error when applied to a locally consistent dataset, thereby completing the proof.
What Lemma 1.1 says is that the nearest neighbor algorithm will never generate an error when applied to a locally consistent dataset, since the nearest neighbor of an input vector x is always an element of B_x, which guarantees accuracy. As a practical matter, if the nearest neighbor isn't within δ_x of an input vector x, then we can flag the prediction as possibly erroneous. This could of course happen when your training dataset is locally consistent, and your testing dataset adds new data that does not fit into any B_x.
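The flagging heuristic above can be sketched as follows, assuming we have stored, for each training vector, a radius δ_x, for example the distance to that vector's own nearest neighbor. The names predict_with_flag, labels, and deltas are hypothetical.

    import numpy as np

    def predict_with_flag(x, A, labels, deltas):
        """Nearest neighbor prediction that flags inputs falling outside
        the nearest training vector's delta-ball.

        A      : 2-D array of training vectors, one per row.
        labels : classifier of each training vector.
        deltas : per-vector radius delta_x.
        """
        dists = np.linalg.norm(A - x, axis=1)
        i = int(np.argmin(dists))
        flagged = dists[i] > deltas[i]  # x lies outside the nearest vector's delta-ball
        return labels[i], flagged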
Corollary 1.2. A dataset is locally consistent if and only if the nearest neighbor
algorithm does not generate any errors.
Proof. Lemma 1.1 shows that if a dataset A is locally consistent, then the nearest neighbor algorithm will not generate any errors. So now assume that the nearest neighbor algorithm generates no errors when applied to A, and for each x ∈ A, let y_x denote a nearest neighbor of x, and let δ_x = ||x − y_x||. Let B_x = {y_x}, and so each B_x is non-empty. If more than one vector is returned by the nearest neighbor algorithm for a given x, then include all such vectors in B_x. By construction, B_x is exactly the set of points within δ_x of x, and because the algorithm generates no errors, every vector in B_x has the same classifier as x. Therefore, there is a value δ_x > 0, for each x ∈ A, such that the set B_x of all points within δ_x of x is non-empty and has a single classifier, equal to the classifier of x, which completes the proof.
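Corollary 1.2 also yields a simple algorithmic test for local consistency: apply the nearest neighbor algorithm to every vector in the dataset and check for errors. Below is a minimal sketch in Python, where labels is assumed to hold the classifier of each row of A (my own names, for illustration only).

    import numpy as np

    def is_locally_consistent(A, labels):
        """Corollary 1.2 as a test: A is locally consistent iff the
        nearest neighbor algorithm generates no errors over A."""
        for i, x in enumerate(A):
            dists = np.linalg.norm(A - x, axis=1)
            dists[i] = np.inf  # exclude x itself
            ties = np.flatnonzero(dists == dists.min())  # all tied neighbors
            if any(labels[j] != labels[i] for j in ties):
                return False  # a nearest neighbor error
        return True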
Corollary 1.3. A dataset is locally consistent around x ∈ A if and only if the nearest neighbor algorithm does not generate an error when applied to x, over A.
Proof. Assume that the nearest neighbor algorithm does not generate an error when applied to x ∈ A. It follows that there is some B = {y_1, . . . , y_k} ⊆ A, generated by the nearest neighbor algorithm, as applied to x, over A, for which ||x − y_i|| is minimum, and the classifiers of x and y_i are all equal. Let B_x = B, and let δ_x = ||x − y_i||. Since the classifiers of x and y_i are all equal, this implies that A is locally consistent around x.

Now assume that A is locally consistent around x, and let B = {y_1, . . . , y_k} ⊆ A be the output of the nearest neighbor algorithm, as applied to x, over A. Because B_x is non-empty, there is at least one point of A within δ_x of x, and so the minimum distance ||x − y_i|| over A is at most δ_x. Therefore, B ⊆ B_x. This implies that the classifier of x equals the classifier of each y_i, which completes the proof.
2 Clustering
We can analogize the idea of local consistency to geometry, which will create intuitive clusters that occupy some portion of Euclidean space. We say that a dataset A is clustered consistently if, for every x ∈ A, there exists some δ_x > 0 such that the sphere of radius δ_x (including its boundary), with an origin of x, denoted S_x, contains at least one point of A other than x, and every such point has the same classifier as x.
Lemma 2.1. A dataset is locally consistent if and only if it is clustered consistently.
Proof. Assume A is locally consistent. It follows that for each x ∈ A, there exists a non-empty set of vectors B_x, each of which has the same classifier as x. Moreover, for each y_i ∈ B_x, it is the case that ||x − y_i|| ≤ δ_x. This implies that all vectors within B_x are contained within a sphere of radius δ_x, with an origin of x, and by the definition of B_x, no other points of A fall within that sphere. Therefore, A is clustered consistently.

Now assume that A is clustered consistently. It follows that S_x contains at least one vector of A other than x, and all such vectors y_i have a distance of at most δ_x from x. Let B_x be the set of all such vectors within S_x. This implies that for every x ∈ A, there exists a non-empty set of vectors B_x, all with the same classifier as x, such that for all y_i ∈ B_x, ||x − y_i|| ≤ δ_x, where δ_x > 0. Therefore, A is locally consistent, which completes the proof.
What Lemma 2.1 says is that the notion of local consistency is equivalent to an intuitive geometric definition of a cluster that has a single classifier.
Lemma 2.2. If a dataset A is locally consistent, and two vectors x_1 and x_2 have unequal classifiers, then there is no vector y ∈ A such that y ∈ C = (S_{x_1} ∩ S_{x_2}).
Proof. If the regions bounded by the spheres S_{x_1} and S_{x_2} do not overlap, then C is empty, and therefore cannot contain any vectors at all. So assume instead that the regions bounded by the spheres S_{x_1} and S_{x_2} do overlap, and that therefore, C is a non-empty set of points, and further, that there is some y ∈ C such that y ∈ A.

There are three possibilities regarding the classifier of y:

(1) It is equal to the classifier of x_1;

(2) It is equal to the classifier of x_2;

(3) It is not equal to the classifiers of x_1 or x_2.

In all three cases, it follows that y ∈ B_{x_1} and y ∈ B_{x_2}, and so y must have both the classifier of x_1 and the classifier of x_2, which leads to a contradiction, since the lemma assumes that the classifiers of x_1 and x_2 are not equal. Since these three cases cover all possible values for the classifier of y, it must be the case that y does not exist, which completes the proof.
What Lemma 2.2 says is that if the spheres around two points with unequal classifiers overlap, then the intersecting region contains no vectors from the dataset.
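This can be checked numerically. Below is a minimal sketch, where deltas is assumed to hold each vector's radius δ_x (again, hypothetical names of my own).

    import numpy as np

    def intersection_is_empty(A, labels, deltas, i, j):
        """Check Lemma 2.2 for A[i] and A[j] with unequal classifiers:
        no vector of A may lie in the intersection of their spheres."""
        assert labels[i] != labels[j]
        d_i = np.linalg.norm(A - A[i], axis=1)  # distances to x_1
        d_j = np.linalg.norm(A - A[j], axis=1)  # distances to x_2
        in_both = (d_i <= deltas[i]) & (d_j <= deltas[j])
        return not bool(in_both.any())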
Lemma 2.3. If A is locally consistent, then any clustering algorithm that finds the correct value of δ_x, for all x ∈ A, can generate clusters over A with no errors.
Proof. This follows immediately from Lemma 2.2, since any such algorithm could simply generate the sphere S_x, for all x ∈ A, and each such sphere contains only points with the classifier of x.
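To make Lemma 2.3 concrete, here is a minimal Python sketch of one such clustering scheme, taking δ_x to be the distance to each vector's nearest neighbor, a choice that Corollary 1.2 justifies when the dataset is locally consistent. The function name is my own.

    import numpy as np

    def delta_clusters(A):
        """Generate the sphere S_x for each vector, taking delta_x to be
        the distance to the vector's nearest neighbor, and return each
        sphere's member indices (including the vector itself)."""
        # Pairwise Euclidean distance matrix.
        D = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
        np.fill_diagonal(D, np.inf)  # ignore self-distances
        deltas = D.min(axis=1)       # delta_x = nearest neighbor distance
        clusters = []
        for i in range(len(A)):
            members = np.flatnonzero(D[i] <= deltas[i])
            clusters.append(np.append(members, i))  # S_x includes x itself
        return clusters, deltas

If A is locally consistent and these values of δ_x are correct for it, Lemma 2.3 guarantees that each such cluster contains vectors of a single class.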