
Analyzing Dataset Consistency

Charles Davi

March 8, 2021

Abstract

Many consumer devices can be used to perform parallel computations, and in a series of approximately five hundred research notes [1], I introduced a new and comprehensive model of artificial intelligence, rooted in information theory and parallel computing, that allows for classification and prediction in worst-case polynomial time, using what is effectively AutoML, since the user only needs to provide the datasets. This results in run-times that are simply incomparable to those of any other approach to A.I. of which I'm aware, with classifications at times taking seconds over datasets composed of tens of millions of vectors, even when run on consumer devices. Below is a series of formal lemmas, corollaries, and proofs that form the theoretical basis of my model of A.I.

1 Local Consistency

A dataset A is locally consistent if there is a value δ_x > 0, for each x ∈ A, such that the set B_x = {y_1, ..., y_k} of all points for which ||x − y_i|| ≤ δ_x has a single classifier, equal to the classifier of x, and B_x is non-empty, for all x. That is, all of the points within B_x are within δ_x of x, and are of a single class. We say that a dataset is locally consistent around a single point x ∈ A if B_x is non-empty.
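
As an illustration, this definition can be tested directly in code, for a given choice of δ. Below is a minimal sketch in Python with NumPy; the function name, and the representation of A as an array of row vectors with a parallel array of class labels, are my own assumptions rather than anything taken from the author's code.

import numpy as np

def is_locally_consistent_at(A, labels, i, delta):
    """Check local consistency around A[i]: every point of A within
    delta of A[i] (other than A[i] itself) must share its label, and
    at least one such point must exist (B_x must be non-empty)."""
    dists = np.linalg.norm(A - A[i], axis=1)
    mask = dists <= delta
    mask[i] = False  # exclude x itself from B_x
    B_x = np.flatnonzero(mask)
    if B_x.size == 0:
        return False  # B_x is empty
    return bool(np.all(labels[B_x] == labels[i]))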

For reference, the nearest neighbor algorithm simply returns the vector within a dataset that has the smallest Euclidean distance from the input vector. Expressed symbolically, if f(x, A) implements the nearest neighbor algorithm, as applied to x, over A, and y = f(x, A), for some y ≠ x, then ||x − y|| ≤ ||x − z||, for all z ∈ A. Further, assume that if there is at least one y_i ∈ A, such that ||x − y|| = ||x − y_i||, then the nearest neighbor algorithm will return a set of outputs, {y, y_1, ..., y_k}.
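
A minimal sketch of this algorithm, returning the full set of tied outputs, might look as follows (Python with NumPy; the interface mirrors f(x, A), but the details are my own).

import numpy as np

def nearest_neighbors(x, A):
    """Return all vectors in A at minimum Euclidean distance from x,
    including ties. If x itself is in A and self-matches should be
    excluded, remove x from A before calling."""
    dists = np.linalg.norm(A - x, axis=1)
    return A[np.isclose(dists, dists.min())]  # all tied nearest neighbors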

[1] All of the research notes and applicable code are publicly available on my ResearchGate homepage.

Lemma 1.1. If a dataset A is locally consistent, then the nearest neighbor algorithm will never generate an error.

Proof. Assume that the nearest neighbor algorithm is applied to some dataset A, and that A is locally consistent. Further, assume that for some input vector x, the output of the nearest neighbor algorithm, over A, contains some vector y, and that y has a classifier that is not equal to the classifier of x, thereby generating an error. Since y is contained in the output of the nearest neighbor algorithm, as applied to x, over A, it follows that ||x − y|| is the minimum over the entire dataset. Because A is locally consistent, there is some non-empty set B_x that contains every vector y_i for which ||x − y_i|| ≤ δ_x. Since ||x − y|| is minimum, it must be the case that ||x − y|| ≤ ||x − y_i|| ≤ δ_x, for all y_i. But this implies that y ∈ B_x, and since every vector in B_x has the same classifier as x, this contradicts our assumption that the classifiers of x and y are not equal, which completes the proof.

What Lemma 1.1 says is that the nearest neighbor algorithm will never generate an error when applied to a locally consistent dataset, since the nearest neighbor of an input vector x is always an element of B_x, which guarantees accuracy. As a practical matter, if the nearest neighbor isn't within δ_x of an input vector x, then we can flag the prediction as possibly erroneous. This could of course happen when your training dataset is locally consistent, and your testing dataset adds new data that does not fit into any B_x.
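
This flagging heuristic can be sketched as follows (Python; the per-point deltas array, and the choice to test against the returned neighbor's own δ, are my assumptions about how the rule would be operationalized).

import numpy as np

def predict_with_flag(x, A, labels, deltas):
    """Predict the label of x by nearest neighbor, flagging the
    prediction as possibly erroneous when the nearest neighbor is
    farther from x than that neighbor's own delta."""
    dists = np.linalg.norm(A - x, axis=1)
    j = int(np.argmin(dists))
    flagged = bool(dists[j] > deltas[j])  # x falls outside the neighbor's sphere
    return labels[j], flagged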

Corollary 1.2. A dataset is locally consistent if and only if the nearest neighbor algorithm does not generate any errors.

Proof. Lemma 1.1 shows that if a dataset A is locally consistent, then the nearest neighbor algorithm will not generate any errors. So now assume that the nearest neighbor algorithm generates no errors when applied to A, and for each x ∈ A, let y_x denote a nearest neighbor of x, and let δ_x = ||x − y_x||. Let B_x = {y_x}, and so each B_x is non-empty. If more than one vector is returned by the nearest neighbor algorithm for a given x, then include all such vectors in B_x. By definition, ||x − y_x|| ≤ δ_x, and since the algorithm generates no errors, every vector in B_x has the same classifier as x. Therefore, there is a value δ_x > 0, for each x ∈ A, such that the set B_x of all points for which ||x − y_i|| ≤ δ_x has a single classifier, and B_x is non-empty for all such x, which completes the proof.
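
Corollary 1.2 suggests a simple empirical test: a dataset is locally consistent exactly when leave-one-out nearest neighbor classification over it makes no errors. A sketch of that test follows (Python; the leave-one-out framing is my reading of "the nearest neighbor algorithm does not generate any errors").

import numpy as np

def is_locally_consistent(A, labels):
    """Return True iff leave-one-out nearest neighbor classification
    over A makes no errors, which by Corollary 1.2 holds iff A is
    locally consistent. Tied neighbors must agree unanimously."""
    for i in range(len(A)):
        dists = np.linalg.norm(A - A[i], axis=1)
        dists[i] = np.inf  # exclude the point itself
        tied = np.flatnonzero(np.isclose(dists, dists.min()))
        if not np.all(labels[tied] == labels[i]):
            return False  # a nearest neighbor has a different class
    return True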

Corollary 1.3. A dataset is locally consistent around x ∈ A if and only if the nearest neighbor algorithm does not generate an error when applied to x, over A.

Proof. Assume that the nearest neighbor algorithm does not generate an error when applied to x ∈ A. It follows that there is some B = {y_1, ..., y_k} ⊆ A, generated by the nearest neighbor algorithm, as applied to x, over A, for which ||x − y_i|| is minimum, and the classifiers of x and the y_i are all equal. Let B_x = B, and let δ_x = ||x − y_i||. Since the classifiers of x and the y_i are all equal, this implies that A is locally consistent around x.

Now assume that A is locally consistent around x, and let B = {y_1, ..., y_k} ⊆ A be the output of the nearest neighbor algorithm, as applied to x, over A. It follows that ||x − y_i|| is minimum over A, and since B_x is non-empty, each such y_i satisfies ||x − y_i|| ≤ δ_x, and therefore, B ⊆ B_x. This implies that the classifier of x equals the classifier of each y_i, which completes the proof.
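
The construction used in the proof of Corollary 1.2 also gives a concrete way to obtain the δ_x values consumed by the sketches above and below: take δ_x to be the distance from each point to its nearest neighbor. A minimal version (Python; quadratic time and memory, for clarity only):

import numpy as np

def compute_deltas(A):
    """Compute delta_x for each x in A as the distance to its nearest
    neighbor, following the construction in the proof of Corollary 1.2."""
    D = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)  # exclude each point from its own search
    return D.min(axis=1)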

2 Clustering

We can analogize the idea of local consistency to geometry, which will create intuitive clusters that occupy some portion of Euclidean space. We say that a dataset A is clustered consistently if, for every x ∈ A, there exists some δ_x > 0, such that all points contained within a sphere of radius δ_x (including its boundary), with an origin of x, denoted S_x, have the same classifier as x.

Lemma 2.1. A dataset is locally consistent if and only if it is clustered consistently.

Proof. Assume A is locally consistent. It follows that for each x ∈ A, there exists a non-empty set of vectors B_x, each of which has the same classifier as x. Moreover, for each y_i ∈ B_x, it is the case that ||x − y_i|| ≤ δ_x. This implies that all vectors within B_x are contained within a sphere of radius δ_x, with an origin of x. Therefore, A is clustered consistently.

Now assume that A is clustered consistently. It follows that all of the vectors y_i within S_x have a distance of at most δ_x from x. Let B_x be the set of all vectors within S_x. This implies that for every x ∈ A, there exists a non-empty set of vectors B_x, all with the same classifier, such that for all y_i ∈ B_x, ||x − y_i|| ≤ δ_x, where δ_x > 0. Therefore, A is locally consistent, which completes the proof.

What Lemma 2.1 says is that the notion of local consistency is equivalent to an intuitive geometric definition of a cluster that has a single classifier.

Lemma 2.2. If a dataset A is locally consistent, and two vectors x_1 and x_2 have unequal classifiers, then there is no vector y ∈ A such that y ∈ C = (S_{x_1} ∩ S_{x_2}).

Proof. If the regions bounded by the spheres S_{x_1} and S_{x_2} do not overlap, then C is empty, and therefore cannot contain any vectors at all. So assume instead that the regions bounded by the spheres S_{x_1} and S_{x_2} do overlap, and that therefore, C is a non-empty set of points, and further, that there is some y ∈ C such that y ∈ A.

There are three possibilities regarding the classifier of y:

(1) It is equal to the classifier of x_1;

(2) It is equal to the classifier of x_2;

(3) It is not equal to the classifiers of x_1 or x_2.

In all three cases, since y is within δ_{x_1} of x_1 and within δ_{x_2} of x_2, it follows that y ∈ B_{x_1} and y ∈ B_{x_2}. But membership in B_{x_1} implies that the classifier of y equals the classifier of x_1, and membership in B_{x_2} implies that it equals the classifier of x_2, which leads to a contradiction, since the Lemma assumes that the classifiers of x_1 and x_2 are not equal. Since these three cases cover all possible values for the classifier of y, it must be the case that y does not exist, which completes the proof.

What Lemma 2.2 says is that if two clusters with unequal classifiers overlap as spheres, then their intersecting region contains no vectors from the dataset.
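
This property can be verified empirically over a dataset with known δ values: no point of A should fall inside both spheres of any pair of differently-classified centers. A sketch (Python; quadratic-time pairwise checks, purely for illustration):

import numpy as np

def check_sphere_intersections(A, labels, deltas):
    """Verify Lemma 2.2: no vector of A lies in the intersection of
    S_{x1} and S_{x2} for any pair x1, x2 with unequal classifiers."""
    D = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=2)
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] == labels[j]:
                continue
            in_both = (D[i] <= deltas[i]) & (D[j] <= deltas[j])
            if in_both.any():
                return False  # some vector lies in both spheres
    return True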

Lemma 2.3. If A is locally consistent, then any clustering algorithm that finds the correct value of δ_x, for all x ∈ A, can generate clusters over A with no errors.

Proof. This follows immediately from Lemma 2.2, since any such algorithm could simply generate the sphere S_x, for all x ∈ A.
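
A minimal sketch of such a procedure, assuming the correct δ_x values are given, is shown below: each cluster is simply the set of points inside S_x (Python; merging overlapping same-class spheres into larger clusters would be one natural extension, not taken from the source).

import numpy as np

def sphere_clusters(A, labels, deltas):
    """Generate the sphere S_x for each x in A. By Lemma 2.3, if A is
    locally consistent and the deltas are correct, every resulting
    cluster contains a single classifier."""
    clusters = []
    for i in range(len(A)):
        dists = np.linalg.norm(A - A[i], axis=1)
        members = np.flatnonzero(dists <= deltas[i])  # points in S_x
        assert np.all(labels[members] == labels[i]), "impure cluster"
        clusters.append(members)
    return clusters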
