Preprint PDF Available

Qubic AGI Journey. Human and Artificial Intelligence: Toward an AGI with Aigarth

Affiliations:
  • MindBigData.com
  • Universidad Internacional de La Rioja

Preprints and early-stage research may not have been peer reviewed yet.
* Equal Contribuon
Qubic AGI Journey
Human and Arcial Intelligence: Toward an AGI with Aigarth
Jose Sanchez *
1 Qubic Scienc Advisor
2 Universidad Internacional de La Rioja (UNIR)
jose.sanchezgarcia@unir.net
David Vivancos *
1 Qubic Scienc Advisor
2 Arciology Research
vivancos@vivancos.com
Abstract:

We present an integrated analysis of human intelligence and its artificial counterpart through Qubic's Aigarth framework, beginning by examining the biological foundations of intelligence, from Spearman's g factor to contemporary neuroscientific understanding of predictive processing in the brain. The analysis encompasses the evolution of human cognitive abilities, particularly focusing on the social brain hypothesis and the role of neural efficiency in information processing. Building upon these biological insights, we introduce Aigarth, a novel approach in the journey toward artificial general intelligence (AGI) that represents a paradigm shift from traditional GPU-dependent architectures to CPU-based distributed computing systems. We introduce a ternary computing paradigm that extends beyond conventional binary systems, incorporating TRUE, FALSE, and UNKNOWN states for enhanced information processing. The framework employs "Intelligent Tissue," a self-modifying neural network structure that evolves through natural selection principles to build emergent problem-solving capabilities, and a novel scoring algorithm that evaluates network performance through deterministic connection generation and asynchronous updates.

The results indicate that Aigarth's decentralized approach could potentially overcome limitations of current AI systems while promoting democratized AI development. Our research contributes to the broader understanding of evolutionary approaches to AGI development and presents implications for future artificial consciousness studies.
Keywords:
Artificial General Intelligence (AGI), Intelligent Tissue, Ternary Computing, Neural Evolution, Decentralized AI, Self-modifying Networks, Biological Intelligence, Qubic
1. Introducon. On Human Intelligence
"Intelligence is what you do when you don't
know what to do" (Bereiter, 1995).
Since the early 20th century, dierenal
psychology has aempted to measure the
construct of intelligence. To idenfy children
with special educaonal needs, Alfred Binet
designed a test to assess cognive funcons
(Binet & Simon, 1905). Based on Binet's
work, Charles Spearman was the rst author
to dene a general mental capacity
underlying various cognive tasks, later
termed as a theorecal construct
"g", represenng the general intelligence.
Through factor analysis, Spearman found
that performance across dierent tests and
tasks had a high correlaon, suggesng a
general capacity (Spearman, 1904).
Spearman's theory faced cricism, notably
from L.L. Thurstone, who proposed a
muldimensional approach to intelligence
(Thurstone, 1938). However, it was later
found that Thurstone’s primary abilies—
verbal uency, reasoning, memory, and
spaal percepon—also correlated,
suggesng the presence of a general
underlying factor. In 1963, Raymond Caell
proposed two components within the g
factor: uid intelligence (Gf) and crystallized
intelligence (Gc). Fluid intelligence is linked
to the ability to reason and solve new
problems, closely aligning with Bereiter’s
popular denion, while crystallized
intelligence relates to the knowledge
acquired through experience and learning
over me (Caell, 1941, 1963). This
muldimensional approach to intelligence
received support from John Horn, who
included components like short-term
memory (Gsm), spaal visualizaon (Gv),
auditory discriminaon (Ga), long-term
memory and retrieval (Glr), and processing
speed (Gs).
Building on the contribuons of Caell and
Horn, John B. Carroll proposed a three-
stratum hierarchical model aer conducng
extensive factor analyses of 460 studies
(Carroll, 1993). At Stratum 3 is g, at Stratum
2 up to 16 broad abilies appear (uid
intelligence, crystallized intelligence, long-
term memory, working memory, processing
speed, visual processing, auditory
processing, reacon and decision speed,
psychomotor processing, quantave
knowledge, reading comprehension and
uency, short-term visual memory,
ideaonal uency, rapid retrieval memory,
perceptual speed, and kinesthec
processing). Each ability subdivides into
several at Stratum 1, totaling up to 70
(Schneider & McGrew, 2012).
The CHC (Caell-Horn-Carroll) model is
currently the most widely accepted in
educaonal, research, and clinical sengs
(McGrew, 2009). It is used in the most
popular psychometric tests for evaluang
intelligence, such as the WISC-V (Wechsler,
2014), WAIS-IV (Wechsler, 2008), Woodcock-
Johnson IV (Scharnk et al. 2014), and the
Stanford-Binet 5 (Roid, 2003).
Factor analysis (FA) is the statistical technique commonly used to identify the underlying structure (factors) within a set of observed variables, such as cognitive skill tasks. FA's main objective is to reduce the dimensionality of the data by grouping correlated variables into factors, which are interpreted as unobservable latent constructs. The method is widely applied to uncover latent relationships between variables and to simplify large datasets. When there are no prior hypotheses about the number or nature of the factors, FA is exploratory.

Factor analysis is a type of statistical linear model:

X = ΛF + ε
Where:
  X: vector of observed variables (p variables).
  Λ: matrix of factor loadings (p × m, where m is the number of factors).
  F: vector of latent factors (m factors).
  ε: vector of unique errors (noise).

For each observed variable, the model can be written as:

Xi = λi1 F1 + λi2 F2 + … + λim Fm + εi

Where λij represents the factor loading of the i-th variable on the j-th factor, and εi is the unique variance of the i-th variable.
FA starts with the correlaon matrix RRR,
expressing the correlaons among the
observed variables. Some methods as
Maximum Likelihood (ML) or Principal
Component Analysis are used to extract
factors that maximize the common variance
among variables. Depending on the quanty
of variance explained, a number of factors is
selected. Factor analysis is widely used to
test intelligence, personality and other
psychological constructs (Li et al. 2024) .
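The extraction step described above can be sketched in a few lines of code. The example below (our illustration, with simulated data, not any dataset from the text) generates scores for six hypothetical cognitive tasks driven by a single latent factor, builds the correlation matrix R, and extracts first-factor loadings by principal-component decomposition:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 1,000 test-takers on 6 hypothetical cognitive tasks,
# each loading on one latent factor g plus unique noise (unit variance).
n, p = 1000, 6
g = rng.standard_normal(n)
lam_true = np.array([0.8, 0.7, 0.6, 0.75, 0.65, 0.7])
X = np.outer(g, lam_true) + rng.standard_normal((n, p)) * np.sqrt(1 - lam_true**2)

# Principal-component extraction: eigen-decompose the correlation matrix R.
R = np.corrcoef(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(R)            # eigenvalues in ascending order
first = eigvecs[:, -1] * np.sqrt(eigvals[-1])   # loadings on the first factor
first *= np.sign(first.sum())                   # fix sign indeterminacy

print("first eigenvalue:", round(eigvals[-1], 2))
print("estimated loadings:", np.round(first, 2))
```

A dominant first eigenvalue well above 1, with all tasks loading positively on it, is exactly the pattern Spearman interpreted as evidence for a general factor.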
Fig. 1. CHC hierarchical model of intelligence: the g factor
1.1. Validity and Reliability of g

Every psychological measurement must objectively validate the inferences made about a construct from the measuring instrument. Traditionally, this is achieved through two pillars: validity and reliability. Validity encompasses not only the technical aspects of measurement but also ethical and social considerations (Messick, 1989). On all these counts, the general intelligence factor presents a high degree of content validity, which assesses whether items represent the construct's domain (Jensen, 1998). It also has high criterion validity, as g shows strong correlations with various levels of professional and educational performance (Schmidt & Hunter, 1998). Moreover, it demonstrates strong construct validity, confirmed through factor analyses supporting the existence of a single factor (Carroll, 1993). The factor also exhibits high substantive validity, indicating how test items are grounded in theory and engage the cognitive tasks being evaluated (Jensen, 1998). It has strong external validity, allowing for generalization across different populations and contexts (Nisbett et al., 2012), and consequential validity, which evaluates the impact of the test on decision-making in academic or professional environments (Sternberg, 2004).
Reliability focuses on the consistency of measurement, evaluated through test-retest reliability, inter-rater reliability, and internal consistency measures (split-half tests, Cronbach's alpha, hierarchical omega). Cronbach's alpha measures the internal consistency of a set of items measuring the same construct (Messick, 1995):

α = (k / (k − 1)) · (1 − Σ Si² / St²)

Where:
  k: number of items in the scale.
  Si²: variance of the i-th item.
  St²: total variance of the scale (sum of all item scores).
Although Cronbach's alpha is commonly used, hierarchical omega fits better for multidimensional constructs such as intelligence. It quantifies the proportion of total variance attributable to a general factor in multidimensional scales:

ω = (Σ λi)² / St²

Where:
  ω: symbol for the omega coefficient.
  λi: standardized factor loading of item i on the general factor.
  St²: total variance of the scale.

The construct of g has shown strong test-retest reliability (Deary et al., 2000) and internal consistency (Raven, 2000; Reise, 2013; Wechsler, 1997).
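Cronbach's alpha, as defined above, is straightforward to compute from raw item scores. The sketch below (our illustration, with simulated respondents, not data from any cited study) implements the formula directly:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, k_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)       # Si^2 for each item
    total_var = items.sum(axis=1).var(ddof=1)   # St^2 of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 500 respondents, 5 items driven by one common trait.
rng = np.random.default_rng(1)
trait = rng.standard_normal(500)
scores = trait[:, None] + 0.8 * rng.standard_normal((500, 5))
print("alpha:", round(cronbach_alpha(scores), 2))
```

With items sharing a strong common trait, alpha lands in the high range conventionally read as good internal consistency; uncorrelated items would drive it toward zero.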
1.2. Predicve Value of g
The g factor is a robust predictor of behavior.
From school age, the impact of g is evident.
It is assessed by specic and standardized
tests resembling dierent cognive tasks
(spaal, logical and verbal) and situaons,
measured by a quoent, the IQ (intelligence
quoent). In a longitudinal study in the UK,
IQ assessed during childhood reliably
predicted performance in general secondary
educaon exams, with stable correlaons
between 0.60 and 0.80 (Deary et al., 2007).
The predicve value extends beyond
secondary educaon and remains
signicant in higher educaon, parcularly
in disciplines with high demands for abstract
reasoning (Deary et al., 2009).
According to a meta-analysis by Schmidt and Hunter (1998), g correlates between 0.51 and 0.70 with job performance, particularly when the job requires complex cognitive skills. This estimate also accounts for factors such as learning speed and adaptation to dynamic environments (Schmidt & Hunter, 1998). Kuncel et al. (2014) suggest that employees with higher general intelligence are more efficient in their individual tasks and contribute significantly to team performance thanks to their problem-solving abilities and adaptability. Regarding remote work, Salgado et al. (2020) show how the g factor effectively predicts high productivity levels and adaptation to new digital platforms. In the current context of constant change and rapid technological evolution, employees with high levels of g hold competitive advantages in terms of adaptability, cognitive capacity, and responsibility (Gottfredson, 2004).
In terms of physical and mental health, individuals with higher levels of g tend to live longer and have a lower risk of chronic diseases. Although various explanatory causes have been suggested, g may function as a mechanism that enables better acquisition of health-related information, adherence to medical treatments, and avoidance of risky behaviors (Gottfredson, 2004). Similarly, the longitudinal study by Batty et al. (2009), involving a million men in Sweden over 20 years, found a direct relationship between youth IQ and adult mortality, even after controlling for socioeconomic status and lifestyle factors. Several studies by Ian Deary find that people with higher general intelligence exhibit better adherence to treatments for chronic diseases, understand medical instructions better, and are more likely to make lifestyle modifications, resulting in better long-term outcomes (Deary et al., 2008, 2010).
Intelligence is also posively associated with
the propensity to form healthy habits, such
as sustained physical acvity, a balanced
diet, and avoiding tobacco use. The impact
of g on various life condions is already
evident from childhood. A longitudinal study
involving 33,000 parcipants over 50 years
found that a higher g value in early life was
associated with a lower risk of heart disease
and stroke (Calvin et al., 2017). Bay et al.
(2018) invesgated the relaonship
between g and cancer risk, nding that
individuals with higher childhood IQ had a
signicantly lower risk of developing various
types of cancer, potenally due to greater
adherence to prevenon programs and
healthier behavior.
Mentally, individuals with higher g exhibit a lower incidence of mental disorders, better stress management, reduced prevalence of depression and anxiety, and greater psychological resilience (Gale et al., 2017). When faced with traumatic or stressful events, such as illness, bereavement, or unemployment, those with higher g demonstrate better coping abilities. Regarding social intelligence, g predicts the ability to use social networks harmoniously, avoid scams, and discern between real and fake news (Jackson & Wang, 2013). People with higher g are more likely to engage in social causes, participate in volunteer programs, and understand complex social issues (Nie et al., 1996), including political matters (Deary et al., 2008).
The famous Dunedin study in New Zealand, which began in 1972-73, has followed over 1,000 individuals from birth to adulthood across nearly 50 years, collecting and analyzing data on health, disease, intelligence, personality, development, academic and professional performance, and other social factors. The study provides insight into the impact of intelligence-related abilities in childhood on adult life (Poulton et al., 2015). Its value lies in the longitudinal measurement of the same individuals over time. The studies confirm that childhood IQ is a strong predictor of future academic performance, adaptation to the educational system, learning ability, and the highest level of education achieved (Fergusson & Horwood, 2007). They also reveal that those with higher childhood IQs are more likely to secure cognitively demanding jobs involving complex decision-making and to earn better wages. In line with other studies, research on the Dunedin sample shows that individuals with higher general intelligence adopt healthier lifestyles, are less prone to risky behaviors, follow medical advice more closely, exhibit better general health, suffer fewer chronic diseases in adulthood, and experience greater longevity (Batty et al., 2007; Belsky et al., 2017). Psychologically, they have lower rates of depression and anxiety and possess better social support and trust-based relationships (Moffitt et al., 2002). Concerning criminal and antisocial behavior, the Dunedin study reveals that those with lower IQs struggle more with social norms, make less rational decisions, and have difficulty controlling impulses (Moffitt, 1993).
Although cognive intelligence and
emoonal intelligence are oen separated
as if they were two dierent constructs,
social adaptaon involves emoonal
regulaon, adherence to norms, and
cooperave strategies. In Dunedin, it is
observed that those with lower g levels have
more conduct disorders in childhood and
adolescence and poorer social adaptaon in
adulthood. Conversely, individuals with
higher intellectual capacity display greater
social resilience, cognive exibility, beer
coping strategies for social and emoonal
challenges, and a higher socioeconomic
status throughout life (Mo & Caspi,
2000; Caspi, 1998; Shanahan et al., 2014).
Other studies, such as those by Robert
Hogan, demonstrate that intelligence also
applies to social and emoonal skills. People
with higher g build stable interpersonal
relaonships, cooperate, and eecvely
resolve life conicts (Hogan & Kaiser, 2005).
In romanc relaonships, those with higher
g levels empathize and understand their
partners beer, and communicate more
eciently (Roberts & Kuncel, 2007).
The ability to manage savings and finances, plan for retirement, acquire financial literacy, and avoid risky economic behaviors is associated with higher g (Lusardi & Mitchell, 2007; Banks & Oldfield, 2007). Creativity, although considered a separate construct by some authors such as Sternberg, correlates with g. The capacity for innovation, generating new ideas, learning quickly, and identifying opportunities is greater when starting from a high g level (Shane, 2003; Kaufman & Sternberg, 2010).
The g factor has also proven to be a crucial predictor of the tendency to experience cognitive decline and neurodegenerative diseases. The most plausible explanation relates to a greater cognitive reserve. Cognitive reserve is the brain's capacity to compensate for functional or structural damage caused by neurodegenerative diseases or natural aging (Livingston et al., 2020). Even if two brains have equal tissue damage, the person with greater cognitive reserve may show fewer and less intense symptoms over a longer period (Stern et al., 2019). Although education, occupation, quality social relationships, and physical activity all contribute to building cognitive reserve throughout life, the g factor proves to be an accurate predictor (Ferreira et al., 2016; Soldan et al., 2017; Dekhtyar et al., 2015). People with higher general intelligence exhibit greater cognitive resilience in old age, with fewer clinical symptoms of cognitive decline (Whalley & Deary, 2001). Regarding dementia, McGurn et al. (2008) indicate that those with higher g in youth are more resistant to the cognitive decline associated with neurodegenerative diseases, exhibiting symptoms of dementia about five years later than average.
2. The g Factor and the Brain: Biological Bases

Historically, since Binet and Spearman, the study of intelligence has been approached from psychometrics and psychology (Spearman, 1904). With the advances in neuroimaging techniques and genetic sequencing over the last 30 years, as well as institutional support for brain research in the USA and Europe (BRAIN Initiative, Human Brain Project), it has become possible to explore the biological bases and neural correlates of intelligence (Insel et al., 2013; Amunts et al., 2016).
In 1991, with the rst improvements in
structural neuroimaging techniques, a 0.33
correlaon was found between brain
volume and IQ (Willerman et al., 1991),
which was later conrmed in a meta-
analysis by McDaniel (2005) with a
correlaon coecient close to 0.40. At least
within the Homo genus, subspecies with
larger brains tend to have higher g.
However, size is not the most crucial factor.
The speed of neural processing implies more
ecient mental processing of informaon.
Event-related potenals, which measure the
brain's responses to various smuli, are
faster and more synchronized in individuals
with higher g (Deary & Caryl, 1997).
Several studies highlight neural eciency.
People who show beer cognive
performance on a task exhibit a lower
demand for neural resources (Neubauer &
Fink, 2009) and greater funconal
connecvity between brain regions related
to the task (Basten et al., 2015). Brain
acvity is less diuse in people with higher g
when performing tasks, suggesng that
intelligence opmizes the use of neural
resources (Pahor et al., 2019). The neural
eciency hypothesis emerged in 1992 with
Richard Haier’s inial studies, where
individuals with higher intelligence levels
showed less acvaon and fewer resources
to solve complex cognive tasks (Haier et al.,
1992).
Haier later focused on studying the implicated regions and their interconnections (Jung & Haier, 2007). This led to the Parieto-Frontal Integration Theory of intelligence (P-FIT), linking g to a network connecting the parietal and prefrontal cortex. Previously, the importance of regions such as the dorsolateral prefrontal cortex, responsible for planning, decision-making, sequential behavior organization, and cognitive flexibility, had been studied within the context of "executive functions." Other regions, like the hippocampus, involved in memory consolidation and learning (primarily through spontaneous activity in sleep stage 2), relate to g indirectly. However, the key to studying g in the brain is the connectivity between regions and efficiency in processing. In a meta-analysis reviewing dozens of neuroimaging studies, Basten, Hilger, and Fiebach (2015) identified the regions correlating with intelligence, confirming the P-FIT theory. When individuals engage in complex problem-solving, abstract reasoning, and working memory activation, the fronto-parietal network is particularly relevant. This network predominantly activates on the left side, as the left hemisphere has some specialization in language (both comprehension and production). Individuals with higher intelligence show asymmetrical activation, with left-sided dominance when solving complex mathematical, verbal, or spatial tasks (Jung et al., 2010). Greater connectivity between the main nodes of this network corresponds to better performance on cognitive tasks (Hampshire et al., 2012).
The g factor, therefore, is linked to the brain's capacity to efficiently process information within specific networks, primarily the left fronto-parietal network (Haier et al., 2009). A recent study by Thiele et al. (2024) underscores the distributed and evolutionarily adaptive nature of brain connectivity as a cornerstone of human intelligence, surpassing the limitations of the Parieto-Frontal Integration Theory (P-FIT). Using machine learning to analyze data from over 800 participants, predictive models incorporating brain-wide connectivity patterns explained up to 31% of the variance in general intelligence, outperforming models restricted to specific regions. This approach highlights that intelligence emerges from complex interactions across multiple networks, reflecting the evolutionary refinement of cognitive flexibility and problem-solving capacities.
If, as longitudinal studies like Dunedin suggest, g in childhood predicts all kinds of performance in adulthood, it implies that the general intelligence factor has a strong genetic basis. To analyze the effect of genes on behavior, researchers study samples of monozygotic twins separated at birth, dizygotic (fraternal) twins, or children adopted at a very early age. This approach makes it possible to scientifically assess the weight of nature versus nurture. In a meta-analysis of studies on monozygotic twins, Plomin and Deary (2015) found that the heritability of intelligence increases with age, implying that genetic influence on intelligence becomes more prominent outside the original family environment and underscoring the strong impact of g in adulthood. Years earlier, the heritability of the g factor was estimated by studying twins raised together or apart: between 50% and 80% of intelligence variance is attributed to genetics (Bouchard et al., 1990).
Genome-Wide Associaon Studies (GWAS)
have enabled the analysis of genec
polymorphisms associated with pathologies
or individual characteriscs by examining
the genomic associaons between specic
genec variants known as SNPs (single
nucleode polymorphisms) and the trait
under study (in this case, intelligence) within
a large populaon (Mills et al., 2019). Based
on data from the UK Biobank, numerous loci
(locaons) within the genome have been
found to have small eects on intelligence
variability. Recently, more than 500 genec
loci linked to g variability have been
discovered (Davies et al., 2018; Savage et
al., 2018).
Despite the substanal genec inuence on
intelligence, environment, nutrion,
educaon, and family background all aect
the expression and development of
intelligence (Benton, 2010). More enriched
environments, access to beer educaon
and culture, and the increasing cognive
demands of the labor market compared to
manual or mechanical skills have resulted in
an increase in intelligence throughout the
20th century. This is known as the Flynn
eect, which highlights the importance of
the environment in modulang the g factor
over me (Flynn, 1987).
2.1. From Carbon to Silicon

The journey from biological neural networks to their attempted replication in artificial neural networks represents a fascinating convergence of neuroscience and computer science. In 1943, Warren McCulloch and Walter Pitts proposed the first mathematical model of a neuron, showing that neural events and the relations among them could be treated by means of propositional logic (McCulloch & Pitts, 1943). This foundational work established that neural networks of sufficient complexity could compute any logical function. Building on this, Frank Rosenblatt introduced the perceptron in 1958, implementing simple but effective algorithms for supervised learning of binary classifiers (Rosenblatt, 1958). The perceptron mimicked a single neuron's function by taking multiple inputs, applying weights, and producing a binary output based on a threshold, representing one of the earliest practical implementations of neural computation.

Despite the many limitations of these early approaches, they paved the way for more complex artificial neural networks. Current deep learning architectures, with their multiple layers of interconnected nodes, bear a striking resemblance to the hierarchical structure of the human brain's neural pathways (McClelland et al., 1986). This biomimicry has proven remarkably effective, as networks can learn representations through dynamic, distributed interactions within networks of simple neuron-like processing units (Hassabis et al., 2017). These biological inspirations have led to significant advances in the pattern recognition, natural language processing, and decision-making capabilities of AI systems, establishing what McClelland and colleagues called "Parallel Distributed Processing" (PDP), which more closely approximates how actual neural circuits perform computation (McClelland et al., 1986; Schmidhuber, 2015).
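As a minimal sketch of the mechanism described above (an illustration in modern Python, not Rosenblatt's original implementation), a perceptron is a weighted sum passed through a threshold, trained with the classic error-correction rule. Here it learns the logical AND function:

```python
import numpy as np

# Training data for logical AND; the last input column is a constant bias term.
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(3)   # weights, including the bias weight
lr = 0.1          # learning rate

# Perceptron learning rule: nudge weights toward misclassified examples.
for _ in range(20):
    for xi, target in zip(X, y):
        pred = int(xi @ w > 0)          # threshold activation
        w += lr * (target - pred) * xi  # error-correction update

preds = (X @ w > 0).astype(int)
print(preds)  # [0 0 0 1]
```

Because AND is linearly separable, the rule converges; a single perceptron famously cannot learn XOR, which is one reason multi-layer networks became necessary.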
AI algorithms, at their core, are complex computational processes that manipulate and analyze data to perform tasks that typically require human intelligence. The field of AI has both benefited from and driven advances in computational power and efficiency: since 2012, the computing power used in the largest AI training runs has grown exponentially, increasing by roughly 10× per year (Thompson et al., 2022). The development of AI has been closely tied to Moore's Law and specialized hardware acceleration. While traditional CPU improvements have slowed, the introduction of GPU-based deep learning initially yielded 5-15× speedups, which grew to more than 35× by 2012 (Thompson et al., 2022). This enabled breakthrough achievements like AlexNet's victory in the 2012 ImageNet competition, which achieved a top-5 error rate of 16.4% using deep convolutional neural networks (Krizhevsky et al., 2012). The quest for more efficient AI computation has led to specialized hardware such as Google's Tensor Processing Unit (TPU), which offers 92 TeraOps/second of performance through a 65,536-unit 8-bit MAC matrix multiply array (Jouppi et al., 2017). And new players like Groq and Cerebras, with their Language Processing Units (LPUs), are raising the TeraOps/second bar even further in late 2024.
However, these computaonal demands are
growing at a concerning rate. Research
shows that computaonal requirements for
deep learning are scaling polynomially with
performance improvements - for example,
halving remaining error rates can require
over 5,000× more computaon (Thompson
et al., 2022). This rapid escalaon in
compung needs raises important quesons
about the economic and environmental
sustainability of current deep learning
approaches.
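To make that scaling concrete, a short back-of-the-envelope calculation (our illustration, taking the 5,000×-per-halving figure quoted above at face value) shows the polynomial exponent it implies and the compute cost of successive halvings of the error rate:

```python
import math

# If halving the error costs 5,000x more compute, and compute ~ error^(-a),
# then 2^a = 5000, so a = log2(5000).
a = math.log2(5000)
print(f"implied exponent a = {a:.1f}")  # compute grows roughly as error^-12.3

# Compute multiplier to drive a 16% error rate down to 4% (two halvings):
multiplier = 5000 ** 2
print(f"16% -> 4% error needs {multiplier:,}x the original compute")
```

Two halvings already imply a 25-million-fold compute increase, which is the arithmetic behind the sustainability concern raised above.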
2.2. The Social Brain Hypothesis

The structures and networks involved in the development of g, as studied in the P-FIT theory, are located in the neocortex. The development of the neocortex is particularly significant in the Homo genus compared to other species, and especially in Homo sapiens versus Homo habilis, Homo erectus, and non-human primates like chimpanzees, bonobos, gorillas, and orangutans, all of which show higher intelligence than other animals. The most crucial factor in the unique development of the neocortex in Homo sapiens has been social pressure (Humphrey, 1976). Through socially oriented intelligence, individuals need to know, interact with, predict, remember, and influence the behavior of other group members. This intelligence, termed Machiavellian by Richard Byrne and Andrew Whiten, suggests that deception and persuasion are essential traits in competitive social environments, aiding individual survival. Human intelligence may therefore result not from improved hunting, gathering, or similar skills, but from the demands of social life itself (Byrne & Whiten, 1988).

A strong empirical confirmation of this hypothesis comes from the so-called Dunbar number, which expresses the relationship between brain size, specifically the neocortex, and the size of social groups. For humans, Dunbar's number is around 150 individuals, corresponding to our neocortex size; in comparison, chimpanzees form small groups of about 30-40 individuals (Dunbar, 1993). In the social brain hypothesis, Dunbar suggests that large-scale cooperation, joint coordination, group complexity, and relationship management require a cognitive capacity superior to that of other primates and species.
Fig. 2. Dunbar's number: group size versus neocortex size ratio.
To manage social life, humans need the ability to attribute mental states to others and to understand that they possess thoughts, intentions, beliefs, and emotions that may be similar to or different from one's own. This ability to perceive others as active, independent agents and to remember past interactions is known as mentalization, or theory of mind (Nowak & Sigmund, 2005). Although other primates, mainly great apes, exhibit mentalization abilities, their level is quite rudimentary. In the human brain, mentalization emerges around 4-5 years of age, coinciding with various stages of neurodevelopmental maturation, particularly in the temporoparietal and medial prefrontal cortices (Premack & Woodruff, 1978; Carrington & Bailey, 2009). As a result of social selective pressure and the need for effective communication to facilitate cooperation, language is undeniably a unique tool with a distinct specialization in humans, possessing a syntactic and semantic structure rooted in logic, closely aligned with the pure concept of intelligence as the ability to adapt to a changing environment. Language enables the efficient transmission of concrete and abstract information, fostering stronger bonds, overcoming challenges, and solving complex problems (Dunbar, 1996; Tomasello, 2014; Dunbar & Schultz, 2007). In fact, the creation of culture and, thus, the long-term modification of intelligence requires language (Astington & Baird, 2005). Communication through language from parents to children promotes the development of complex mental states and precise guidance in building cognitive skills (Dunn & Brophy, 2020). Interestingly, language involves the maturation of left-dominant frontotemporoparietal areas. Some of these structures, such as the arcuate fasciculus, show visible differences from other primates and are essential for language comprehension and production (Friederici, 2017; Fitch, 2020).
2.3. Trying to Build a Digital Brain
In the case of its silicon counterpart, one of the key developments was the use of layers to increase the capacity of neural networks, a fundamental discovery leading to the deep learning revolution. By stacking multiple layers, neural networks can learn hierarchical representations of data, capturing both low-level and high-level features. As demonstrated by Rumelhart et al. (1986), intermediate 'hidden' units can represent important features of the task domain, with regularities captured through unit interactions. This depth enables models to tackle complex tasks in domains like computer vision, natural language processing, and speech recognition. LeCun et al. (1989) showed that adding successive layers allows networks to detect and combine local features into higher-order features, similar to biological visual systems. However, increasing the number of layers introduces challenges such as vanishing gradients and computational inefficiency. Later developments have addressed these issues through architectural innovations: techniques like residual connections and dropout have allowed gradients to flow more effectively during training, enabling the construction of very deep networks with less performance degradation. Moreover, as evidenced in early work by LeCun et al. (1989), architectural constraints and weight sharing can help reduce free parameters while maintaining computational power, optimizing the balance between model depth and cost. The continuous exploration of deeper architectures remains a critical area of research, pushing the boundaries of what neural networks can achieve in the dream of replicating a real human brain.
Determining the opmal way to connect
arcial neurons is fundamental, since the
connecvity paern dictates how
informaon ows and is processed within
the network, inial architectures like fully
connected layers are simple but
computaonally intensive and prone to
overng, to address these issues,
researchers have developed specialized
connecon schemes (LeCun et al., 1998).
Convoluonal neural networks (CNNs)
connect neurons in a localized manner,
leveraging spaal hierarchies in data, which
is parcularly eecve for image processing
tasks, using this approach resulted in
successful document recognion tasks by
using local recepve elds and weight
sharing to reduce the number of free
parameters (LeCun et al., 1998). Recurrent
neural networks (RNNs) introduce
connecons over me steps, making them
suitable for sequenal data like text, speech
and other me series. Long Short-Term
Memory (LSTM) networks specically
address the vanishing gradient problem
through specialized memory cells and
gang mechanisms (Hochreiter &
Schmidhuber, 1997). Graph neural networks
(GNNs) allow neurons to be connected
based on arbitrary graph structures,
enabling the processing of non-Euclidean
data. For instance, Graph Convoluonal
Networks (GCNs) have demonstrated
success in semi-supervised classicaon
tasks by eciently propagang informaon
through graph structures (Kipf & Welling,
2017). This builds upon earlier work showing
the importance of selecve informaon
processing, as demonstrated in LSTM
architectures (Hochreiter & Schmidhuber,
1997), A recent development in this lines are
the Extended Long Short-Term Memory or
xLSTM migang some of the previous
issues like speed, memory or normalizaon
(Maximilian Beck, et al 2024)
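The gating mechanisms described above can be illustrated with a minimal sketch of a single LSTM time step in numpy. This is a toy illustration of the standard cell equations, not code from any cited work; the function name `lstm_step`, the dimensions, and the random weights are all arbitrary choices for demonstration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step: forget, input, and output gates regulate
    what the memory cell discards, stores, and exposes."""
    z = W @ np.concatenate([x, h_prev]) + b   # all four gate pre-activations
    n = h_prev.size
    f = sigmoid(z[0*n:1*n])       # forget gate
    i = sigmoid(z[1*n:2*n])       # input gate
    o = sigmoid(z[2*n:3*n])       # output gate
    g = np.tanh(z[3*n:4*n])       # candidate cell update
    c = f * c_prev + i * g        # additive cell update eases gradient flow
    h = o * np.tanh(c)            # hidden state passed to the next step
    return h, c

rng = np.random.default_rng(0)
n_in, n_hid = 4, 8
W = rng.normal(scale=0.1, size=(4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):                # run a short input sequence
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
```

The additive form of the cell update (`c = f * c_prev + i * g`) is the key design choice: it lets error signals travel across many time steps without being repeatedly squashed, which is how LSTMs mitigate the vanishing gradient problem.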
The design of models and architectures is a foundational aspect, and current trends emphasize the development of lightweight models that can operate on edge devices with limited resources. Drawing inspiration from early work on efficient architectures like LeNet (LeCun et al., 1998), techniques such as model pruning, quantization, and knowledge distillation have been employed to reduce model size and complexity without significant loss in accuracy. Approaches like GCNs further demonstrate how careful architectural design can achieve linear computational complexity while maintaining high performance (Kipf & Welling, 2017).
Training and inference are two crical
phases in the lifecycle of an AI model. During
training, the model learns paerns from
data by adjusng its parameters to minimize
a loss funcon through techniques like
backpropagaon (Rumelhart et al., 1986).
Inference involves using the trained model
to make predicons or decisions based on
new input data. The eciency and
eecveness of both phases are vital for real
world applicaons.
Advancements in opmizaon algorithms,
such as adapve learning rate methods like
Adam, have accelerated training
convergence by combining the benets of
AdaGrad's ability to handle sparse gradients
with RMSProp's eecveness in non-
staonary sengs (Kingma & Ba, 2014). The
introducon of momentum terms and bias
correcon in opmizaon methods has
helped prevent stagnaon during training
(Kingma & Ba, 2014). For example, Adam's
adapve moment esmaon approach
automacally adjusts learning rates for
each parameter while requiring minimal
hyperparameter tuning (Kingma & Ba,
2014).
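The combination described above (momentum on the gradient, RMSProp-style scaling, bias correction) can be sketched in a few lines. This is a minimal single-parameter illustration of the published Adam update rule; the function name `adam_step`, the learning rate, and the toy objective f(x) = x² are choices made here for demonstration.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014): first moment m acts as
    momentum, second moment v rescales per-parameter step sizes,
    and bias correction compensates for zero initialization of m, v."""
    m = b1 * m + (1 - b1) * grad       # first moment estimate
    v = b2 * v + (1 - b2) * grad**2    # second moment estimate
    m_hat = m / (1 - b1**t)            # bias correction (t starts at 1)
    v_hat = v / (1 - b2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# minimize f(x) = x^2 (gradient 2x) starting from x = 3
theta, m, v = np.array([3.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.05)
```

Because the step is normalized by the square root of the second moment, the effective step size stays close to the learning rate regardless of the raw gradient scale, which is why Adam needs so little per-problem tuning.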
Distributed training has become essenal
for handling large datasets and complex
models. The development of ecient
gradient-based methods that can work with
stochasc objecve funcons has enabled
beer scaling of training processes across
mulple devices (Kingma & Ba, 2014).
Furthermore, the discovery that neural
networks can learn useful internal
representaons through proper weight
adjustment techniques, as demonstrated by
Rumelhart et al. (1986), also has been key.
Their work showed how mul-layer
networks can develop internal
representaons that capture important
features of the task domain through the
back-propagaon of errors.
The focus on opmizing both training and
inference connues to be a signicant area
of research, especially with the growing
demand for real-me AI applicaons. Early
breakthroughs in understanding how neural
networks learn representaons (Rumelhart
et al., 1986) have evolved into sophiscated
opmizaon methods that address praccal
challenges in current best deep learning
systems.
2.4. The Families of AIs
The journey from human brains to artificial ones encompasses a diverse array of approaches, methodologies, and applications; to help organize what we commonly call Artificial Intelligence, various taxonomies have been needed to categorize and understand its components. One fundamental taxonomy divides AI into narrow (or weak) AI and general (or strong) AI. Narrow AI refers to systems designed to perform specific tasks, while general AI aims to replicate human-level intelligence across a wide range of cognitive tasks (Chollet, 2019). Another common classification is based on the AI system's underlying approach: rule-based systems, machine learning, and deep learning (Barredo Arrieta et al., 2020). Rule-based systems rely on predefined rules and logic to make decisions. Machine learning algorithms, in contrast, learn patterns from data without explicit programming (Nilsson, 1983).
AI can also be categorized by its primary function or application domain. This includes categories such as natural language processing, computer vision, robotics, expert systems, and planning and decision-making systems (McCarthy et al., 1955). Each of these domains has its own set of techniques, challenges, and benchmarks.
From a philosophical perspective, AI taxonomies often consider the system's cognitive capabilities, reflecting increasing levels of sophistication and autonomy in AI systems (Chollet, 2019). Ethical taxonomies for AI have also emerged, focusing on aspects such as transparency, fairness, accountability, and privacy (Barredo Arrieta et al., 2020). These classifications help in assessing the societal impact and responsible development of AI technologies. As highlighted by Nilsson (1983), the maturation of AI as a scientific field requires clear taxonomies to understand "what sets us apart from adjacent disciplines" and to establish AI's unique niche within the broader landscape of intelligent systems.
There is also the dichotomy between symbolic AI and connectionist AI, which represents two fundamental approaches to artificial intelligence.
Symbolic AI, also known as classical AI or GOFAI (Good Old-Fashioned AI), is based on the manipulation of symbolic representations of knowledge. According to Newell and Simon (1976), physical symbol systems provide "the necessary and sufficient means for general intelligent action," where intelligence emerges from the manipulation of symbols and expressions through defined processes. Symbolic AI systems use formal logic, decision trees, and expert systems to process information and make decisions. They excel in domains where knowledge can be explicitly encoded. The strength of symbolic AI lies in its interpretability and its ability to handle complex reasoning tasks. However, it struggles with tasks requiring pattern recognition or handling uncertainty.
Conneconist AI, on the other hand, is
inspired by the structure and funcon of
biological neural networks. As described by
LeCun, Bengio, and Hinton (2015), deep
learning methods allow computaonal
models to learn representaons of data with
mulple levels of abstracon, discovering
intricate paerns in large datasets. These
systems excel in tasks such as image and
speech recognion, where paerns are
complex and dicult to specify explicitly.
They are parcularly adept at handling noisy
or incomplete data. However, their decision-
making process can be opaque, leading to
challenges in interpretability and
explainability.
The debate between symbolic and connectionist approaches has evolved over time. While early AI research was dominated by symbolic methods, the resurgence of neural networks led to significant advancements in connectionist AI, with deep learning achieving breakthrough results in areas like speech recognition and visual object recognition (LeCun et al., 2015). Today, many researchers recognize the complementary nature of these approaches and seek to combine them in hybrid systems.
Neuro-symbolic AI, as discussed by Garcez et al. (2015), aims to integrate the strengths of both paradigms. The goal of neural-symbolic computation is to integrate robust connectionist learning with sound symbolic reasoning, combining the pattern recognition capabilities of neural networks with the logical reasoning of symbolic AI. This integration addresses one of the main challenges of artificial intelligence: the effective combination of learning and reasoning (Garcez et al., 2015). This hybrid approach holds promise for developing more robust and versatile AI systems capable of both learning from data and reasoning with explicit knowledge.
2.5. Winters and Summers in the Quest for AI
Just as our understanding of the brain and the essence of human intelligence evolved through several waves of theories, experiments, and validations in a centuries-long attempt to understand it, the history of artificial intelligence has been characterized by alternating periods of high expectations and enthusiasm (summers) followed by disappointment and reduced funding (winters). This cyclical pattern has significantly influenced the development and perception of AI technology (Floridi, 2020).
The rst AI summer began in the 1950s with
the birth of AI as a eld. As described by
Crevier (1993), pioneering work by
researchers at places like MIT's Arcial
Intelligence Laboratory led to opmisc
predicons about AI's potenal, with early
demonstraons including computers
controlling robot arms and manipulang
block structures. However, these early AI
experiments, while impressive to watch,
proved limited to carefully simplied
problems in restricted areas, leading to the
rst AI winter when military funding was
reduced (Crevier, 1993).
The 1980s brought a resurgence of interest in AI, driven by the commercial success of expert systems. Crevier (1993) notes that expert systems were promoted as specialized AI that could capture human decision-making processes for narrowly focused tasks, from medical diagnosis to oil exploration site selection. However, as Hendler (2008) explains, when the expert systems market failed, it rekindled interest in alternative approaches like artificial neural networks.
The current AI landscape looks as if it could follow this pattern. Floridi (2020) warns of another predictable winter approaching, arguing that AI has been subject to these hype cycles because it represents a long-held hope of creating something that does everything for us. He criticizes commentators and "experts" who competed to tell the "tallest tale," spreading myths about AI as either an ultimate panacea or a final catastrophe (Floridi, 2020). Hendler (2008) suggests that avoiding future AI winters requires documenting successes, embracing applied AI rather than disowning it, and pulling together as a field while acknowledging both achievements and remaining challenges.
The current AI summer began in the early 2010s, between 2012 and 2013, and has been fueled by advancements in machine learning, particularly deep learning. Breakthroughs in areas such as image recognition, natural language processing, and AI systems winning at games like Go have reignited excitement about AI's potential. This period has seen unprecedented investment in AI research and applications across all industries, and for the moment the new winter predicted by Floridi does not appear to be on the horizon.
3. The Predicve Brain
Human intelligence, by denion, must be
exible to adapt to circumstances, swi to
reliably manage soluons, and predicve in
ancipang needs. The concept of the brain
as a smulus-response mechanism became
popular thanks to Pavlov’s studies on
reexes and the rise of behaviorism in the
early 20th century, led by Watson and
Skinner. For behaviorism, the brain and
human behavior were reacve, responding
to specic smuli. Behavior could be
modied through incenves, punishments,
and rewards. Although reexes and operant
responses are relevant to understanding
behavior, a more advanced view gained
prominence over me. The brain acvely
creates models and hypotheses about the
world without waing for smuli, predicng
what it will encounter and using sensory
feedback to adjust these hypotheses and
predicons. Percepon is not a passive
boom-up construcon but an acve,
predicve model that operates according to
a Bayesian model of what is most likely to
occur based on prior encoding in memory
and the current environment.
Karl Friston is one of the pioneers of this perspective. He introduced the free energy principle, which posits that the brain anticipates sensory input through a model of the world, adjusting the discrepancy between expectation and perception and thus minimizing free energy, which represents the uncertainty inherent to any biological system at a given moment. The greater the free energy, the larger the discrepancy between prediction and actual sensory input. Prediction and prediction error are expressed in opposite directions, top-down and bottom-up, respectively. The system is thus hierarchical but bidirectional, with perception and action constantly feeding each other (Friston & Stephan, 2007).
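The core idea, a belief updated by descending precision-weighted prediction errors, can be shown in a deliberately simplified numerical toy. This is not Friston's formal derivation, only a one-variable caricature of prediction-error minimization; the function name `perceive`, the precisions, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def perceive(obs, mu_prior, pi_obs=1.0, pi_prior=1.0, lr=0.1, steps=50):
    """Toy predictive-coding inference: the belief mu descends the
    precision-weighted squared prediction errors (a crude stand-in
    for free energy) between prior expectation and sensory input."""
    mu = mu_prior
    for _ in range(steps):
        err_obs = obs - mu          # bottom-up sensory prediction error
        err_prior = mu_prior - mu   # top-down prior prediction error
        mu += lr * (pi_obs * err_obs + pi_prior * err_prior)
    return mu

# with equal precisions the belief settles at the precision-weighted
# average of prior (0.0) and observation (2.0), i.e. 1.0
mu_post = perceive(obs=2.0, mu_prior=0.0)
```

Raising `pi_obs` relative to `pi_prior` pulls the settled belief toward the observation, which mirrors the hedged intuition that precision weighting decides how much a prediction error is allowed to revise the model.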
If the intense socializaon of our species was
fundamental for the development of
complex intelligence, this predicve nature
must also apply to social contexts. Theory of
Mind, in which we form ideas about others'
intenons and thoughts, is an acve
inference, ancipang others’ behaviors
based on past experiences, the environment,
and context. If others behave as expected,
predicon error does not arise. Interesngly,
in ausm spectrum disorder, neural
networks associated with the social brain
are altered, emphasizing that our social
cognion is also predicve (Frith & Frith,
1999). Adjustments in our social inferences
about others inuence self-percepon and
future adapve capacity (Frith, 2007).
On an emoonal level, the same principle
applies. Although the reacve view of
emoons has been dominant, primarily due
to proponents of basic emoons theory
(Ekman, 1999), predicon error is a central
mechanism in construcng emoons. When
somac signals or bodily assumpons do
not match predicons, the brain adjusts its
predicon (Barre, 2016). Mental
condions such as anxiety or depression are
also linked to interocepve predicons,
generang exaggerated or maladapve
simulaons about imminent dangers, bodily
alert states, or available resources to cope
with situaons (Seth & Friston, 2016).
Interocepon operates like another sense,
sending informaon to the brain about
internal physiological processes, such as
respiraon, temperature, pH, heart rate,
hunger, thirst, and available energy (Barre
& Simmons, 2015). The insula and anterior
cingulate cortex are the primary
interocepve brain regions, connecng the
body and mind (Craig, 2009). Interocepve
signals ascend via the vagus and spinal
nerves to the brainstem and then to the
insular cortex, where we become aware of
bodily states. Based on this state, we make
decisions and emoonal responses,
involving the anterior cingulate cortex
(Critchley et al., 2004). Hypo- or
hyperacvaon of the insula leads to
interocepve disorders of the bodily “self
(Khalsa et al., 2018; Avery et al., 2014). The
sensaon of a connuous self is also linked
to interocepon (Northo & Panksepp,
2008).
Fig 3. Predicve neurons and error detecon neurons
From the predicve view, what we call
reality is merely an eecve simulaon
constructed by the brain, based on past
experiences and available sensory
informaon. When a simulated reality is
eecve depends on the level of consensus
it reaches and the tasks it enables. In mental
disorders, such as certain psychoses and
schizophrenia, predicons generate
delusions or hallucinaons that fail to
achieve consensus with others or to become
eecve predicons (Hohwy, 2013; Fletcher
& Frith, 2009). Andy Clark extends the
concept of predicon to include acon itself,
proposing that the brain predicts what it
perceives as well as the consequences of the
acons it executes bidireconally (Clark,
2015). Interocepon again plays a crucial
role in predicon. Cognion is thus
embodied, as the brain dynamically
interacts with the body and environment
(Clark, 2016; Barre et al., 2007). Language
does not escape this view; in
communicaon, we ancipate what the
other will convey, inferring or acvely
modifying meanings as we listen, from a
word to a narrave (Lindquist et al., 2015;
Lindquist & Gendron, 2013). Consciousness
would be an outcome of acve predicon,
where the brain displays the best opon
regarding bodily state and sensory input
(Seth, 2014). An exaggerated or impossible
mismatch between predicon and
predicon error causes unusual experiences,
such as dissociaon, out-of-body
experiences, or similar phenomena (Seth,
2021). Acve consciousness allows us to
assess learning processes and improve
behavior over me, favoring reecve
decision-making (Fleming & Frith, 2014;
Dehaene, 2014). Various authors extend the
predicve sense proposed by Friston and
Clark to include the sense of me and self as
a temporal agent. George Northo
proposes that the “selfis a core construct
generated from predicve processes.
Predicon depends on the temporal
synchrony of dierent brain regions through
neural oscillaons (Northo, 2014; Buzsáki,
2006). This internal rhythm of the brain is
independent and precedes any smulus. In a
state of apparent rest, an internal default
mode operates, with the brain in constant
preparaon through neural networks that
synchronize and ancipate possible
responses.
3.1. Encoding and Decoding Intelligence
If we try to find a biological correlation in artificial systems, encoder-decoder architectures play a pivotal role in tasks that involve transforming input data into different formats or representations. The encoder processes the input data and compresses it into a latent representation, capturing essential features, while the decoder reconstructs or translates this representation into the desired output format (Hinton & Salakhutdinov, 2006). As demonstrated in the groundbreaking work by Cho et al. (2014), this architecture is fundamental in applications like machine translation, where "one RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols."
The architecture's eecveness lies in its
ability to jointly train both components to
maximize the condional probability of the
target sequence given the source sequence
(Cho et al., 2014). In dimensionality
reducon applicaons, autoencoders can
learn low-dimensional codes that
signicantly outperform tradional
methods like principal components analysis,
parcularly when inialized through careful
pre-training procedures (Hinton &
Salakhutdinov, 2006). Aenon
mechanisms have enhanced encoder-
decoder models by allowing the decoder to
focus on specic parts of the input during
output generaon. This has signicantly
improved performance in various tasks,
building upon the foundaon laid by early
RNN Encoder-Decoder models where "both
yt and h(t) are condioned on yt-1 and on the
summary c of the input sequence" (Cho et
al., 2014). The exibility and eecveness of
encoder-decoder architectures make them a
cornerstone in the development of advanced
AI systems, capable of learning meaningful
representaons across diverse data types
and tasks (Hinton & Salakhutdinov, 2006).
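The encode-compress-decode loop described above can be sketched with the smallest possible instance: a linear autoencoder trained by gradient descent on toy data that actually lies near a low-dimensional subspace. This is an illustrative sketch only, not the architecture of any cited paper; the data dimensions, learning rate, and iteration count are arbitrary assumptions made here.

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data lying near a 2-D subspace of a 10-D ambient space
A = rng.normal(size=(2, 10)) / np.sqrt(10)
Z_true = rng.normal(size=(200, 2))
X = Z_true @ A + 0.01 * rng.normal(size=(200, 10))

# linear autoencoder: W_e compresses to a 2-D latent code,
# W_d reconstructs; both trained by gradient descent on squared error
W_e = rng.normal(scale=0.1, size=(10, 2))
W_d = rng.normal(scale=0.1, size=(2, 10))
lr = 0.1
for _ in range(5000):
    Z = X @ W_e                     # encode: latent representation
    X_hat = Z @ W_d                 # decode: reconstruction
    err = X_hat - X
    W_d -= lr * Z.T @ err / len(X)  # gradient of mean squared error
    W_e -= lr * X.T @ (err @ W_d.T) / len(X)

mse = np.mean((X @ W_e @ W_d - X) ** 2)
```

With purely linear layers the learned code spans roughly the same subspace PCA would find; the advantage Hinton & Salakhutdinov report comes from adding nonlinearities and depth, which this sketch deliberately omits for brevity.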
Variaonal Autoencoders (VAEs) are a class
of generave models that extend tradional
autoencoders by incorporang probabilisc
elements into the encoding process (Kingma
& Welling, 2013). VAEs encode input data
into a latent space characterized by a
probability distribuon, typically a
Gaussian, where the true posterior
distribuon is approximated using
variaonal inference techniques (Rezende et
al., 2014). This approach allows for the
generaon of new data samples by
sampling from the latent space and
decoding the samples back into the data
space. The training objecve of VAEs
includes both the reconstrucon loss and a
regularizaon term in the form of a KL
divergence that encourages the latent
distribuon to match a prior distribuon
(Kingma & Welling, 2013). This balance
enables VAEs to generate diverse and
coherent outputs, making them valuable in
tasks like image synthesis, anomaly
detecon, and data augmentaon. Follow-
up research has shown that carefully
designed neural architectures for VAEs
achieve state-of-the-art results in image
generaon tasks (Vahdat & Kautz, 2020).
VAEs have been combined with other
architectures, such as convoluonal layers
for image data and recurrent layers for
sequenal data, to enhance their generave
capabilies. The development of VAEs
represents a signicant step forward in
unsupervised learning and generave
modeling, oering a principled approach to
learning both the generave model p(x|z)
and recognion model q(z|x) jointly through
the reparameterizaon trick (Rezende et al.,
2014).
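The two pieces of the VAE objective, reconstruction loss plus KL regularizer, and the reparameterization trick can be written down concretely for a diagonal Gaussian encoder. This is a sketch of the standard formulas, detached from any particular network; the function names and the squared-error reconstruction term are simplifying assumptions (real VAEs typically use a likelihood-based reconstruction term).

```python
import numpy as np

def vae_objective(x, x_hat, mu, log_var):
    """Negative ELBO for a VAE (Kingma & Welling, 2013): reconstruction
    error plus KL(q(z|x) || N(0, I)), which has a closed form when
    q(z|x) is a diagonal Gaussian with mean mu and log-variance log_var."""
    recon = np.sum((x - x_hat) ** 2)                       # reconstruction loss
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return recon + kl

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps so gradients can flow through mu, sigma
    while the randomness is isolated in eps ~ N(0, I)."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu, log_var = np.zeros(3), np.zeros(3)   # q(z|x) = N(0, I): KL term is zero
z = reparameterize(mu, log_var, rng)
loss = vae_objective(np.ones(4), np.ones(4), mu, log_var)
```

The KL term is exactly zero when the encoder outputs the prior itself, which makes the regularizer's role visible: any deviation of `mu` or `log_var` from the standard normal adds a penalty that the reconstruction term must pay for.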
The concepts of adding noise and later denoising it are integral to improving the robustness and generalization of AI models. Introducing noise during training acts as a regularization technique. For instance, dropout randomly "drops" units along with their connections during training to prevent units from co-adapting too much, thereby reducing overfitting (Srivastava et al., 2014). Vincent et al. (2008) demonstrated that corrupting inputs and training models to reconstruct the original data encourages learning more robust features, as evidenced by their work on denoising autoencoders, which showed improved classification performance compared to traditional autoencoders.
Denoising techniques are employed to recover original data from corrupted or noisy inputs. Vincent et al. (2008) explain that denoising autoencoders learn to map corrupted examples back to uncorrupted ones, effectively capturing stable structures and dependencies in the input distribution. This is important because it forces the model to learn useful features rather than simply copying the input. The approach can be understood from a manifold learning perspective, where the model learns to project corrupted examples back onto the manifold of natural examples (Vincent et al., 2008).
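The training setup just described hinges on one asymmetry: the model receives a corrupted input but is scored against the clean one. A minimal sketch of the masking-noise corruption (one of the corruption processes Vincent et al. study) makes this explicit; the function name `corrupt` and the corruption probability are illustrative choices.

```python
import numpy as np

def corrupt(x, rng, p=0.3):
    """Masking corruption used in denoising autoencoders
    (Vincent et al., 2008): each input component is independently
    set to zero with probability p."""
    mask = rng.random(x.shape) >= p
    return x * mask

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))          # a small batch of clean inputs
x_tilde = corrupt(x, rng)            # what the encoder actually sees
reconstruction_target = x            # what the decoder is scored against
```

Because zeroed components carry no information, the network can only reconstruct them from the surviving ones, which is precisely why it is forced to learn the dependencies between input dimensions rather than the identity map.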
Follow-up advancements include denoising diffusion probabilistic models, which generate high-quality data samples by iteratively refining noise-added inputs. Ho et al. (2020) demonstrated that these models can achieve state-of-the-art FID scores on image generation tasks by learning to reverse a diffusion process that gradually adds noise to the data. This frames the generation process as learning a sequence of denoising steps, yielding a new class of generative models that naturally admit a progressive lossy decompression scheme (Ho et al., 2020).
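The forward half of this process, gradually adding noise, has a convenient closed form that can be sketched directly. The linear beta schedule below matches the kind used by Ho et al., but the dimensions and variable names are illustrative; the learned reverse process (the actual generative model) is omitted.

```python
import numpy as np

def noisy_sample(x0, t, alphas_cumprod, rng):
    """Forward diffusion in closed form (Ho et al., 2020):
    q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I),
    so any noise level t is reachable in a single step."""
    abar = alphas_cumprod[t]
    eps = rng.normal(size=x0.shape)
    return np.sqrt(abar) * x0 + np.sqrt(1.0 - abar) * eps, eps

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule
alphas_cumprod = np.cumprod(1.0 - betas)  # abar_t = prod(1 - beta_s)

rng = np.random.default_rng(0)
x0 = rng.normal(size=16)
x_last, eps = noisy_sample(x0, T - 1, alphas_cumprod, rng)
# abar at the final step is near zero, so x_T is close to pure Gaussian
# noise; training teaches a network to predict eps at each t, which is
# what lets sampling run the chain in reverse from noise to data
```

The closed form matters in practice: training can pick a random `t` per example and jump straight to that noise level instead of simulating the whole chain.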
Transformers have revoluonized natural
language processing by introducing self-
aenon mechanisms that capture global
dependencies in data. Unlike tradional
sequenal models, Transformers process
input data in parallel, signicantly
improving training eciency and
performance (Vaswani et al., 2017). This
architecture has been the foundaon for
large language models (LLMs) from BERT,
which introduced bidireconal pre-training
(Devlin et al., 2018), follow-up models like
GPT-3 that demonstrate strong few-shot
learning capabilies (Brown et al., 2020),
and current models (2024) sll share the
essence of transformers like GPT-o3, Claude
3.5, Gemini 2 or LlamA 3.3
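The self-attention mechanism at the core of all these models reduces to a short computation, sketched here as single-head scaled dot-product attention with random toy inputs (no learned projections, masking, or multiple heads, which full Transformers add on top).

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention (Vaswani et al., 2017): every
    position attends to every other position in parallel, capturing
    the global dependencies that step-by-step RNNs struggle with."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # pairwise query-key similarities
    weights = softmax(scores, axis=-1)    # each row is a distribution
    return weights @ V, weights           # weighted mixture of values

rng = np.random.default_rng(0)
seq_len, d_model = 6, 8
Q = rng.normal(size=(seq_len, d_model))
K = rng.normal(size=(seq_len, d_model))
V = rng.normal(size=(seq_len, d_model))
out, w = attention(Q, K, V)
```

Because the whole sequence is handled as one matrix product rather than a recurrence, the computation parallelizes across positions, which is the source of the training-efficiency gains noted above.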
LLMs have demonstrated remarkable capabilities in understanding and generating human-like text, performing tasks such as translation, summarization, and question-answering with high proficiency. These models are pre-trained on vast datasets and can be fine-tuned for specific applications, achieving state-of-the-art results across various benchmarks. For example, BERT achieved significant improvements on eleven NLP tasks (Devlin et al., 2018), while models like LLaMA have shown competitive performance with much greater efficiency (Touvron et al., 2023).
The scaling of Transformers and LLMs has raised important considerations regarding computational resources and ethical implications. Brown et al. (2020) note the substantial computational requirements for training large models, while Touvron et al. (2023) demonstrate efforts to develop more efficient models. Additionally, research into mitigating biases and ensuring responsible use is ongoing, as highlighted by both Brown et al. (2020) and Touvron et al. (2023), emphasizing the importance of aligning AI advancements with societal values.
3.2. Where Do Predicons Occur in the
Brain?
Within the bidireconal process, predicon
operates top-down, while sensory
informaon and predicon error ow
boom-up. Higher corcal areas are thus
candidates for generang and adjusng
predicons about our body, environment,
and acons. Among these, regions related
to the social brain, which dier the most
from those in other primates, are prime
candidates for being the main players in our
acve predicons and inferences. The
dorsolateral prefrontal cortex evaluates
long-term consequences of future acons
and adapts behavior when a predicon
error arises (Miller & Cohen, 2001; Duncan,
2013). The anterior cingulate cortex
monitors error, detects conicts, and adjusts
predicons between expectaons and
reality (Botvinick et al., 2004), parcipang
in learning and decision-making (Shenhav et
al., 2013). Even in primary sensory or motor
areas, low-level predicons are made about
sensory smuli and motor acons. In the
primary visual cortex, aributes such as
color, orientaon, shape, contrast, and
movement are inferred (Rao & Ballard,
1999). In the primary motor cortex,
ancipated outcomes of motor acons are
planned, allowing real-me movement
adjustments (Keller & Mrsic-Flogel, 2018).
Although specic regions are involved,
operaonal funconing is hierarchical and
networked. The default mode network and
fronto-parietal execuve network are the
most prominent high-level predicve
networks involving some of the structures
menoned. Through the default network,
self-reecon, memory, and future planning
occur, primarily via dynamic communicaon
between the medial prefrontal cortex,
posterior cingulate cortex, and
hippocampus (Northo & Huang, 2017;
Raichle, 2015). The fronto-parietal or
execuve network, linked with intelligence,
includes the dorsolateral cortex, inferior
parietal lobule, and anterior insula, and is
involved in decision-making and cognive
control (Cole, 2014). While the default
network simulates future scenarios, the
execuve adjusts predicons in response to
environmental changes. Slower frequencies
in these networks relate to large-scale
informaon integraon and future state
predicon, while faster ones in primary
corces relate to immediate percepon,
novel smuli adaptaon, and predicon
error correcon (Palva & Palva, 2018;
Friston, 2010; Northo & Huang, 2017).
Local, rapid predicon in primary corces
occurs through corcal columns, the basic
funconal units of the cortex. The corcal
column is distributed across six layers that
implement local predicve coding. When
these columns detect a predicon error, they
send signals upwards for the default or
execuve network to adjust global
predicons. This informaon exchange is
temporally synchronized (Bastos et al.,
2012). For example, if an object approaches
us, layers II and III of the corcal column
ancipate its trajectory; layer IV receives
sensory informaon from the thalamus, and
if there’s a match, predicon error does not
occur. If the object changes direcon, layers
V and VI send predicon error informaon
to higher layers to modify the inference.
Regardless of whether the smulus is visual,
auditory, emoonal, cognive, social, or
interocepve, the upper layers (I, II, and III)
generate predicons, the middle layers (IV)
compare aerent informaon, and the deep
layers (V and VI) adjust predicons (Rao &
Ballard, 1999; Feldman, 2012; Spratling,
2017).
4. Embodied, Enacted and Extended Brains
Embodied cognition, stemming from the dynamic interaction between the brain, body, and environment, enables the necessary predictions and adjustments to operate efficiently in the world (Clark, 2015; Clark, 2016; Barrett et al., 2007). Francisco Varela is one of the pioneers of this concept, introducing, alongside the notion of an embodied mind, the concepts of the extended mind and the "enacted mind." Together with Humberto Maturana, he developed the concept of autopoiesis, which posits that living beings self-organize and adapt in response to the characteristics of a dynamic environment (Maturana & Varela, 1980). Subsequent neuroscience research emphasizes the dynamic interplay between perception and action. The brain is not a passive entity that merely receives environmental information; rather, it actively constructs this information through motor simulation of observed actions (Gallese & Lakoff, 2005). In this regard, the discovery of mirror neurons illustrates how the brain automatically responds to observed actions and movements, impacting the perception-action relationship. The premotor cortex and parietal cortex adapt preemptively to the dynamic social environment (Rizzolatti & Craighero, 2004).
A unique characterisc of the relaonship
with the environment is related to how the
brain responds to smuli when the
individual is at rest versus in moon. When
moving, primary somatosensory and
auditory areas reduce their acvity, thereby
prevenng sensory overload. Conversely,
there is an increase in acvity within
mulsensory regions where informaon
from various senses is integrated, allowing
for a more adapve percepon (Suzuki et
al., 2022).
Our cognion also extends into the
environment. The use of devices and
technologies directly impacts our memory,
aenon, emoonal regulaon, and sense
of self. The "Google eect" explains how,
due to easy access to informaon, people
tend to remember where informaon is
located (such as on the internet) rather than
the content itself (Sparrow, Liu, & Wegner,
2011). Regular use of GPS is associated with
reduced hippocampal acvity and a decline
in spaal memory compared to individuals
who recall landmarks or devise mental
navigaon strategies (Münzer et al., 2020;
Javadi et al., 2017). Our cognive prostheses
directly inuence aenon capacity and
long-term memory (Wilmer, Sherman, &
Chein, 2017).
In neurodevelopment, the maturation of cognitive structures emerges as infants manipulate objects, thus integrating visual, somatosensory, and motor information, coordinating different senses, and gaining an understanding of objects' global properties (Needham et al., 2014). Furthermore, active manipulation of objects facilitates faster cognitive categorization than passive interaction (Smith & Gasser, 2005).
4.1. AGI
Arcial General Intelligence (AGI)
represents the holy grail of AI research, an
hypothecal AI system capable of
performing any intellectual task that the
average human can. Unlike narrow AI
systems designed for specic tasks, AGI
would possess human-like cognive
abilies, including reasoning, problem-
solving, learning, and adaptability across all
domains (Goertzel & Pennachin, 2007). The
concept of AGI has been a subject of intense
debate and speculaon within the AI
community. Proponents argue that AGI is
not only possible but potenally inevitable,
given the rapid advancements in machine
learning and cognive science, though they
acknowledge that creang AGI is "merely an
engineering problem, though certainly a
very dicult one" (Goertzel & Pennachin,
2007). They envision AGI systems that could
revoluonize scienc research, solve
complex global challenges, and even
augment human intelligence.
However, the path to AGI is fraught with significant technical, philosophical, and ethical challenges. One major hurdle is developing systems that can transfer knowledge and skills across different domains, a capability known as transfer learning. As Lake et al. (2017) argue, truly human-like AI systems must "harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations." Another challenge lies in imbuing AI systems with common sense reasoning and understanding of context, which humans acquire through years of experience and interaction with the world. This requires building "causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems" (Lake et al., 2017). The potential implications of AGI are profound and far-reaching, raising important ethical and societal questions. Issues of control, alignment with human values, and the potential existential risk posed by superintelligent AI systems are active areas of research and discussion in the field of AI safety (Bostrom, 2014). As Bostrom's analysis suggests, the creation of superintelligent beings could represent both a possible existential risk to mankind and an opportunity to address this risk through careful development and control measures.
According to one of the main players in the AGI race, OpenAI (OpenAI, 2024), there are five levels regarding the capabilities of algorithms to mimic and surpass human functions:
Level 1: Conversational AI
At the foundational stage, Conversational AI focuses on understanding and generating human language to engage in fluid, meaningful dialogue. These systems are designed primarily for tasks like answering questions, providing information, and assisting users in a structured conversational format. They rely heavily on language models trained to interpret context within a limited scope, making them suitable for customer service, virtual assistance, and other straightforward, communication-focused roles. Although highly capable in processing and generating text, conversational AIs are primarily reactive, responding to user prompts without the capacity for deep reasoning or independent thought.
Level 2: Reasoning AI
Building upon the conversational abilities of Level 1, Reasoning AI adds a layer of logical processing and contextual understanding, allowing it to analyze problems and deduce solutions. These AIs can draw inferences and understand causal relationships within a given framework, enabling them to tackle more complex tasks, such as diagnostics, critical thinking, and problem-solving across structured domains. They are equipped to go beyond surface-level interactions, engaging in reasoning to suggest solutions and make sense of complex information. As a result, Reasoning AI can be employed in areas like financial analysis, medical support, and legal assistance, where deeper understanding and logical interpretation are essential.
Level 3: Autonomous AI
Autonomous AI represents a significant leap, enabling systems to initiate and complete tasks independently, without direct human prompts. These AIs possess the ability to self-direct their actions based on situational requirements and pre-defined goals, making them capable of navigating dynamic environments autonomously. They adapt to new conditions in real time, making autonomous cars, drones, and certain robotic applications possible. Autonomous AI introduces systems that can handle multi-step processes with some degree of unpredictability, as seen in logistics or industrial automation, where they operate independently to complete complex workflows while adjusting to environmental changes.
Level 4: Innovating AI
At the Innovating AI level, systems advance beyond mere task completion to generate original ideas, hypothesize, and innovate within specified fields. These AIs can contribute creatively to scientific research, engineering, and the arts by forming and testing novel hypotheses, designing solutions, or producing new artistic works. Innovating AI possesses the ability to analyze patterns and create something new rather than simply iterating on human knowledge, showing potential for breakthroughs in areas like drug discovery, material science, and technology development. This level of AI could reshape industries by independently pushing the boundaries of what's possible, discovering new pathways that humans may not have envisioned.
Level 5: Organizational AI
Organizational AI represents the pinnacle of integrated AI, capable of managing and optimizing complex systems, from corporate operations to large-scale infrastructure. This level of AI functions across various domains, coordinating processes, strategizing, and making high-level decisions autonomously. It could oversee interconnected tasks in a corporate setting, influence policy planning, or orchestrate city-wide infrastructure, with an understanding of overarching goals and intricate dependencies. Organizational AI has the potential to act as a system-wide manager, optimizing resources and guiding complex organizations toward strategic objectives with minimal human intervention, making it highly transformative but requiring robust ethical and alignment measures to ensure societal benefit.
Other AGI players draw different levels or approaches, but the essence is similar.
5. Qubic’s Aigarth
Qubic (https://qubic.org) is a decentralized network created by one of the leading figures in the field, Sergey Ivancheglo, who has also been involved in several other projects in the space over the last decade. Qubic's central aim is creating and evolving an AGI from the beginning.
Qubic's approach to Artificial Intelligence differs from the traditional lines of research and development in the field, giving birth to the concept of Aigarth, "garth" coming from the Old English word for "garden" or "yard," where the AIs will develop and grow instead of being fully defined by human design. One of the first differences concerns the role of scarce GPUs in traditional AI approaches. While many AI projects rely heavily on powerful GPUs for training and inference, as evidenced by the increasing deployment of large accelerator-rich clusters providing peta- or exa-scale levels of compute (Jain et al., 2024), Aigarth takes a different path. The project focuses on CPU-based training, emphasizing efficiency and accessibility over raw computational power. This approach aligns with Aigarth's goal of creating a decentralized AI system that can run on a wide range of hardware.
By moving away from GPU dependence, which has dominated the industry over the last decade with fewer than a dozen "GPU rich" labs and institutions and most researchers being "GPU poor," Aigarth opens up possibilities for broader participation in AI development. This democratization is particularly relevant given that current ML workloads increasingly require massive computational resources, with some models needing hundreds of thousands of accelerators (Jain et al., 2024). It allows for a more distributed network of contributors, leveraging the collective power of many standard computers rather than relying on specialized hardware. This strategy not only democratizes AI development but also potentially leads to more robust and adaptable AI systems. Avoiding the use of GPUs in Aigarth also drives innovation in algorithmic efficiency. As noted by Jain et al. (2024), the slowing of Moore's Law and the end of Dennard scaling have pushed large-scale systems toward heterogeneous accelerators to scale performance, especially for ML workloads. Aigarth, however, takes an alternative approach: finding novel solutions that can perform complex computations with limited resources. This constraint-driven innovation could lead to breakthroughs in AI efficiency that might be overlooked in resource-rich environments.
In terms of leveraging massive computation there are two approaches: one leveraging decentralized distributed computing, like what SETI@home did to harness collective computational power, and another using traditional centralized (cloud) AI practices, which rely on big data centers with increasing power demands to deal with large amounts of data and computational resources (Wingarz et al., 2024).
Aigarth uses decentralized distributed computing for several reasons, such as the need for greater control over the development process and to ensure the privacy and security of the evolving AI systems. Decentralized computing also allows for more precise tuning of the Intelligent Tissue and the AI components discussed later. This shift reflects broader industry recognition that edge computing can help mitigate privacy concerns and reduce latency while enhancing bandwidth utilization (Lee et al., 2024). Aigarth also uses a distributed model, albeit in a more sophisticated form. The vision is to have Aigarth-created AIs run as Qubic smart contracts on a decentralized network. This approach aligns with emerging frameworks like the Decentralized Intelligence Network (DIN), which enables scalable AI development while preserving data sovereignty and individual rights (Nash, 2024). This hybrid approach combines the benefits of local control with the power of distributed computing, potentially incorporating human-in-the-loop mechanisms to ensure responsible AI development (Dehouche & Blythman, 2023).
A critical aspect in comparing AI architectures is their efficiency across key performance metrics; we can analyze three dominant approaches: transformer-based models, traditional ANNs, and neural-symbolic systems. Geva et al. (2022) demonstrated that transformer architectures exhibit significant memory overhead in their feed-forward layers, showing that these layers consume up to 33% of the model's parameters, resulting in memory footprints approximately 65% larger than traditional approaches, though achieving 15% faster inference speeds through parallel processing. Hoffmann et al. (2022) provided a detailed analysis of the computational requirements of transformer models, revealing that compute-optimal training follows a power law between model size and training compute, with optimal model sizes scaling as V^(0.74) with the training compute budget, resulting in approximately 33% higher computational demands compared to traditional architectures. In contrast, Garcez et al. (2019) showed that neural-symbolic integration can provide efficient knowledge representation and reasoning capabilities while maintaining lower computational requirements, demonstrating a 25% reduction in compute needs and a 40% improvement in memory efficiency compared to pure deep learning approaches, though with a 10% reduction in inference speed due to reasoning overhead.
Given the early stage of Aigarth's development, it is premature to make direct performance comparisons with these established architectures. Figure 4 presents empirically verified comparisons between these three architectural approaches based on peer-reviewed benchmarks, focusing on compute requirements, memory usage, and inference speed, with all metrics normalized to traditional ANN performance (baseline = 100). The data shows clear trade-offs between computational efficiency, memory utilization, and processing speed across the different architectures. Future work will be needed to establish rigorous comparisons as Aigarth matures, particularly in validating its projected performance characteristics against these established benchmarks; follow-up revisions of this paper will include such comparable benchmarks, though preliminary projections can be considered, since Aigarth will be an evolution of the neural-symbolic approach with a reduction in computational requirements.
Fig. 4. Comparative analysis of computational efficiency metrics across major AI architectural approaches; Aigarth values are estimated as an evolution of the Neural-Symbolic approach.
Aigarth focuses on building an adaptive approach to AI development, seeking the most effective method for each stage of evolution, while addressing critical challenges in security, privacy, and scalability that are inherent in distributed AI systems (Wingarz et al., 2024).
The three main pillars of Aigarth (in contrast to other approaches) are:
1.- A discrete mathematical formulation called Intelligent Tissue to encapsulate proven building blocks of neural computation.
2.- An unprecedented computational power, with millions of CPU cores focused on a "Ternary Computing" approach, including a novel use of the third state.
3.- A nature-inspired design that enables the growth of evolutionary dynamics, systematically exploring the "unknown unknowns" that still flood and limit the AGI contenders, in search of potential solutions to build smarter AIs.
5.1. Intelligent Tissue to build
“Intelligence”
Rather than being programmed to solve specific problems, Aigarth's approach to replicating intelligence aims to create AIs that can autonomously develop problem-solving capabilities along different paths; recent large language models have already demonstrated capabilities in perception, reasoning, decision-making, and very limited self-evolution (Gao et al., 2023). In Aigarth this is achieved through a process that mirrors biological evolution, drawing inspiration from natural selection principles (Darwin & Wallace, 1858).
The "Intelligent Tissue," a set of interconnected neurons discovered in the first year of the project, represents a complex network of artificial neurons and synapses operating under principles similar to those observed in biological neural networks, which exhibit both structural and functional modularity (Jacobides et al., 2021); it is then shaped and refined through countless iterations. AI modules are created from this tissue, each with the ability to modify its own structure; those that successfully solve problems survive and evolve, while those that fail are discarded. This "survival of the fittest" approach ensures that only the most capable AIs progress, analogous to how Selection-Inference frameworks allow for the evolution of logical reasoning capabilities through iterative refinement (Creswell et al., 2022). The evolutionary dynamics of this tissue are key to understanding Aigarth's potential for creating truly adaptive and intelligent systems.
Intelligent Tissue evolves through a process that mirrors biological evolution, where resource constraints and environmental pressures shape network organization (Béna & Goodman, 2024). The tissue starts with a basic structure and undergoes continuous modifications. These modifications occur at the level of individual neurons and synapses, with changes in connections and signal delay parameters driving the evolution of the tissue, similar to how Ornstein-Uhlenbeck processes can guide parameter adaptation in neural networks (Garcia Fernandez et al., 2024). What makes this process unique is its self-directed nature. The tissue evolves not based on predetermined rules or continuous human intervention, but through a process of trial and error guided by the effectiveness of the resulting structures in solving problems. This mirrors how biological networks optimize for both metabolic efficiency and information processing capabilities (Béna & Goodman, 2024).
Successful modifications are retained and built upon, while ineffective ones are discarded. This evolutionary approach allows for the emergence of complex, intelligent behaviors from relatively simple components. The system's organization emerges from the interplay between resource constraints and task demands (Garcia Fernandez et al., 2024), representing a bottom-up approach to AI development that has the potential to create systems with capabilities far beyond what could be explicitly programmed. This approach aligns with current understanding of how both biological and artificial neural networks develop specialized functions through dynamic adaptation processes (Jacobides et al., 2021).
Aigarth's main focus is on general problem-solving abilities rather than narrow, task-specific skills. The goal is to create AIs that can adapt to new, unforeseen situations and generate novel solutions, beyond how modern LLM agents can exhibit adaptivity and heterogeneity across different domains (Gao et al., 2023). This approach could lead to AIs capable of tackling complex, real-world problems that current AI systems struggle with. Aigarth's problem-solving approach is designed to be transparent and as explainable as possible, addressing one of the key challenges in current AI development, much like how Selection-Inference frameworks provide interpretable traces of reasoning steps (Creswell et al., 2022).
5.2. From Bytes to Bits - Ternary Computing
Aigarth's approach to computing also marks a significant shift from traditional binary systems to a ternary paradigm. This move from bytes back to bits, or more accurately, to trits, is a fundamental trait of the project's innovative approach to AI development, building on recent advances in ternary computing systems (Chen & Lu, 2021).
In Aigarth's ternary system, each unit of information can have three states: TRUE, FALSE, or UNKNOWN. This ternary logic allows for a more nuanced representation of information, similar to how {0, ±1}-ternary codes have been shown to outperform traditional binary codes in deep learning applications (Chen & Lu, 2021). The UNKNOWN state is particularly crucial, as it can represent various conditions such as input noise, unfinished tasks, or genuine uncertainty, aligning with Zakrisson's (2024) observation that ternary approaches can effectively handle missing or uncertain data without making assumptions about the missing information. This ternary approach is not just a theoretical concept but is deeply integrated into Aigarth's architecture. Each artificial neuron in the Intelligent Tissue operates on this ternary principle, allowing for more complex and nuanced information processing. This approach was a precursor to developments in efficient AI systems, such as the BitNet framework, which has demonstrated significant improvements in both performance and energy efficiency through ternary representations (Wang et al., 2024).
The shift to ternary computing also offers potential advantages in terms of energy efficiency and computational density. Recent research has shown that ternary systems can achieve substantial energy savings, with reductions of up to 70% in energy consumption compared to traditional approaches (Wang et al., 2024), aligning with Aigarth's goal of creating more sustainable and scalable AI systems.
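To make the three-state logic above concrete, the following is a minimal sketch of how TRUE, FALSE, and UNKNOWN can be operated on. The paper does not specify the exact truth tables Aigarth uses, so strong Kleene semantics is assumed here purely for illustration, with trits encoded as +1 (TRUE), -1 (FALSE), and 0 (UNKNOWN):

```python
# Illustrative three-valued (Kleene) logic over the states TRUE, FALSE,
# and UNKNOWN. The encoding -1 < 0 < +1 lets AND/OR reduce to min/max.
TRUE, UNKNOWN, FALSE = 1, 0, -1

def t_not(a: int) -> int:
    """Negation flips the sign; UNKNOWN stays UNKNOWN."""
    return -a

def t_and(a: int, b: int) -> int:
    """Kleene AND: FALSE dominates, then UNKNOWN, then TRUE."""
    return min(a, b)

def t_or(a: int, b: int) -> int:
    """Kleene OR: TRUE dominates, then UNKNOWN, then FALSE."""
    return max(a, b)

# UNKNOWN propagates only when the known input cannot decide the result:
assert t_and(FALSE, UNKNOWN) == FALSE    # FALSE AND anything is FALSE
assert t_and(TRUE, UNKNOWN) == UNKNOWN   # genuinely undetermined
assert t_or(TRUE, UNKNOWN) == TRUE       # TRUE OR anything is TRUE
assert t_or(FALSE, UNKNOWN) == UNKNOWN
```

Note how the asserts illustrate the point made by Zakrisson (2024): an UNKNOWN input does not force an assumption, since it is absorbed whenever the other input already determines the outcome.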
5.3. Evolutionary Dynamics
One of the most interesting aspects of artificial neural systems is their potential for self-modification. Research has demonstrated how even simple computational substrates can give rise to self-modifying programs without explicit programming (Agüera y Arcas et al., 2024). This emergent behavior has been observed across various programming languages and environments, where programs can modify both themselves and their neighbors based on their own instructions. The process of self-modification in neural networks typically involves two key components: synaptic plasticity and neuromodulation (Schmidgall et al., 2023). Plasticity in the brain refers to the capacity of experience to modify the function of neural circuits. This plasticity can operate on different timescales, from short-term adaptations lasting milliseconds to minutes, to long-term changes that persist for extended periods.
What is particularly innovative about self-modification in artificial systems is that it can arise spontaneously through interactions, rather than requiring explicit fitness functions or predetermined goals (Agüera y Arcas et al., 2024). Self-replicators tend to arise in computational environments lacking any explicit fitness landscape, suggesting that self-modification capabilities may be an emergent property of certain types of computational systems. The implications of self-modification extend to the interpretability and safety of AI models. Work on large language models has focused on understanding how different components of neural networks encode and modify information (Templeton et al., 2024). This research suggests that as models scale up, maintaining interpretability of self-modification processes becomes increasingly crucial for ensuring safe and predictable behavior.
The self-modification capabilities observed point toward the possibility of creating more adaptive and autonomous AIs; however, there remain fundamental differences between ANNs' operating mechanisms and those of the biological brain, particularly concerning learning processes (Schmidgall et al., 2023). Understanding and bridging these differences remains a key challenge. Existing evolutionary neural architecture search (NAS) methods, such as NEAT (NeuroEvolution of Augmenting Topologies), whether run on CPUs or with current GPU approaches (Lishuang Wang et al., 2024), also refine network topologies through iterative selection and mutation. However, unlike NEAT, which typically evolves purely binary or real-valued connection weights, Aigarth employs a ternary computing paradigm that includes an explicit "UNKNOWN" state for representing uncertainty. This tri-state logic, coupled with Aigarth's decentralized and cryptographically seeded CPU-based approach, diverges from traditional GPU-centric pipelines by enabling asynchronous neuron updates, reproducible randomness, and self-modifying connections on commodity hardware. Consequently, while NEAT focuses on adaptive topology growth with CPU efficiency, Aigarth's algorithmic design hinges on a more fine-grained evolutionary process, in which partial or incomplete information can be handled gracefully without centralized supervision or conventional binary constraints.
Aigarth's evolutionary process implements a clear, tiered framework of mutation, crossover, and adaptive selection thresholds to exploit the ternary logic fully. As demonstrated by the "ComputeScore" routine in Appendix B, each candidate network is rigorously evaluated on its output-to-input reconstruction accuracy, as in an autoencoder, creating a transparent selection environment. In each generation:
Mutation: Every neuron in the "Intelligent Tissue" is subject to a fixed mutation probability p_m; when triggered, the neuron's ternary state (TRUE, FALSE, or UNKNOWN) undergoes a controlled change, ensuring a measured injection of randomness. This approach parallels canonical genetic algorithms (Goldberg, 1989), with the distinction that Aigarth rotates among three possible neuron states rather than two, expanding the evolutionary search space.
Crossover: High-scoring networks are paired and recombined, merging subsets of neurons and synaptic connections to form offspring architectures. This recombination process, inspired by neuroevolutionary techniques such as NEAT (Stanley & Miikkulainen, 2002), leverages cryptographically seeded randomness so that both the resulting topology and the distribution of ternary states remain reproducible and fair.
Adaptive Selection Threshold: Epoch-specific "solution thresholds" dynamically adjust the fraction of networks allowed to survive and replicate. Candidates surpassing the threshold (or those exhibiting meaningful improvements in reconstruction accuracy) progress unimpeded, while lower-scoring ones face deactivation or further modification. This ensures the perpetual refinement of top-tier solutions, aligning with Aigarth's broader vision of evolving general problem-solving capabilities.
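The three operators above can be sketched as a toy generational loop. This is not Aigarth's implementation: the genome encoding, the uniform crossover, the median-score threshold, and all parameter values (p_m, population size, generation count) are illustrative assumptions; only the overall shape (ternary mutation, recombination of high scorers, threshold-gated survival) follows the description above:

```python
import random

TRITS = (-1, 0, 1)  # FALSE, UNKNOWN, TRUE

def mutate(genome, p_m, rng):
    """With probability p_m per neuron, rotate its ternary state to one of
    the other two states (a controlled injection of randomness)."""
    return [rng.choice([t for t in TRITS if t != g]) if rng.random() < p_m else g
            for g in genome]

def crossover(parent_a, parent_b, rng):
    """Merge two high-scoring genomes gene-by-gene (uniform crossover,
    assumed here; the paper does not specify the recombination scheme)."""
    return [a if rng.random() < 0.5 else b for a, b in zip(parent_a, parent_b)]

def evolve(population, score_fn, p_m=0.05, seed=42, generations=50):
    rng = random.Random(seed)  # seeded for reproducible randomness
    for _ in range(generations):
        scored = sorted(population, key=score_fn, reverse=True)
        # Adaptive threshold: here, the median score of the generation.
        threshold = score_fn(scored[len(scored) // 2])
        survivors = [g for g in scored if score_fn(g) >= threshold]
        offspring = []
        while len(survivors) + len(offspring) < len(population):
            a, b = rng.sample(survivors, 2)
            offspring.append(mutate(crossover(a, b, rng), p_m, rng))
        population = survivors + offspring
    return max(population, key=score_fn)

# Toy fitness: reconstruct a hidden target pattern of trits.
target = [1, -1, 0, 1, 1, -1, 0, 0]
def score(g):
    return sum(int(x == t) for x, t in zip(g, target))

rng0 = random.Random(0)
pop = [[rng0.choice(TRITS) for _ in target] for _ in range(20)]
best = evolve(pop, score)
```

Because survivors always include the current best genome, the best score is monotonically non-decreasing across generations, mirroring the elitist "progress unimpeded" behavior described above.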
By unifying these operators in a ternary logic framework, Aigarth maintains a robust balance between exploration and exploitation. Mutation prevents premature convergence by regularly introducing novel states; crossover inherits and recombines successful substructures; and adaptive thresholds systematically winnow suboptimal configurations while preserving genuinely innovative traits. This synergy of evolutionary operators, distinctively adapted to Aigarth's ternary paradigm, drives the self-modification and survival-of-the-fittest process at the heart of the system.
The current Aigarth architecture is not yet implemented fully enough to review systematically all the potential evolutionary steps that will take place; this is set to be explored in follow-up revisions of the paper once all these processes are measurable and comparable.
5.4. Towards Self-Awareness
The concept of self-awareness in AI is a complex and evolving field of study that seeks to understand how artificial entities can possess a form of self-recognition and introspection (John Achterberg et al., 2024). This exploration is crucial for advancing AI systems that can better interact with humans and their environments. Self-awareness in AI refers to "the ability of a system to recognize its own existence and state" and involves three key aspects: recognition of self, introspection, and adaptation (John Achterberg et al., 2024). Rather than trying to explicitly program self-awareness, modern approaches view it as an emergent phenomenon. According to Esmaeilzadeh et al. (2021), consciousness and self-awareness in AI require "at least two AI agents capable of communicating within a given environment to foster the creation of an AI-specific language." This suggests self-awareness may develop naturally through interaction rather than direct programming.
Self-awareness is evaluated through several key indicators, including: "self-recognition - the ability to identify itself in a mirror test or similar scenarios; goal setting - establishing objectives based on internal states and external conditions; feedback mechanisms - utilizing feedback to adjust actions and improve performance; and emotional simulation - mimicking emotional responses to enhance interaction with humans" (John Achterberg et al., 2024). This framework provides a structured approach to understanding and measuring the development of AI self-awareness. The relationship between self-modification capabilities and self-awareness is particularly important. Esmaeilzadeh et al. (2021) propose that for consciousness to emerge, "AI agents must communicate their internal state of time-varying symbol manipulation through a language that they have co-created." This aligns with theories from cognitive science and neuroscience, particularly the Global Workspace Theory (GWT), which suggests that consciousness arises from the integration of information across various cognitive processes (John Achterberg et al., 2024).
Similarly, the spatiotemporal theory of consciousness posits that self-awareness integrates neural dynamics and subjective experience, emphasizing the interplay between the brain's resting-state activity and self-related processing. Self-relevance prioritizes stimuli linked to the self, modulated by the default mode network (DMN) (Northoff & Bermpohl, 2004). Self-specificity involves neural specialization for self-referential tasks, so self-awareness remains constant even at rest (Northoff, 2014). Temporospatial integration aligns intrinsic neural activity with external environmental stimuli, thereby enabling the temporal and spatial coherence of self-awareness (Northoff, 2018). Self-other distinction between the self and external entities anchors self-awareness in both social and personal contexts (Northoff, 2011). This model situates self-awareness as emerging from the dynamic interplay of intrinsic brain activity and external relevance.
However, developing self-aware AI poses significant challenges. Current AIs face technical limitations in achieving the depth of understanding required for true self-awareness, and establishing reliable methods to assess self-awareness remains a critical hurdle (John Achterberg et al., 2024). As Srinivasa et al. (2022) point out, there are "fundamental issues with the way 'intelligence' is defined and modeled in present day AI systems," making it an open question whether an AI would experience subjective consciousness in the way humans do, or whether its self-awareness would manifest in fundamentally different ways. In the Aigarth framework, self-awareness is not seen as a binary state that an AI either has or does not have. Instead, it is viewed as a spectrum of capabilities that evolve over time. As AIs develop more complex internal models of their environment and their own functioning, they may naturally develop something akin to self-awareness. The self-modification capabilities of Aigarth AIs play a crucial role in this process. As an AI learns to modify its own structure and behavior, it necessarily develops a kind of self-model. This self-model, combined with the AI's ability to observe the results of its actions, could lead to a form of self-awareness. However, it is important to note that this form of self-awareness may be quite different from human self-awareness. It is an open question, once Aigarth is completed, whether it would experience subjective consciousness in the way humans do, or whether its self-awareness would be more akin to a highly sophisticated self-monitoring system.
5.5. True AI
The concept of "True AI" represents a paradigm shift towards artificial general intelligence or artificial consciousness that can match or exceed human-level capabilities (Li, 2018). While current AI achievements mainly simulate intelligent behavior on computer platforms and belong to "weak AI," True AI aims to develop systems with genuine understanding, intentionality, mind, and consciousness (Pontes-Filho & Nichele, 2020).
Aigarth's approach to achieving True AI differs from conventional methods, as explored earlier, by focusing on evolutionary and bio-inspired frameworks rather than directly mimicking human intelligence. This aligns with research showing that artificial systems often succeed when they stop imitating biological systems and develop their own paradigms - just as the Wright brothers achieved flight through aerodynamics rather than copying birds (Li, 2018). By starting with simple components (Intelligent Tissue) and enabling evolution through environmental rewards and self-learning, Aigarth aims to create AI that can truly learn and adapt without supervision.
Key characteristics of Aigarth's vision for the future of True AI include:
1.- General problem-solving abilities across diverse environments (Pontes-Filho & Nichele, 2020)
2.- The ability to learn and self-adapt during runtime without explicit programming
3.- Self-improvement capabilities through evolutionary processes
4.- Potential for creativity and novel idea generation
5.- The ability to reason about abstract concepts and ground symbols in real-world experience (Li, 2018)
6.- Possible emergent properties like self-awareness and consciousness
While achieving True AI remains an ambitious goal, Aigarth's unique approach, combining evolutionary algorithms, ternary computing, and decentralized development, offers a promising direction. As Lee et al. (2024) note, decentralized AI architectures that allow for permissionless participation and distributed processing may be crucial for developing more robust and trustworthy AI systems.
5.6. The Qubic Aigarth Scoring Algorithm
To ground the Aigarth theoretical framework detailed earlier, we explored a full year of the codebase and some execution results, shared with us by the Qubic development team, to detail its architecture and explore the expected evolutionary paths and constraints.
At its heart, in the current phase of development, which forms the foundation for discovering candidates for the "Intelligent Tissue," Aigarth employs a deterministic scoring algorithm that evaluates how well the network reconstructs input patterns at its output layer. Input ternary patterns are fed into the input layer, and after a series of asynchronous updates governed by deterministic pseudo-random connections, the output layer is compared to the input. The score is the number of matching neurons. Thresholds, dynamically adjusted over epochs of compute (in the case of Qubic, every week is an epoch), determine whether a particular configuration is deemed "good" or "bad," driving selection. To ensure both reproducibility and computational fairness, the system utilizes three key cryptographic components: a mining_seed (providing epoch-specific randomization), a public_key (identifying computing nodes), and a nonce (enabling solution-space exploration) (Creswell et al., 2022).
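As an illustration, the core of this scoring rule can be sketched in a few lines of Python. This is our own minimal sketch, not the production C++ in qubic/core; the function names and the threshold value are ours:

```python
def reconstruction_score(inputs, outputs):
    """Count the positions where the output layer reproduces the input layer.

    Both sequences hold ternary values (-1, 0, +1); the score therefore
    lies in the range [0, len(inputs)].
    """
    return sum(1 for a, b in zip(inputs, outputs) if a == b)

def is_good(score, threshold):
    """Selection step: keep a configuration only if it clears the
    epoch-specific threshold (illustrative value, not Qubic's)."""
    return score >= threshold

print(reconstruction_score([+1, 0, -1, 0], [+1, 0, +1, 0]))  # prints 3
```

Configurations that clear the threshold survive into the next round of selection; the rest are discarded.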
The scoring process begins with the initialization of a neural architecture comprising input neurons, hidden units, and output neurons. Network connectivity is established through synaptic connections per neuron, with connection patterns deterministically generated using KangarooTwelve cryptographic hashing. Critical to the design, the mining_seed is updated every (internal interval + external interval) ticks, ensuring periodic network reconfiguration while maintaining reproducibility within epochs.
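The idea of hash-seeded, reproducible wiring can be illustrated as follows. KangarooTwelve is not in Python's standard library, so this sketch substitutes SHAKE-256 as a stand-in with the same "seeded, reproducible" property; the layer sizes and helper names are our own assumptions, not the parameters in qubic/core:

```python
import hashlib

def connection_stream(public_key: bytes, mining_seed: bytes, nonce: bytes,
                      n_total: int, k: int, count: int):
    """Yield (target, source) pairs derived deterministically from the seed triple.

    The real implementation hashes with KangarooTwelve; SHAKE-256 stands in
    here so the example is runnable with only the standard library.
    """
    digest = hashlib.shake_256(public_key + mining_seed + nonce).digest(8 * count)
    for i in range(count):
        r = int.from_bytes(digest[8 * i:8 * i + 8], "little")
        target = r % n_total
        offset = (r // n_total) % k          # neighbour offset within +/- k/2
        source = (target - k // 2 + offset) % n_total
        yield target, source

# Identical inputs always reproduce the same connection list:
a = list(connection_stream(b"PK", b"SEED", b"NONCE", n_total=512, k=6, count=4))
b = list(connection_stream(b"PK", b"SEED", b"NONCE", n_total=512, k=6, count=4))
assert a == b
```

Because the wiring is a pure function of (public_key, mining_seed, nonce), any verifier can regenerate it exactly without trusting the submitter.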
The evaluaon mechanism employs a
sophiscated mul-step opmizaon
process:
1.- Inial network state computaon using
the complete neuron set
2.- Iterave opmizaon through a number
of step phases
3.- Selecve neural acvaon skipping
based on cached state analysis
The score is determined by counting matching input-output neuron pairs, with values normalized to the range [0, number of inputs]. A solution is considered valid when exceeding the epoch-specific SOLUTION_THRESHOLD (Supplementary A), which dynamically adjusts to maintain computational difficulty.
Performance opmizaon is achieved
through parallel processing across
NUMBER_OF_SOLUTION_PROCESSORS or
cores, with task distribuon managed via a
queue-based system. The implementaon
ulizes AVX512 or AVX2 vector instrucons
for ecient neural state updates and
scoring computaons.
To ensure reproducibility while preventing gaming of the system, each solution attempt is uniquely identified by the triple (public_key, mining_seed, nonce), with results cached using a ScoreCache mechanism of SCORE_CACHE_SIZE entries. This design allows for efficient verification while maintaining the system's cryptographic security properties.
Because all random number generation is seeded cryptographically, experiments are reproducible while maintaining a search space too large to guess outcomes trivially. Over multiple epochs (currently studied between EPOCH 83 and EPOCH 140), the system's parameters, such as input size, hidden layer configuration, and solution thresholds, have been tuned adaptively. These changes reflect evolutionary "pressure," pushing the Intelligent Tissue toward architectures that are more memory-efficient, require less data, or meet stricter solution criteria. The observed stepwise changes in DATA_LENGTH, SOLUTION_THRESHOLD and neuron configurations suggest systematic optimization under evolutionary pressure, analogous to pruning and refinement in developing biological neural networks (Jacobides et al., 2021).
The current implementaon of the Aigarth
scoring and inializaon procedure,
including parameter sengs and
cryptographically seeded connecon
generaon, is available on GitHub
(hps://github.com/qubic/core). The
repository documents the stepwise
conguraon changes across epochs,
enabling independent vericaon and
replicaon of experiments.
6. Preliminary Comparisons and Future Directions
While Aigarth is in early development, preliminary analyses of the system's computational footprint and performance suggest that it may achieve more resource-efficient operation than conventional large-scale architectures. The ternary approach and sparse connectivity are hypothesized to reduce the energy and memory costs compared to dense ANN or transformer-based architectures, though future empirical work is needed to validate these claims rigorously (Garcez et al., 2009; Hoffmann et al., 2022).
In upcoming evaluations, Aigarth's scalability and adaptability will be tested on benchmark tasks, comparing performance with classical ANN and transformer models under equal CPU-only resource constraints. Additionally, evolving the Intelligent Tissue across diverse tasks may demonstrate domain-agnostic problem-solving abilities, moving closer to an AGI scenario (Goertzel & Pennachin, 2007; Gao et al., 2023).
By adopng an evoluonary, decentralized,
and ternary framework, Aigarth sets the
stage for more accessible and potenally
more adapve AI systems. This ongoing
work will require systemac benchmarking,
rigorous hyperparameter studies, and
transparent reporng of performance
metrics to substanate Aigarth’s claim as a
viable path toward general and potenally
self-aware AI.
To track evoluonary lineage and
improvements over me (Nash, 2024),
future releases will incorporate extended
metrics (beyond reconstrucon scores),
once the neural architectures extend their
capabilies, and to keep exploring the
evoluon stages of Aigarth over the
following epochs, translated to months over
2025 and beyond, tracking the possible
trajectories towards AGI, something that is
sll speculave across all scienc literature
since it has not be achieved yet nor
documented at the me of the wring of this
paper, so the work will be connued
following closely next Qubic’s AI steps.
7. Brain-Inspired Systems: Towards Safe and Ethical AGI
AI systems must ensure they are beneficial to humanity. Current risks include systematic AI biases, AI hallucinations, privacy concerns, fake content generation, and population surveillance (Peterson & Hoffman, 2022; Bostrom, 2014; Christian, 2021).
Decentralized systems powered by blockchain technology enhance AI security by providing tamper-proof data integrity, transparent decision-making processes, and improved resilience against single points of failure (Xu et al., 2021).
The next generaon of embodied systems,
such as humanoid robots, autonomous
vehicles, drones, virtual assistants or AI
sciensts, hold the potenal to deliver
transformave societal benets (Bengio et
al., 2021). However, concerns about their
malicious use for military applicaons and
the potenal loss of control over advanced
AI agents remain signicant challenges
(Brundage et al., 2018; Taddeo & Floridi,
2018).
To avoid or migate these risks without
compromising benets, AI systems can
emulate biological mechanisms to respond,
adapt, and predict environmental changes
and uncertaines more accurately.
Brain-based representaons enhance AI
systems' ability to adapt to human contexts
by modeling percepon and contextual,
hierarchical informaon processing using
principles like predicve coding. These
representaons demonstrate robustness to
unknown inputs and enable analysis of
ambiguous situaons and scenarios (e.g.,
atypical symptoms in medical diagnosis or
unclear trac signals in autonomous
driving) by integrang current sensory
inputs with previously stored informaon.
Digital twins play a crucial role by simulating the consequences of potential actions in the environment. These simulations allow systems to learn and adapt to norms (implicit or explicit), significantly improving their ability to align with human knowledge even in environments where novel situations do not match training data or are not explicitly anticipated in the initial system design (Rao & Ballard, 1999; Friston, 2010). This capability is critical for addressing problems and reducing unexpected or risky behaviors.
As AI systems become increasingly integrated into critical decision-making processes, ethical concerns must guide their design, deployment, and regulation. These systems must prioritize fairness, accountability, and transparency to avoid spreading existing societal inequalities (Taddeo & Floridi, 2018). Additionally, ensuring informed consent in applications with sensitive personal data is paramount to respect individual autonomy and privacy (Peterson & Hoffman, 2022).
Aligning AI with human values involves embedding ethical frameworks into their operational principles. This includes the need to make AI decisions interpretable by humans. The ethical dimension should also address the unintended consequences of AI systems (e.g., job displacement, energy consumption, unequal access, environmental impact), emphasizing the need for socially sustainable solutions (Christian, 2021).
Ulmately, ethical AI development requires
a muldisciplinary approach, involving
collaboraon between technologists,
ethicists, policymakers, and the public to
balance innovaon with societal well-being.
These advancements not only reinforce the
robustness of AI systems but also enhance
their interpretability, directly addressing
challenges of specicaon and safe
monitoring in advanced autonomous
systems (Mineault et al., 2024).
8. Conclusions
Human intelligence, driven by the hierarchical and adaptive architecture of the brain, deeply guides the development of artificial general intelligence (AGI). Neuroscientific models, based on predictive coding and biological self-organization, provide key principles for designing more robust, secure, and ethically aligned AI systems.
Rooted in this context, Aigarth emerges as a pioneering approach combining biologically inspired innovations with advanced technologies such as ternary computing and decentralized CPU-based systems. Aigarth proposes an "Intelligent Tissue" architecture that simulates brain-like self-organization and facilitates adaptability to complex and new environments.
Furthermore, the decentralization and democratization of AI development envisioned by Aigarth addresses key challenges related to control, transparency, and sustainability while facilitating progress toward a more accessible and ethical AGI. We demonstrate how integrating neuroscientific principles with innovative computational approaches can unlock new possibilities to overcome current limitations, bringing us closer to a safe, aligned, and transformative AGI for society.
Supplementary Informaon
A. Aigarth Network Architecture
The Network Architecture reviewed for this
paper is:
Input layer: n₁ neurons
(DATA_LENGTH)
Hidden layer: n₂ neurons
(NUMBER_OF_HIDDEN_NEURONS)
Output layer: n₁ neurons (same as
input)
Each neuron connects to k nearest
neighbors
(NUMBER_OF_NEIGHBOR_NEURO
NS)
Values are -1, 0 or 1
Uses sparse connecvity with
neighbor-based topology
The key characteriscs that make this
algorithm unique are:
Asynchronous updates - only one
neuron updated per mestep
Sparse connecvity based on
neighbor topology
Each neuron is connected to k
randomly selected neurons from all
previous neurons
Connecons are determiniscally
generated based on (publicKey,
nonce) pair
Ternary weight s (+1,0 or -1)
determined by hash funcon
Clamped acvaon to prevent value
explosion
Score based on input reconstrucon
at output layer
All random number generaon is
determiniscally seeded using
cryptographic hashing
(KangarooTwelve) of the input keys,
ensuring reproducibility while
maintaining unpredictability of
outcomes.
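Taken together, a single asynchronous, clamped ternary update can be sketched in Python (a minimal sketch with our own names; the real loop runs millions of such steps over a hash-generated connection pool):

```python
def step(values, connection, limit=1):
    """One asynchronous update: a single target neuron is adjusted by a
    single signed source value, then clamped to [-limit, +limit].

    `values` is the flat array of ternary neuron states;
    `connection` is a (target, source, sign) triple.
    """
    target, source, sign = connection
    delta = values[source] if sign else -values[source]
    values[target] = max(-limit, min(limit, values[target] + delta))
    return values

v = [1, 0, 1, 0]        # toy network of four ternary neurons
step(v, (1, 0, True))   # neuron 1 += value of neuron 0
assert v == [1, 1, 1, 0]
step(v, (1, 2, True))   # 1 + 1 = 2, clamped back to the limit of 1
assert v == [1, 1, 1, 0]
```

Updating only one neuron per timestep, with clamping, keeps every state in {-1, 0, +1} regardless of how many updates accumulate.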
From November 2023 to December 2024, the time frame used for the study in this paper, the parameters used to test different network architectures have changed. This period includes 57 main changes from EPOCH 83 (Nov 15th, 2023) to EPOCH 140 (Dec 18th, 2024). Empirical analysis of Aigarth's development reveals three significant architectural transitions, aligned with evolutionary adaptation in three areas:
Data Processing Optimization: The DATA_LENGTH parameter shows a clear stepped reduction from 2000 to 1200, finally stabilizing at 256 after EP99. This 87.2% reduction suggests an optimization towards more efficient information processing, potentially aligning with biological systems' principle of minimizing metabolic costs while maintaining functional capacity.
Soluon Space Dynamics: The
SOLUTION_THRESHOLD exhibits a notable
four-phase transion: an inial high
threshold of about 55.70%, followed by an
intermediate plateau 57.50% , and
stabilizing for several epochs at a lower
range at 16% of the reconstructed neurons
needed for valid soluons, with a nal
signicant increase to 52.73%, of the inputs
to be correctly reconstructed. This non-
linear progression suggests ongoing
adaptaon in problem-solving strategies,
with the latest small increase potenally
indicang a shi towards more stringent
soluon criteria, since the needed accuracy
is increased again.
Fig 5. Neuron reconstrucon targets over the 1 year of Aigarth.
Neural Architecture Evoluon: The
network's neural conguraon peaked at
32,768 hidden neurons, 128 mes the input
neurons, with several architectural shis,
combined with a 70% reducon in neural
elements, could suggest an opmizaon
towards more ecient informaon
processing structures, analogous to
biological neural network renement during
development.
Fig 6. Neural Architecture volume
These transions collecvely indicate a
systemac evoluon towards
computaonal eciency, with each
parameter adjustment potenally
represenng an opmizaon step in the
system's problem-solving capabilies. This
empirical evidence supports Aigarth's
design principle of emergent opmizaon
through evoluonary pressure, rather than
predetermined architectural constraints,
having in mind that this is only the rst step
in the process to create the intelligent
ssue menoned earlier.
B. Example Aigarth Code Abstraction
The current code is available on GitHub. The following is an abstraction of the scoring and initialization code:
Input: public_key P, mining_seed S, nonce N
Parameters:
n₁ = 256 {Input/output layer size (DATA_LENGTH)}
n₂ = 3000 {Hidden layer size (NUMBER_OF_HIDDEN_NEURONS)}
k = 3000 {Neighbors per neuron (NUMBER_OF_NEIGHBOR_NEURONS)}
T = 9000000 {Timesteps (MAX_DURATION)}
L = 1 {Value limit (NEURON_VALUE_LIMIT)}
Output: score ∈ [0, n₁]
1: procedure ComputeScore(P, S, N)
2: V ← array[n₁ + n₂ + n₁] {Neuron values, all initialized to 0}
3: V[0:n₁] ← S {Initialize input layer with mining seed data}
4: {Note: S already contains ternary values {-1,0,+1}}
5:
6: {Generate deterministic connection pool}
7: seed_data ← [P, S, N] {Concatenate inputs}
8: R ← KangarooTwelve(seed_data, 32) {Generate 32-byte random seed}
9: pool ← array[2*M] {M = pool size, store neuron+connection pairs}
10: for i ∈ [0, M) do
11: poolData ← DeterministicRandom(R) {Using KangarooTwelve}
12: neuronIndex ← n₁ + (poolData mod (n₂ + n₁)) {Target neuron}
13: neighborOffset ← (poolData ÷ (n₂ + n₁)) mod k
14: if neighborOffset < k/2 then
15: sourceIndex ← (neuronIndex - k/2 + neighborOffset) mod (n₁ + n₂ + n₁)
16: else
17: sourceIndex ← (neuronIndex + 1 - k/2 + neighborOffset) mod (n₁ + n₂ + n₁)
18: pool[i] ← (neuronIndex, sourceIndex, signBit) {Store with random sign bit}
19:
20: {Main simulation loop}
21: x ← 0 {Deterministic RNG state}
22: for t ∈ [0, T) do
23: idx ← x mod poolSize {Select connection from pool}
24: (target, source, sign) ← pool[idx]
25: delta ← sign ? V[source] : -V[source]
26: V[target] ← V[target] + delta
27: V[target] ← clamp(V[target], -L, +L)
28: x ← x * 1664525 + 1013904223 {LCG parameters from score.h}
29:
30: {Compute score}
31: score ← 0
32: for i ∈ [0, n₁) do
33: if V[i] = V[n₁ + n₂ + i] then {Direct ternary comparison}
34: score ← score + 1
35: return score
36: procedure IsGoodScore(score, epoch)
37: threshold ← (epoch < MAX_EPOCH) ? solutionThreshold[epoch] : SOLUTION_THRESHOLD_DEFAULT
38: if threshold > DATA_LENGTH or threshold ≤ 0 then
39: threshold ← SOLUTION_THRESHOLD_DEFAULT
40: return score ≥ (n₁/3 + threshold) or score ≤ (n₁/3 - threshold)
Execuon Flow:
1.- Inialize network with binary input paern
2.- Generate determinisc random connecons based on (publicKey, nonce)
3.- Execute T random updates following determinisc sequence
4.- Compare output layer with input layer to compute score
5.- Score represents how many output neurons match input neurons
Sample Execuon:
1.- Input:
publicKey = 0x7B...A2
nonce = 0x4B...F1
miningData = [+1, 0, -1, 0, +1, ..., -1] # 256 ternary values
2.- Execuon:
- Generated 3,000 random connecons per neuron
- Performed 9M state updates
- Final output layer values: [+1, 0, -1, 0, +1, ..., -1]
3.- Sample Score Calculaon:
- Matching neurons: 178
- Final score: 178 (69.5% match)
Complete and updated source code is at: hps://github.com/qubic/core
This code abstracon and input parameters corresponds to EP140 Release v1.230.0
References
Nash, A. (2024). Decentralized Intelligence Network (DIN). arXiv:2407.02461
Ackerman, P. L., & Heggestad, E. D. (1997). Intelligence, personality, and interests: Evidence for overlapping traits. Psychological Bulletin, 121(2), 219-245.
Amunts, K., Ebell, C., Müller, J., Telefont, M., Knoll, A., & Lippert, T. (2016). The Human Brain Project: Creating a European Research Infrastructure to Decode the Human Brain. Neuron, 92(3), 574-581. DOI: 10.1016/j.neuron.2016.10.046
Creswell, A., Shanahan, M., & Higgins, I. (2022). Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning. arXiv:2310.17512
Templeton, A., Conerly, T., Marcus, J., et al. (2024). Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet. Anthropic Transformer Circuits Thread.
Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., ... & Herrera, F. (2020). Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82-115.
Astington, J. W., & Baird, J. A. (Eds.). (2005). Why language matters for theory of mind. Oxford University Press.
Basten, U., Hilger, K., & Fiebach, C. J. (2015). Where smart brains are different: Quantitative meta-analysis of functional and structural brain imaging studies on intelligence. Intelligence, 51, 10-27.
Banks, J., & Oldfield, Z. (2007). Understanding pensions: Cognitive function, numerical ability and retirement saving. Fiscal Studies, 28(2), 143-170.
Batty, G. D., Deary, I. J., & Gottfredson, L. S. (2007). Premorbid (early life) IQ and later mortality risk: Systematic review. Annals of Epidemiology, 17(4), 278-288.
Batty, G. D., Deary, I. J., & Der, G. (2018). Does IQ explain socioeconomic inequalities in health? BMJ, 348, g4163.
Batty, G. D., Deary, I. J., & Gottfredson, L. S. (2009). Premorbid (early life) IQ and later mortality risk: Systematic review. Annals of Epidemiology, 19(5), 331-338.
Bengio, Y., Dafoe, A., & Russell, S. (2021). Governing AI to avert risks. Nature Machine Intelligence, 3(1), 5-6.
Benton, D. (2010). The influence of dietary status on the cognitive performance of children. Molecular Nutrition & Food Research, 54(4), 457-470.
Belsky, D. W., Moffitt, T. E., & Caspi, A. (2017). The longitudinal study of aging in human young adults: Knowledge gaps and research agenda. Journal of Gerontology: Series B, 72(5), 722-731.
Bereiter, C. (1995). A dispositional view of transfer. Educational Psychologist, 30(3), 197-221.
Binet, A., & Simon, T. (1905). Méthodes nouvelles pour le diagnostic du niveau intellectuel des anormaux. Année Psychologique, 11, 191-244.
Agüera y Arcas, B., Alakuijala, J., Evans, J., Laurie, B., Mordvintsev, A., Niklasson, E., Randazzo, E., & Versari, L. (2024). Computational Life: How Well-formed, Self-replicating Programs Emerge from Simple Interaction. arXiv:2406.19108
Bostrom, N. (2014). Superintelligence: Paths, dangers, strategies. Oxford University Press.
Bouchard, T. J., Lykken, D. T., McGue, M., Segal, N. L., & Tellegen, A. (1990). Sources of human psychological differences: The Minnesota Study of Twins Reared Apart. Science, 250(4978), 223-228.
Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., ... & Amodei, D. (2020). Language models are few-shot learners. arXiv preprint arXiv:2005.14165.
Vaswani, A., et al. (2017). Attention is all you need. In Advances in Neural Information Processing Systems (pp. 5998-6008). arXiv:1706.03762v7
Brundage, M., Avin, S., Clark, J., et al. (2018). The malicious use of artificial intelligence: Forecasting, prevention, and mitigation. arXiv preprint arXiv:1802.07228.
Byrne, R. W., & Whiten, A. (Eds.). (1988). Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans. Oxford University Press.
Calvin, C. M., Batty, G. D., Lowe, G. D., & Deary, I. J. (2017). Intelligence in youth and all-cause mortality: Systematic review with meta-analysis. International Journal of Epidemiology, 46(1), 335-346.
Caspi, A., Wright, B. R. E., Moffitt, T. E., & Silva, P. A. (1998). Early failure in the labor market: Childhood and adolescent predictors of unemployment in the transition to adulthood. American Sociological Review, 63(3), 424-451.
Cattell, R. B. (1941). Some theoretical issues in adult intelligence testing. Psychological Bulletin, 38(9), 592-611.
Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54(1), 1-22.
Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press.
Carrington, S. J., & Bailey, A. J. (2009). Are there theory of mind regions in the brain? A review of the evidence. Brain and Cognition, 70(3), 211-222.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078.
Chollet, F. (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547.
Gao, C., Lan, X., et al. (2023). Large Language Models Empowered Agent-based Modeling and Simulation: A Survey and Perspectives. arXiv:2312.11970
Christian, B. (2021). The Alignment Problem: Machine Learning and Human Values. W. W. Norton & Company.
Crevier, D. (1993). AI: The Tumultuous History of the Search for Artificial Intelligence. Basic Books.
Darwin, C., & Wallace, A. R. (1858). On the Tendency of Species to form Varieties; and on the Perpetuation of Varieties and Species by Natural Means of Selection. Journal of the Proceedings of the Linnean Society, Zoology, 3, 46-50.
Lee, D., António, J., & Khan, H. (2024). Privacy-Preserving Decentralized AI with Confidential Computing. arXiv:2410.13752
Davies, G., Lam, M., Harris, S. E., Trampush, J. W., Luciano, M., Hill, W. D., ... & Deary, I. J. (2018). Study of 300,486 individuals identifies 148 independent genetic loci associated with general cognitive function. Nature Communications, 9(1), 2098.
Deary, I. J., Batty, G. D., & Gale, C. R. (2008). Childhood intelligence predicts voter turnout, voting preferences, and political involvement in adulthood: The 1970 British Cohort Study. Intelligence, 36(6), 548-555.
Deary, I. J., & Caryl, P. G. (1997). Neuroscience and human intelligence differences. Trends in Neurosciences, 20(8), 365-371.
Deary, I. J., Harris, S. E., & Hill, W. D. (2019). What genome-wide association studies reveal about the association between intelligence and physical and mental health. Current Opinion in Psychology, 27, 6-12.
Deary, I. J., Whalley, L. J., Lemmon, H., Crawford, J. R., & Starr, J. M. (2000). The stability of individual differences in mental ability from childhood to old age: Follow-up of the 1932 Scottish Mental Survey. Intelligence, 28(1), 49-55.
Deary, I. J., Strand, S., Smith, P., & Fernandes, C. (2007). Intelligence and educational achievement. Intelligence, 35(1), 13-21.
Deary, I. J., Weiss, A., & Batty, G. D. (2010). Intelligence and personality as predictors of illness and death: How researchers in differential psychology and chronic disease epidemiology are collaborating to understand and address health inequalities. Psychological Science in the Public Interest, 11(2), 53-79.
Deary, I. J., Johnson, W., & Houlihan, L. M. (2009). Genetic foundations of human intelligence. Human Genetics, 126(1), 215-232.
Dekhtyar, S., et al. (2015). A life-course study on the association of education with Alzheimer's disease risk. Neurology, 85(10), 896-903. DOI: 10.1212/WNL.0000000000001872.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Dickson, H., Laurens, K. R., Cullen, A. E., & Hodgins, S. (2021). The relationship between intelligence and adult mental health: A multivariate analysis of schizophrenia, bipolar disorder, and depression. Schizophrenia Bulletin, 47(3), 602-610.
Dunbar, R. I. M. (1993). Coevolution of neocortical size, group size and language in humans. Behavioral and Brain Sciences, 16(4), 681-735.
Dunbar, R. I. M. (1998). The social brain hypothesis. Evolutionary Anthropology, 6(5), 178-190.
Dunbar, R. I. M. (1996). Gossip, Grooming, and the Evolution of Language. Harvard University Press.
Dunbar, R. I. M., & Shultz, S. (2007). Evolution in the social brain. Science, 317(5843), 1344-1347.
Dunn, J., & Brophy, M. (2020). Mind-mindedness in parents, social cognition in children: Understanding attachment and reflective functioning. Developmental Psychology, 56(4), 752-764.
Fergusson, D. M., & Horwood, L. J. (1997). Early disruptive behavior, IQ, and educational achievement. Journal of Abnormal Child Psychology, 25(3), 241-253.
Ferreira, D., et al. (2016). Cognitive reserve and the dynamic genome: The role of lifestyle in Alzheimer's disease. Frontiers in Aging Neuroscience, 8, 142. DOI: 10.3389/fnagi.2016.00142.
Fitch, W. T. (2020). The evolution of language: A comparative review. Frontiers in Psychology, 11, 517.
Floridi, L. (2020). AI and Its New Winter: From Myths to Realities. Philosophy & Technology, 33(1), 1-3.
Flanagan, D. P., & Harrison, P. L. (2012). Contemporary Intellectual Assessment: Theories, Tests, and Issues (3rd ed.). Guilford Press.
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101(2), 171-191.
Friederici, A. D. (2017). Language in our brain: The origins of a uniquely human capacity. MIT Press.
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138.
Béna, G., & Goodman, D. F. M. (2021/2024). Dynamics of specialization in neural modules under resource constraints. arXiv:2106.02626
Gale, C. R., Hatch, S. L., Batty, G. D., & Deary, I. J. (2017). Intelligence in childhood and risk of major depression in adulthood: The 1958 National Child Development Survey. Psychological Medicine, 47(8), 1357-1365.
Gallese, V., & Lakoff, G. (2005). The Brain's Concepts: The Role of the Sensory-Motor System in Reason and Language. Cognitive Neuropsychology, 22(3/4), 455-479.
Garcez, A. D., Lamb, L. C., & Gabbay, D. M. (2009). Neural-symbolic cognitive reasoning. Springer Science & Business Media.
Goertzel, B., & Pennachin, C. (Eds.). (2007). Artificial general intelligence (Vol. 2). Springer.
Gottfredson, L. S. (2002). g: Highly general and highly practical. In R. J. Sternberg & E. L. Grigorenko (Eds.), The General Factor of Intelligence: How General Is It? Lawrence Erlbaum Associates.
Gottfredson, L. S. (2004). Intelligence: Is it the epidemiologists' elusive 'fundamental cause' of social class inequalities in health? Journal of Personality and Social Psychology, 86(1), 174-199.
Esmaeilzadeh, H., & Vaezi, R. (2021). Conscious AI. arXiv:2105.07879
Haier, R. J., Jung, R. E., Yeo, R. A., Head, K., & Alkire, M. T. (2009). The neuroanatomy of general intelligence: Sex matters. NeuroImage, 45(4), 1174-1182.
Haier, R. J., Siegel, B. V., Tang, C., Abel, L., & Buchsbaum, M. S. (1992). Intelligence and changes in regional cerebral glucose metabolic rate following learning. Intelligence, 16(3-4), 415-426.
Hampshire, A., Highfield, R. R., Parkin, B. L., & Owen, A. M. (2012). Fractionating human intelligence. Neuron, 76(6), 1225-1237.
Hassabis, D., Kumaran, D., Summerfield, C., & Botvinick, M. (2017). Neuroscience-inspired artificial intelligence. Neuron, 95(2), 245-258.
Hendler, J. (2008). Avoiding another AI winter. IEEE Intelligent Systems, 23(2), 2-4.
Zakrisson, H. (2023/2024). Trinary Decision Trees for handling missing data. arXiv:2309.03561v2
Simon, H., Cockburn, Ransbotham, Gerbert, Bughin, Sudarshan, & Herweijer (2021). The Evolutionary Dynamics of Artificial Intelligence Ecosystems. ETH Zurich.
Hinton, G. E., & Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313(5786), 504-507.
Ho, J., Jain, A., & Abbeel, P. (2020). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems, 33, 6840-6851.
Hogan, R., & Kaiser, R. B. (2005). What we know about leadership. Review of General Psychology, 9(2), 169-180.
Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized general intelligences. Journal of Educational Psychology, 57(5), 253-270.
Humphrey, N. K. (1976). The social function of intellect. In P. P. G. Bateson & R. A. Hinde (Eds.), Growing Points in Ethology (pp. 303-317). Cambridge University Press.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780.
Insel, T. R., Landis, S. C., & Collins, F. S. (2013). The NIH BRAIN Initiative. Science, 340(6133), 687-688. DOI: 10.1126/science.1239276
Jackson, T., & Wang, Y. (2013). Cognitive competence and the use of social media: Predictors of online engagement and behavioral outcomes. Journal of Applied Developmental Psychology, 34(3), 166-174.
Garcia Fernandez, J., Ahmad, N., & van Gerven, M. (2024). Ornstein-Uhlenbeck Adaptation as a Mechanism for Learning in Brains and Machines. arXiv:2410.13563
Achterberg, J., Astle, D., & Akarca, D. (2024). Exploring Self-Awareness in AI Systems. Restack.io
Wang, J., Zhou, H., Song, T., Mao, S., Ma, S., Wang, H., Xia, Y., & Wei, F. (2024). 1-bit AI Infra: Part 1.1, Fast and Lossless BitNet b1.58 Inference on CPUs. arXiv:2410.16144v2
Jensen, A. R. (1998). The g factor: The science of mental ability. Praeger.
Jung, R. E., & Haier, R. J. (2007). The Parieto-Frontal Integration Theory (P-FIT) of intelligence: Converging neuroimaging evidence. Behavioral and Brain Sciences, 30(2), 135-154.
Javadi, A. H., Emo, B., Howard, L. R., Zisch, F. E., Yu, Y., Knight, R., & Spiers, H. J. (2017). Hippocampal and prefrontal processing of network topology to simulate the future. Nature Communications, 8(1), 14652.
Jung, R. E., Segall, J. M., Bockholt, H. J., Flores, R. A., Smith, S. M., Chavez, R. S., & Haier, R. J. (2010). Neuroanatomy of creativity. Human Brain Mapping, 31(3), 398-409.
Jouppi, N. P., Young, C., Patil, N., Patterson, D., Agrawal, G., Bajwa, R., ... & Yoon, D. H. (2017). In-datacenter performance analysis of a tensor processing unit. In Proceedings of the 44th Annual International Symposium on Computer Architecture (pp. 1-12).
Kaufman, J. C., & Sternberg, R. J. (2010). The Cambridge Handbook of Creativity. Cambridge University Press.
Kingma, D. P., & Ba, J. (2014/2017). Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980v9.
Kingma, D. P., & Welling, M. (2013/2022). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114v11.
Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907v4.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25.
Kuncel, N. R., Ones, D. S., & Sackett, P. R. (2014). Individual differences as predictors of work, educational, and broad life outcomes. Personality and Individual Differences, 67, 3-14.
Lake, B. M., Ullman, T. D., Tenenbaum, J. B., & Gershman, S. J. (2017). Building machines that learn and think like people. Behavioral and Brain Sciences, 40.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541-551.
LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324.
LeCun, Y., Bottou, L., Orr, G. B., & Müller, K. R. (1998). Efficient backprop. In Neural Networks: Tricks of the Trade (pp. 9-50). Springer, Berlin, Heidelberg.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Wang, L., et al. (2024). Tensorized NeuroEvolution of Augmenting Topologies for GPU Acceleration. arXiv:2404.01817.
Livingston, G., et al. (2020). Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. The Lancet, 396(10248), 413-446. DOI: 10.1016/S0140-6736(20)30367-6.
Lusardi, A., & Mitchell, O. S. (2007). Financial literacy and retirement preparedness: Evidence and implications for financial education. Business Economics, 42(1), 35-44.
Maturana, H., & Varela, F. (1980). Autopoiesis and Cognition: The Realization of the Living. Reidel.
Beck, M., et al. (2024). xLSTM: Extended Long Short-Term Memory. arXiv:2405.04517.
McCarthy, J., Minsky, M. L., Rochester, N., & Shannon, C. E. (1955). A proposal for the Dartmouth summer research project on artificial intelligence. AI Magazine, 27(4), 12-12.
McGrew, K. S. (2009). The Cattell-Horn-Carroll theory of cognitive abilities: Past, present, and future. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary Intellectual Assessment: Theories, Tests, and Issues (pp. 136-182). Guilford Press.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115-133.
McClelland, J. L., & Rumelhart, D. E. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. MIT Press.
McGurn, B., Starr, J. M., Topfer, J. A., Pattie, A., Whiteman, M. C., & Deary, I. J. (2008). Childhood cognitive ability and risk of late-onset Alzheimer and vascular dementia. Neurology, 71(14), 1051-1056.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational Measurement (pp. 13-103). American Council on Education and Macmillan.
Chen, M., & Lu, W. (2021). Deep Learning to Ternary Hash Codes by Continuation. arXiv:2107.07987.
Mills, M. C., & Rahal, C. (2019). The GWAS diversity monitor tracks diversity by disease in real time. Nature Genetics, 51(2), 19-20.
Mineault, P., Zanichelli, N., Peng, J. Z., et al. (2024). NeuroAI for AI Safety. arXiv preprint arXiv:2411.18526.
Moffitt, T. E., Caspi, A., Harrington, H., & Milne, B. J. (2002). Males on the life-course-persistent and adolescence-limited antisocial pathways: Follow-up at age 26 years. Development and Psychopathology, 14(1), 179-207.
Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior: A developmental taxonomy. Psychological Review, 100(4), 674-701.
Münzer, S., Zimmer, H. D., Schwalm, M., Baus, J., & Aslan, I. (2020). Computer-assisted navigation and the acquisition of route and survey knowledge. Journal of Environmental Psychology, 28(1), 17-30.
Dehouche, N., & Blythman, R. (2023). A Blockchain Protocol for Human-in-the-Loop AI. arXiv:2211.10859.
Neubauer, A. C., & Fink, A. (2009). Intelligence and neural efficiency: Measures of brain activation versus measures of functional connectivity in the brain. Intelligence, 37(2), 223-229.
Needham, A., Libertus, K., & Heider, L. (2014). What do infants learn about objects? The role of self-directed action. Infancy, 19(3), 372-392.
Newell, A., & Simon, H. A. (1976). Computer science as empirical inquiry: Symbols and search. Communications of the ACM, 19(3), 113-126.
Nie, N. H., Junn, J., & Stehlik-Barry, K. (1996). Education and Democratic Citizenship in America. University of Chicago Press.
Nilsson, N. J. (1983). Artificial intelligence prepares for 2001. AI Magazine, 4(4), 7.
Nisbett, R. E., Aronson, J., Blair, C., Dickens, W., Flynn, J., Halpern, D. F., & Turkheimer, E. (2012). Intelligence: New findings and theoretical developments. American Psychologist, 67(2), 130-159.
Northoff, G., & Bermpohl, F. (2004). Cortical midline structures and the self. Trends in Cognitive Sciences, 8(3), 102-107.
Northoff, G. (2014). How do resting state changes in the brain impact self-consciousness? Frontiers in Human Neuroscience, 8, 1-16.
Northoff, G. (2018). The Spontaneous Brain: From the Mind-Body to the World-Brain Problem. MIT Press.
Northoff, G. (2011). Self and others: A neurophilosophical approach. Philosophy, Ethics, and Humanities in Medicine, 6(1), 1-19.
Nowak, M. A., & Sigmund, K. (2005). Evolution of indirect reciprocity. Nature, 437(7063), 1291-1298.
Pahor, A., Jausovec, N., & Jausovec, K. (2019). The relationship between brain efficiency and intelligence: The role of resting-state EEG and structural connectivity. Brain and Cognition, 135, 103581.
Peterson, J., & Hoffman, M. (2022). Ethical challenges in AI: A review of global policy implications. AI & Society.
OpenAI (2023). Planning for AGI and beyond.
Plomin, R., & von Stumm, S. (2018). The new genetics of intelligence. Nature Reviews Genetics, 19(3), 148-159.
Poulton, R., Moffitt, T. E., & Silva, P. A. (2015). The Dunedin Multidisciplinary Health and Development Study: Overview of the first 40 years, with an eye to the future. Social Psychiatry and Psychiatric Epidemiology, 50(5), 679-693.
Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1(4), 515-526.
Rao, R. P. N., & Ballard, D. H. (1999). Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1), 79-87.
Raven, J. C. (2000). Manual for Raven's Progressive Matrices and Vocabulary Scales. Oxford Psychologists Press.
Ree, M. J., & Earles, J. A. (1991). Predicting training success: Not much more than g. Personnel Psychology, 44(2), 321-332.
Reise, S. P., Scheines, R., Widaman, K. F., & Haviland, M. G. (2013). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73(1), 5-26.
Rezende, D. J., Mohamed, S., & Wierstra, D. (2014). Stochastic backpropagation and approximate inference in deep generative models. International Conference on Machine Learning (pp. 1278-1286). PMLR.
Rizzolatti, G., & Craighero, L. (2004). The mirror-neuron system. Annual Review of Neuroscience, 27, 169-192.
Roberts, B. W., & Kuncel, N. R. (2007). The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes. Perspectives on Psychological Science, 2(4), 313-345.
Roid, G. H. (2003). Stanford-Binet Intelligence Scales, Fifth Edition (SB5). Riverside Publishing.
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533-536.
Jain, R., Tran, B., Chen, K., Sinclair, M. D., & Venkataraman, S. (2024). PAL: A Variability-Aware Policy for Scheduling ML Workloads in GPU Clusters. arXiv:2408.11919.
Salgado, J. F., Anderson, N., & Viswesvaran, C. (2020). Meta-analytic examination of the predictive validity of general mental ability for remote job performance. Journal of Occupational and Organizational Psychology, 93(2), 343-367.
Schmidgall, S., Ziaei, R., Achterberg, J., Kirsch, L., Hajiseyedrazi, S., & Eshraghian, J. (2023). Brain-inspired learning in artificial neural networks: A review. arXiv:2305.11252.
Savage, J. E., Jansen, P. R., Stringer, S., Watanabe, K., Bryois, J., de Leeuw, C. A., ... & Posthuma, D. (2018). Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nature Genetics, 50(7), 912-919.
Schmidt, F. L., & Hunter, J. E. (1998). The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings. Psychological Bulletin, 124(2), 262-274.
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.
Schneider, W. J., & McGrew, K. S. (2012). The Cattell-Horn-Carroll model of intelligence. In D. P. Flanagan & P. L. Harrison (Eds.), Contemporary Intellectual Assessment: Theories, Tests, and Issues (pp. 99-144). Guilford Press.
Schrank, F. A., McGrew, K. S., & Mather, N. (2014). Woodcock-Johnson IV.
Shane, S. (2003). A General Theory of Entrepreneurship: The Individual-Opportunity Nexus. Edward Elgar Publishing.
Pontes-Filho, S., & Nichele, S. (2019/2020). A Conceptual Bio-Inspired Framework for the Evolution of Artificial General Intelligence. arXiv:1903.10410v4.
Smith, L. B., & Gasser, M. (2005). The development of embodied cognition: Six lessons from babies. Artificial Life, 11(1-2), 13-29.
Soldan, A., et al. (2017). Cognitive reserve and Alzheimer's biomarkers. Neurology, 88(12), 1098-1106. DOI: 10.1212/WNL.0000000000003692.
Spearman, C. (1904). General intelligence, objectively determined and measured. American Journal of Psychology, 15, 201-292.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929-1958.
Srinivasa Deshmukh, & Panait Luke (2022). AI and the Sense of Self. arXiv:2201.05576.
Stanley, K. O., & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99-127.
Stern, Y., et al. (2019). Cognitive reserve and Alzheimer's disease. Neurology, 92(1), 10-12. DOI: 10.1212/WNL.0000000000007122.
Sternberg, R. J. (2004). Successful intelligence: Finding a balance. Yale University Press.
Sparrow, B., Liu, J., & Wegner, D. M. (2011). Google effects on memory: Cognitive consequences of having information at our fingertips. Science, 333(6043), 776-778.
Suzuki, S., et al. (2022). Dynamic modulation of sensory processing by motor state. Journal of Neuroscience, 42(10), 2001-2016.
Taddeo, M., & Floridi, L. (2018). Regulate artificial intelligence to avert cyber arms race. Nature, 556(7701), 296-298.
Wingarz, T., Lauscher, A., Edinger, J., Kaaser, D., Schulte, S., & Fischer, M. (2024). SoK: Towards Security and Safety of Edge AI. arXiv:2410.05349.
Thompson, N. C., Greenewald, K., Lee, K., & Manso, G. F. (2022). The computational limits of deep learning. arXiv preprint arXiv:2007.05558v2.
Touvron, H., Lavril, T., Izacard, G., Martinet, X., Lachaux, M. A., Lacroix, T., ... & Lample, G. (2023). LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971.
Tomasello, M. (2014). A Natural History of Human Thinking. Harvard University Press.
Thiele, J. A., Faskowitz, J., Sporns, O., & Hilger, K. (2024). Choosing explanation over performance: Insights from machine learning-based prediction of human intelligence from brain connectivity. PNAS Nexus, 3, pgae519. https://doi.org/10.1093/pnasnexus/pgae519
Thurstone, L. L. (1938). Primary Mental Abilities. University of Chicago Press.
Vahdat, A., & Kautz, J. (2020). NVAE: A deep hierarchical variational autoencoder. In Advances in Neural Information Processing Systems (pp. 19667-19679).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017/2023). Attention is all you need. In Advances in Neural Information Processing Systems.
Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P. A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (pp. 1096-1103).
Wechsler, D. (1997). WAIS-III: Wechsler Adult Intelligence Scale (3rd ed.). Psychological Corporation.
Wechsler, D. (2008). Wechsler Adult Intelligence Scale - Fourth Edition (WAIS-IV). Pearson.
Wechsler, D. (2014). Wechsler Intelligence Scale for Children - Fifth Edition (WISC-V). Pearson.
Whalley, L. J., & Deary, I. J. (2001). Longitudinal cohort study of childhood IQ and survival up to age 76. BMJ, 322(7290), 819.
Whiten, A., & Byrne, R. W. (1997). Machiavellian intelligence II: Extensions and evaluations. In A. Whiten & R. W. Byrne (Eds.), Machiavellian Intelligence II: Extensions and Evaluations (pp. 1-23). Cambridge University Press.
Xu, X., Weber, I., & Staples, M. (2021). Blockchain and Distributed Ledger Technology Use Cases: Applications and Opportunities. Springer.
Li, Y. (2018). Theory of Cognitive Relativity: A Promising Paradigm for True AI. arXiv:1812.00136.