
COMPUTATIONS AND COMPUTERS IN THE SCIENCES OF MIND AND BRAIN

by

Gualtiero Piccinini

BA, Università di Torino, 1994

Submitted to the Graduate Faculty of

University of Pittsburgh in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

University of Pittsburgh

2003

UNIVERSITY OF PITTSBURGH

FACULTY OF ARTS AND SCIENCES

This dissertation was presented

by

Gualtiero Piccinini

It was defended on

June 20th, 2003

and approved by

John Earman, University Professor of History and Philosophy of Science

G. Bard Ermentrout, Professor of Mathematics

Paul Griffiths, Professor of History and Philosophy of Science

John D. Norton, Professor of History and Philosophy of Science

Peter K. Machamer, Professor of History and Philosophy of Science

Dissertation Director


To my parents and sisters


Copyright by Gualtiero Piccinini

2003

Section 3.1 of the present work is an adapted version of section 2 of Gualtiero Piccinini, “Alan Turing and the Mathematical Objection,” Minds and Machines 13(1), pp. 23-48. Copyright 2003 by Kluwer, reproduced by permission.

The following archives have given permission to use extended quotations from unpublished work. From the Warren S. McCulloch Papers, American Philosophical Society Library. Copyright by the American Philosophical Society Library. From the Norbert Wiener Papers, Institute Archives and Special Collections, MIT Libraries. Copyright by the MIT Libraries.


COMPUTATIONS AND COMPUTERS IN THE SCIENCES OF MIND AND BRAIN

Gualtiero Piccinini, PhD

University of Pittsburgh, 2003

Computationalism says that brains are computing mechanisms, that is, mechanisms that perform computations. At present, there is no consensus on how to formulate computationalism precisely or adjudicate the dispute between computationalism and its foes, or between different versions of computationalism. An important reason for the current impasse is the lack of a satisfactory philosophical account of computing mechanisms. The main goal of this dissertation is to offer such an account.

I also believe that the history of computationalism sheds light on the current debate. By tracing different versions of computationalism to their common historical origin, we can see how the current divisions originated and understand their motivation. Reconstructing debates over computationalism in the context of their own intellectual history can contribute to philosophical progress on the relation between brains and computing mechanisms and help determine how brains and computing mechanisms are alike, and how they differ. Accordingly, my dissertation is divided into a historical part, which traces the early history of computationalism up to 1946, and a philosophical part, which offers an account of computing mechanisms.

The two main ideas developed in this dissertation are that (1) computational states are to be identified functionally, not semantically, and (2) computing mechanisms are to be studied by functional analysis. The resulting account of computing mechanisms, which I call the functional account of computing mechanisms, can be used to identify computing mechanisms and the functions they compute. I use the functional account of computing mechanisms to taxonomize computing mechanisms based on their different computing power, and I use this taxonomy to taxonomize different versions of computationalism based on the functional properties that they ascribe to brains. By doing so, I begin to tease out empirically testable statements about the functional organization of the brain that different versions of computationalism are committed to. I submit that when computationalism is reformulated in the more explicit and precise way I propose, the disputes about computationalism can be adjudicated on the grounds of empirical evidence from neuroscience.


PREFACE

In the 1940s, inspired by the birth and development of modern computers, Alan Turing, Warren McCulloch, Norbert Wiener, John von Neumann, and many others developed a new theory of the brain, here called computationalism. Computationalism says that brains are computing mechanisms, that is, mechanisms that perform computations. Computationalism expands the old idea that reasoning is a form of computation (from Hobbes to formal logic) into the stronger idea that all cognitive processes or even all neural processes are a form of computation. In the past fifty years, computationalism has shaped several fields. There are canonical explications of computationalism in computer science (Newell and Simon 1976), psychology (Pylyshyn 1984, Rumelhart and McClelland 1986), neuroscience (Churchland, Koch, and Sejnowski 1990), and philosophy (Fodor 1975).

Computationalists agree that brains are computing mechanisms, but in calling them computing mechanisms, they mean such radically different things that they often talk past one another. For some, the brain is functionally organized like the hardware of a desktop computer, on which different programs can run (e.g., Newell and Simon, Pylyshyn, and Fodor). For others, the brain is a set of networks of neurons, each of which computes its own function (e.g., Rumelhart and McClelland). Still others think that computations take place in the dendrites of single neurons (e.g., Koch 1999). Some investigators build computer simulations of cognitive phenomena studied by psychologists and argue that the neurological details are irrelevant to understanding the computational organization of the brain (e.g., Newell 1990, Fodor and Pylyshyn 1988). Other investigators model neurological phenomena described by neurophysiologists and maintain that, on the contrary, it is simulating cognitive phenomena that has nothing to do with the computational organization of the brain (e.g., Koch 1999, Dayan and Abbott 2001).

Philosophers interested in the sciences of mind and brain have generally divided into computationalists and anti-computationalists. The former believe computationalism to be the best scientific theory of the brain, or even the only genuinely scientific theory of the brain. They have offered explications of computationalism and defended them on the grounds that they solve, or contribute to solving, important philosophical problems. Anti-computationalists often believe that computationalism is absurd or false on a priori grounds, and have offered a number of objections to it (e.g., Searle 1980, Penrose 1994). But anti-computationalists fiercely disagree on what is misguided about computationalism.

At present, there is no consensus on how to formulate computationalism precisely or adjudicate the dispute between computationalism and its foes, or between different versions of computationalism. An important reason for the current impasse is the lack of a satisfactory philosophical account of computing mechanisms. The main goal of this dissertation is to offer such an account. I also believe that the history of computationalism sheds light on the current debate. By tracing different versions of computationalism to their common historical origin, we can see how the current divisions originated and understand their motivation. Reconstructing debates over computationalism in the context of their own intellectual history can contribute to philosophical progress on the relation between brains and computing mechanisms and help determine how brains and computing mechanisms are alike, and how they differ. Accordingly, my dissertation is divided into a historical part, which traces the early history of computationalism, and a philosophical part, which offers an account of computing mechanisms.

A good account of computing mechanisms can be used to express more explicitly and precisely the content of different versions of computationalism, thereby allowing a more rigorous assessment of the evidence for or against different versions of computationalism. Besides grounding discussions of computationalism, there is independent motivation for an account of computing mechanisms. Given the importance of computers within contemporary society, philosophical attention has been directed at them in recent years (Floridi 1999). An account of computing mechanisms contributes to the emerging field of philosophy of computation.

The first step towards an account of computing mechanisms is to put on the table the relevant notion of computation. There is consensus that the notion of computation that is relevant to computationalism, as well as to modern computer science and technology, is the one analyzed by Alan Turing in terms of Turing Machines. Turing Machines are a mathematical formalism for manipulating symbols in accordance with fixed instructions so as to generate certain output strings of symbols from input strings of symbols. Turing’s work on Turing Machines, together with work by other authors in the 1930s on the same notion of computability, led to the development of the classical mathematical theory of computability. The notion of Turing Machine, together with some important results of computability theory, is briefly reviewed in Chapter 1.
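The operation of this formalism can be sketched concretely. Below is a minimal Turing machine simulator; the transition-table encoding, the blank symbol `'_'`, and the example program that rewrites a string of 1s as 0s are illustrative choices of mine, not machinery drawn from the dissertation.

```python
# Minimal Turing machine simulator (an illustrative sketch, not from the text).
# A program is a finite table mapping (state, scanned symbol) to
# (symbol to write, head move, next state); moves are -1 (left) or +1 (right).

def run_tm(program, tape, state="q0", halt_state="qH", max_steps=10_000):
    """Run a Turing machine program on an input string; return the
    non-blank contents of the tape when the machine halts."""
    cells = dict(enumerate(tape))  # sparse tape; missing cells read as blank
    head = 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        scanned = cells.get(head, "_")
        write, move, state = program[(state, scanned)]
        cells[head] = write
        head += move
    used = sorted(k for k, v in cells.items() if v != "_")
    return "".join(cells[k] for k in used)

# Example program: replace every '1' with '0', halting at the first blank.
flip = {
    ("q0", "1"): ("0", +1, "q0"),
    ("q0", "_"): ("_", +1, "qH"),
}
```

For instance, `run_tm(flip, "111")` halts with `"000"` on the tape. However simple, this is the shape of the formalism: a fixed, finite instruction table generating output strings from input strings.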

By building on Turing’s notion of computability, Warren McCulloch and others developed computationalism in the 1940s. Chapters 2 through 6 describe how this happened. Chapter 2 describes the pre-Turing background to McCulloch’s work. Chapter 3 describes McCulloch’s efforts to formulate a mechanistic theory of mind and the impact of Turing’s work on those efforts. Chapter 4 is a detailed analysis of the first—and in my view the most influential—formulation of contemporary computationalism, McCulloch and Pitts’s “A Logical Calculus of the Ideas Immanent in Nervous Activity.” Chapters 5 and 6 describe early discussions (1943-1946) between McCulloch and others pertaining to brains, computers, and their mutual relations. This historical background paves the way for the philosophical part of this dissertation, which is contained in the remaining four chapters.

The foundation of the classical theory of computability is the thesis that any function that is effectively calculable (in an intuitive sense) is computable by some Turing Machine. This thesis, called the Church-Turing thesis (CT), has given rise to three relevant streams of philosophical literature. First, some philosophers of mind have assumed CT to be true and used it as a priori evidence for computationalism. Second, some logicians and philosophers of mathematics have debated whether CT is true and what its proper scope is. And third, some physicists and philosophers of physics have debated whether CT applies to the physical world. These three streams of literature have proceeded largely independently of one another, with regrettable effects. In particular, arguments from CT to computationalism have not taken into account the best scholarship on CT’s proper scope and the way CT applies to the physical world.

To remedy this situation, Chapter 7 begins to bring these three streams of literature together. It clarifies the proper scope of CT and the sense in which CT applies to the physical world, and then assesses arguments from CT to computationalism. It concludes that all such arguments are unsound, because CT—when properly understood—does not establish whether cognitive or neural processes (or any other processes) are computations.

After this assessment of CT’s relevance to computationalism, the road is clear to discuss two philosophical topics that have heavily affected both discussions of computationalism and the way computing mechanisms are understood in the philosophical literature. These topics are the mind-body problem and the problem of giving a naturalistic explanation of intentionality. The next two chapters are devoted to how computationalism and computing mechanisms relate to these topics.

First, I discuss the relevance of the mind-body problem to computationalism. Computational functionalism is the thesis that the mind is the software of the brain, which entails that the brain is a computing mechanism (for running mental programs). Computational functionalism was proposed as a solution to the mind-body problem (Putnam 1967b), and became very influential. As a result, many philosophers became convinced that computationalism is a consequence of this popular solution to the mind-body problem.

In Chapter 8, I argue that this is not the case. First, I employ the language of functional analysis to explicate the idea that the mind is a program running on the brain. Then I distinguish functionalism, namely the thesis that the mind is the functional organization of the brain given by some functional analysis of the brain (which may or may not ascribe computations to it), from the stronger computational functionalism, namely the thesis that the functional organization of the brain is given by a (mental) computer program. I argue that none of the arguments given for functionalist (including computational functionalist) solutions to the mind-body problem offer any support for the conclusion that the mind is a program running on the brain. Accordingly, I argue that this latter view should be considered as an empirical hypothesis about the kind of functional analysis that applies to minds and brains. As a consequence, computational functionalism should be considered as the conjunction of a functionalist solution to the mind-body problem and the empirical hypothesis that the brain is functionally organized in accordance with a mental program.


After the mind-body problem, I discuss the relevance to computationalism of the naturalistic explanation of intentionality, i.e., the problem of finding a naturalistic explanation for the content of mental states. The semantic view of computational states is the view that inputs, internal states, and outputs of computing mechanisms have their content essentially, i.e., computational inputs, outputs, and internal states can be identified only by reference to their semantic properties. Almost all computationalist philosophers accept the semantic view of computational states. If the semantic view of computational states is correct, it gives some hope for a naturalistic explanation of intentionality. For on the one hand, no one doubts that computing mechanisms are natural objects that can be given naturalistic explanations. On the other hand, the semantic view of computational states entails that computationalism ascribes to the brain states that are essentially endowed with content. If the essentially contentful states ascribed to the brain by computationalism coincide with the mental states whose intentionality many philosophers would like to explain naturalistically, then computationalism offers the basis for a naturalistic explanation of intentionality. This has acted as a powerful motivation for computationalism.

In Chapter 9, I reject the semantic view of computational states in favor of the view that computational inputs, outputs, and internal states are identified by their functional properties, as described by a specific kind of functional analysis. This view fits better than the semantic view with the practices of computability theorists and computer designers, but it undercuts one of the main traditional philosophical motivations for computationalism.

The two main ideas developed in Chapters 8 and 9 are that (1) computational states are to be identified functionally, not semantically, and (2) computing mechanisms are to be studied by functional analysis. These ideas come to fruition in Chapter 10, where the relevant kind of functional analysis is spelled out. The resulting account of computing mechanisms, which I call the functional account of computing mechanisms, can be used to identify computing mechanisms and the functions they compute. I use the functional account of computing mechanisms to taxonomize computing mechanisms based on their different computing power, and I use this taxonomy to taxonomize different versions of computationalism based on the functional properties that they ascribe to brains. By doing so, I begin to tease out empirically testable statements about the functional organization of the brain that different versions of computationalism are committed to. I submit that when computationalism is reformulated in the more explicit and precise way I propose, the disputes about computationalism can be adjudicated on the grounds of empirical evidence from neuroscience.


ACKNOWLEDGEMENTS

My greatest debt is to Peter Machamer, my advisor. The ways in which he helped me are too numerous to enumerate. My committee members, John Earman, Bard Ermentrout, Paul Griffiths, and John Norton, gave me the right balance of constructive criticism and support. John Norton also guided me through my initial search for a dissertation project.

I wrote my first paper on this topic for a class I took with Ken Manders. He kindly advised me to pursue my investigations further. At the time, Carl Craver was writing his dissertation on mechanisms in neuroscience. I jokingly asked Carl to do a good job so I could use his conclusions in my research. I’m pleased to say that I was influenced by Carl’s dissertation as well as his subsequent work (e.g., Machamer, Darden and Craver 2000, Craver 2001a). Another important early influence on my dissertation was Wilfried Sieg’s work on the history and philosophy of computation (e.g., Sieg 1994). In the last decade, Sieg and Jack Copeland have done more than anyone before them to clarify the Church-Turing thesis and fight the misconceptions about it that pervade the philosophy of mind literature. My work, especially in Chapter 7, builds on theirs. While working on Chapters 8 through 10, I gained the most insight into computational theories of mind and brain by reading the works of Jerry Fodor. If I have succeeded at all in moving the debate forward, I owe it to a large extent to how I responded to Fodor’s writings.

A number of people have given me feedback on parts of my dissertation or related material: Bob Brandom, Jack Copeland, Carl Craver, Reinaldo Elugardo, Uljana Feest, Clark Glymour, Rick Grush, Graham Hubbs, Ken Manders, Diego Marconi, Jim Moor, Bob Olby, Anastasia Panagopoulos, Elizabeth Paris, Merrilee Salmon, Andrea Scarantino, Susan Schneider, Oron Shagrir, Wilfried Sieg, Susan Sterrett, Julie Yoo, and Julie Zahle. If I have forgotten anyone, I apologize.

Other people helped me by conversing or corresponding with me on topics related to my dissertation. They include Erik Angner, Robert S. Cox, John Haugeland, Lance Lugar, Valerie-Anne Lutz, Jay McClelland, Wendy Parker, and Martha Pollack. Again, I apologize to anyone I inadvertently omitted.

As a graduate student unknown to them, I wrote to or approached Brian Davies, Daniel Dennett, Jerry Fodor, Gilbert Harman, Hilary Putnam, Dave Touretzky, and Bernard Widrow concerning my research. They generously responded with helpful remarks for which I am grateful.

I have presented parts of my dissertation to various philosophical audiences. Thanks to those present for their attention and responses.

I’d like to thank all my philosophy teachers from high school to college to graduate school, as well as all others who taught me the little I know about the ways of philosophy and of life. Without their example and encouragement, I would not be where I am.

Finally, I am grateful to Becka Skloot for her help and support during all these years.

My research was supported in part by the National Science Foundation under Grant No. SES-0216981, by an Adelle and Erwin Tomash Fellowship, by an Andrew Mellon Predoctoral Fellowship, and by a Regione Sardegna Doctoral Scholarship. I am grateful to those institutions, the administrators who run them, the politicians who back them, and the taxpayers who ultimately fund them. Any opinions, findings, conclusions, and recommendations expressed in this dissertation are those of the author and do not necessarily reflect the views of these funding institutions.


TABLE OF CONTENTS

PREFACE
ACKNOWLEDGEMENTS
1 COMPUTABILITY
1.1 Effective Calculability
1.2 Computability Theory
1.2.1 Notation
1.2.2 Recursive Functions
1.2.3 Turing Machines
1.2.4 Gödel Numbers of TM Programs
1.2.5 Universal TM Programs
1.2.6 Unsolvability of the Halting Problem
1.3 The Church-Turing Thesis
2 WARREN MCCULLOCH ON LOGIC, MIND, AND BRAIN, CA. 1920-1936
2.1 Introduction
2.2 Background
2.3 Logic, Epistemology, and the Brain
2.4 Strychnine Neuronography and the Functional Organization of the Brain
3 TOWARDS A THEORY OF THE BRAIN, 1936-1942
3.1 What Computing Mechanisms Can Do
3.2 Teleological Mechanisms
3.3 Walter Pitts
3.4 McCulloch Meets Pitts
4 BRAINS COMPUTE THOUGHTS, 1943
4.1 A Mechanistic Theory of Mind
4.2 Motivation
4.3 Assumptions
4.4 Nets Without Circles
4.5 Nets With Circles, Computation, and the Church-Turing Thesis
4.6 “Consequences”
4.7 The Historical Significance of McCulloch-Pitts Nets
5 FROM BRAINS TO COMPUTERS AND BACK AGAIN, 1943-1945
5.1 Migrations
5.2 Brains and Computers
5.3 Electronic Brains
5.4 A Research Program
5.5 Preparing for the End of the War
6 THE NEW SCIENCE OF BRAINS AND MACHINES, 1946
6.1 The First Macy Meeting
6.2 The Next Generation
6.3 More Meetings
6.4 Von Neumann’s New Thoughts on Automata
6.5 Importance of the Computationalist Network
7 COMPUTATIONALISM AND THE CHURCH-TURING THESIS
7.1 Introduction
7.2 The Church-Turing Thesis
7.2.1 The Canonical View: CT is True but Unprovable (Kleene 1952, § 62, § 67)
7.2.2 Optimistic View 1: CT is True and Provable (Mendelson 1990)
7.2.3 Optimistic View 2: CT is True Because Entailed by Physical Facts (Deutsch 1985)
7.2.4 The Gandy-Sieg View (Gandy 1980; Sieg 2000)
7.2.5 Pessimistic View 1: CT is False Because Contradicted by Non-uniform Effective Procedures (Kalmár 1959)
7.2.6 Pessimistic View 2: CT is False Because Contradicted by Non-mechanical Effective Procedures (Gödel 1965, 1972)
7.2.7 Pessimistic View 3: CT is False Because Contradicted by Physical Facts (Hogarth 1994)
7.2.8 Other Objections to CT
7.3 Physical CT
7.3.1 Modest Physical CT
7.3.2 Hypercomputation
7.3.3 Bold Physical CT
7.3.4 Between Modest and Bold Physical CT
7.3.4.1 Mathematical Tractability
7.3.4.2 Computational Approximation
7.4 Computationalism and CT
7.4.1 By Physical CT
7.4.1.1 By Modest Physical CT
7.4.1.2 By Bold Physical CT
7.4.2 Cognition as an Effective Procedure
7.4.3 Effective Procedures as a Methodological Constraint on Psychological Theories
7.5 Conclusion
8 COMPUTATIONAL FUNCTIONALISM
8.1 Introduction
8.2 Multiple Realizability and Computational Functionalism
8.3 Multiple Realizability, Functional Analysis, and Program Execution
8.3.1 Multiple Realizability and Functional Analysis
8.3.2 Multiple Realizability and Explanation by Program Execution
8.3.3 Functional Analysis and Program Execution
8.3.4 Computational Functionalism Revisited
8.4 Origin of Computational Functionalism
8.4.1 The Brain as a Turing Machine
8.4.2 The Analogy between Minds and Turing Machines
8.4.3 Psychological Theories, Functional Analysis, and Programs
8.4.4 Functionalism
8.4.5 Computational Functionalism
8.4.6 Functional Analysis and Explanation by Program Execution
8.5 Later Developments of Functionalism
8.6 Is Everything a TM?
8.7 How Did This Happen?
8.8 Functionalism and Computationalism
9 COMPUTATION AND CONTENT
9.1 Introduction
9.2 The Functional View of Computational States
9.3 Against the Semantic View of Computational States
9.4 Origins of the Semantic View of Computational States
9.4.1 Content in Early Computationalism
9.4.2 Conceptual Role Semantics
9.4.3 Computationalism and the Philosophy of Mind
9.4.4 The Semantic View of Computational States in the Philosophy of Mind
9.5 Computationalism and Theories of Content
9.5.1 CTM meets Conceptual Role Semantics
9.5.2 CTM meets Interpretational Semantics
9.5.3 CTM meets Informational and Teleological Semantics
9.5.4 CTM meets Intentional Eliminativism
9.6 CTM With or Without Semantics
9.7 Two Consequences
10 COMPUTING MECHANISMS
10.1 Introduction
10.1.1 Desiderata for an Account of Computing Mechanisms
10.2 The Functional Account of Computing Mechanisms
10.2.1 Primitive Computing Components
10.2.1.1 Comparison with Cummins’s Account
10.2.2 Primitive Non-computing Components
10.2.3 Complex Computing Components
10.2.3.1 Combinational Computing Components
10.2.3.2 Arithmetic Logic Units
10.2.3.3 Sequential Computing Components

10.2.3.4 Multiplication and Division Components................................................... 268

10.2.3.5 The Computation Power of Complex Computing Components................. 269

10.2.4 Complex Non-computing Components .............................................................. 271

10.2.4.1 Memory Units............................................................................................. 271

10.2.4.2 Datapaths..................................................................................................... 272

10.2.4.3 Control Units............................................................................................... 274

10.2.4.4 Input and Output Devices ........................................................................... 276

10.2.4.5 Internal Semantics....................................................................................... 276

10.2.5 Calculators .......................................................................................................... 278

10.2.6 Computers........................................................................................................... 280

10.2.6.1 Programmability ......................................................................................... 281

10.2.6.2 Stored-Program Computers ........................................................................ 283

10.2.6.3 Special-Purpose, General-Purpose, or Universal........................................ 285

10.2.6.4 Functional Hierarchies................................................................................ 286


10.2.6.5 Digital vs. Analog ....................................................................................... 289

10.2.6.6 Serial vs. Parallel ........................................................................................ 294

10.3 Comparison With Previous Accounts of Computing Mechanisms ............................ 296

10.3.1 Putnam ................................................................................................................ 297

10.3.2 Cummins............................................................................................................. 299

10.3.3 Fodor................................................................................................................... 301

10.3.4 The Functional Account and the Six Desiderata................................................. 302

10.4 An Application: Are Turing Machines Computers?................................................... 305

10.5 A Taxonomy of Computationalist Theses .................................................................. 306

10.6 Questions of Hardware ............................................................................................... 308

10.7 Conclusion .................................................................................................................. 310

BIBLIOGRAPHY....................................................................................................................... 311


LIST OF FIGURES

Figure 4-1. Diagrams of McCulloch and Pitts nets...................................................................... 54

Figure 4-2. Net explaining heat illusion ...................................................................................... 55

Figure 10-1. A NOT gate, an AND gate, and an OR gate. ........................................................ 255

Figure 10-2. Half (two-bit) adder............................................................................................... 264

Figure 10-3. Full (two-bit) adder. .............................................................................................. 265

Figure 10-4. The main components of a computer and their functional relations..................... 280


1 COMPUTABILITY

1.1 Effective Calculability

This chapter introduces some fundamental notions related to computation, which will be used throughout

the dissertation. This first section is devoted to the pre-theoretical notion of effective calculability, or

computability by an effective procedure (computability for short). This informal notion motivates the

formally defined notion of Turing-computability, which I introduce in the following section. In the last

section, I briefly introduce the Church-Turing thesis, which says that the formal notion is an adequate

formalization of the informal one.

During the first decades of the 20th century, mathematicians’ interest in computable functions lay

in the foundations of mathematics. Different philosophical approaches were proposed. L. E. J. Brouwer

was the main supporter of intuitionism, according to which an existence proof for a mathematical object

was admissible only if constructive (Brouwer 1975). David Hilbert proposed his proof theory to

formalize in axiomatic fashion mathematical reasoning in an attempt to establish the foundations of

mathematics without endorsing Brouwer’s restrictions (Hilbert 1925, 1927, reprinted in van Heijenoort

1967; Hilbert and Ackermann 1928). This formalization allowed Hilbert to formulate rigorously the

decision problem for first-order logic. A decision problem requests a method for answering a yes-or-no

question concerning a domain of objects: “Given any sequence of symbols, is it a formula?” or “Given

any formula, is it provable?” A solution to a decision problem is an effective procedure: a uniform

method or algorithm specified by a finite set of instructions, by which any instance of the question can be

answered in a finite number of steps. “Effective procedure” is a term used by some mathematicians in

place of “algorithm”; an effective procedure cannot appeal to non-extensionally definable capacities like

intuition, creativity, or guesswork, and it always generates the correct answer.1

1 After the work of Turing and others, “effective procedure” was also used for procedures not guaranteed to generate all the values of a total function, that is, procedures that calculate only the values of a partial function (cf. Wang 1974, p. 84).


Lacking a rigorous definition of “effective procedure,” mathematicians called it an “intuitive

concept” to distinguish it from formally defined mathematical concepts.2 Kurt Gödel proposed replacing

“effective procedures” with a rigorously defined concept, that of “recursive functions,” but he didn’t rule

out that some effective procedures might not be included within recursive functions (1931, 1934).

Alonzo Church (1936) and Alan Turing (1936-7) strengthened Gödel’s tentative identification of effective

procedures and recursive functions to a general thesis, now called the Church-Turing thesis. Based on the

Church-Turing thesis, Church and Turing proved that some functions are not computable. For example,

Turing pointed out that he and Church used different definitions but reached “similar conclusions,” i.e.,

that “the Hilbertian Entscheidungsproblem [i.e., the decision problem for first-order logic] can have no

solution” (Turing 1936-7, 116, 117, 145).3

The notion of effective procedure can be informally defined as a procedure with the following

properties4:

1) It uses a finite number of primitive operations, each applied a finite number of times, as specified by a finite number of deterministic instructions (i.e., instructions whose execution yields a unique next step in the procedure).

2) Instructions are non-ambiguous and finite in length.

3) The procedure requires no intuitions about the subject matter (e.g. intuitions about properties of

numbers), no ingenuity, no invention, no guesses.

4) For any argument of the function computed by the procedure, the procedure is the same (uniformity).

5) For any argument of the function computed by the procedure, if the procedure terminates, it generates the correct value.
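To illustrate, Euclid's algorithm for the greatest common divisor is a paradigmatic effective procedure. The following rendering (in Python, an illustration of mine rather than part of the mathematical material) notes how it satisfies properties (1)-(5).

```python
def gcd(x, y):
    """Euclid's algorithm: a paradigmatic effective procedure.

    (1) It uses finitely many primitive operations (comparison, remainder),
        each step determined uniquely by deterministic instructions.
    (2) The instructions are finite in length and unambiguous.
    (3) No intuition, ingenuity, or guessing is required.
    (4) The same procedure is applied uniformly to every argument pair.
    (5) On termination, it yields the correct value for every argument.
    """
    while y != 0:
        x, y = y, x % y   # replace (x, y) by (y, x mod y); y strictly decreases
    return x
```

Since the second argument strictly decreases at every step, the procedure terminates on every pair of natural numbers, so gcd is a total computable function.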

About the role of effective procedures in foundations of mathematics, especially Hilbert’s approach, see Hallett

1994, Sieg 1994, Shapiro 1983, 1995. For a history of foundations of mathematics, see Webb 1980, Mancosu 1998.

2 To refer to the intuitive notion of effective procedure, different authors used different terms. Instead of

“procedure,” some used “process,” “method,” or “rule.” Instead of “effective,” some used “finite,” “finite

combinatorial,” “mechanical,” “definite,” “constructively defined,” or “algorithmic.” Some of the terms used as

synonyms of “effectively calculable” are listed by Gödel 1965, p. 72; Kleene 1987a, pp. 55-56.

3 On the origin of CT and recursive function theory, see Davis 1982; Gandy 1988; Kleene 1979, 1987a; Sieg 1994,

1997; Piccinini 2003a.

4 The properties are somewhat redundant, but are kept separate for explicitness.


When we have a formalized language in which both a domain and operations over objects in that

domain are formally defined, we can talk about lists of formalized instructions and call them programs.

Programs are the formal replacement of algorithms or effective procedures. Because of this, programs are

said to implement algorithms (or procedures).

Not all mathematical procedures are effective procedures or algorithms. It may be possible to

specify sequences of operations that are not guaranteed to find all the values of a function for every

argument. These procedures, called heuristics, generate a search for the desired value: the search may

find the value of the function being computed or it may find an output that only approximates that value.
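The contrast can be made concrete. Relative to the task of computing the integer square root, the first sketch below is an algorithm: it is guaranteed to find the exact value for every argument. The second, a fixed number of Newton iterations, is a heuristic relative to that same task: it may return only an approximation. (The code is an illustration of mine, in Python, not part of the mathematical material.)

```python
def isqrt_algorithm(n):
    """Algorithm for the integer square root: exhaustive search,
    guaranteed to return the exact value for every natural number n."""
    k = 0
    while (k + 1) * (k + 1) <= n:
        k += 1
    return k

def sqrt_heuristic(n, steps=3):
    """Heuristic for the square root: a fixed number of Newton
    iterations, which may return only an approximation to the value."""
    if n == 0:
        return 0.0
    x = float(n)
    for _ in range(steps):
        x = (x + n / x) / 2   # Newton step for f(x) = x^2 - n
    return x
```

Note that, relative to the input-output function it actually computes, sqrt_heuristic still implements an algorithm; it is a heuristic only relative to the independently defined square-root function.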

A caveat needs to be added about the relationship between programs (and procedures) and the

functions they compute. A program is a definition of a function, from its inputs to its outputs; when the

program doesn’t halt, the function isn’t defined. So, by definition any program computes the values of

the function that it defines: it implements an algorithm or effective procedure relative to that function.

However, typically a program is written to find values for a function that is defined independently of the

program’s existence. In such a case, the program may or may not be implementing an algorithm that

finds the values of that function. Many programs do not always find the values of the independently

defined function they are designed to compute, but rather they find approximations to those values. In

such cases, relative to the independently defined functions, those programs are said to implement not

algorithms but heuristics.5

5 The philosophical literature is not always clear on this point. For example:

The possibility of heuristic procedures on computers is sometimes confusing. In one sense, every digital

computation (that does not consult a randomizer) is algorithmic; so how can any of them be heuristic? The

answer is again a matter of perspective. Whether any given procedure is algorithmic or heuristic depends

on how you describe the task (Haugeland 1997, p. 14).

But whether a procedure (or program) is algorithmic or heuristic does not depend on how one describes its task.

Relative to its task, a procedure is algorithmic or heuristic depending on whether or not it is guaranteed to solve each

instance of the task. Instead, a program is always algorithmic with respect to the generation of its outputs given its

inputs.

Another example of confusion about this point is manifested by Dennett’s statement (1975, p. 83) that

human beings may not be Turing Machines (TMs), because humans may be implementing heuristics rather than

algorithms. This presupposes that TMs implement only algorithms and not heuristics. Now, it is true that every TM

implements an algorithm that generates its outputs given its inputs. But relative to the problem TMs are designed to

solve, TMs—like any other computing mechanisms—may well be implementing heuristics.


1.2 Computability Theory

This section reviews some notions and results of classical computability theory. Computability theory

studies what functions are computable, what mathematical properties they have, and the mathematical

properties of computing mechanisms. Computable functions are identified with the class of recursive

functions, inductively defined. As a specific example of a computing mechanism, I will present Turing Machines. Using Turing Machines and recursive functions, I will introduce the notion of a universal

computing mechanism and the unsolvability of the halting problem.

1.2.1 Notation

{a1, a2, ... an} Set of n objects a1, ... an

(a1, a2, … an) List (or n-tuple) of n objects a1, … an

a ∈ A a is an element of set A

N Set of natural numbers 0, 1, 2, …

f: A1 → A2 f is a function from A1 to A2

Domain of f Set of all a such that (a, b) ∈ f for some b

Range of f Set of all of f(a) for a in the domain of f

Partial function on A Function whose domain is a subset of A

Total function on A Function whose domain is A

Alphabet Nonempty set Σ of objects called symbols

Word or string on Σ List of symbols on Σ (instead of (a1, a2, … an), we write a1a2…an)

|u| = n, where u = a1a2… an n is the length of u

a1^n Concatenation of n occurrences of the symbol a1

Σ* Set of all words on alphabet Σ

Language on Σ Any subset of Σ*

uv, where u, v ∈ Σ* Concatenation of u and v


Predicate on a set A A total function P: A → N such that for each a ∈ A, either P(a) = 1 or

P(a) = 0, where 1 and 0 represent truth values

R = {a ∈ A|P(a)}, P a predicate on A R is the set of all a ∈ A such that P(a) = 1; P is called the

characteristic function of R

Pr (k) kth prime in order of magnitude

Ψ_M^n(x1, … xn) n-ary function computed by TM program M; when n = 1 we omit n

Computability theory applies to general word functions f: Σ* → Σ’*, where Σ* is the set of all

words on alphabet Σ. Since words can be effectively encoded as natural numbers and vice versa (see

section 1.2.4 below for an example of such an encoding), in this section we follow the standard

convention of developing the theory with respect to number-theoretic functions f: N → N, without loss of

generality.6 Hence in this section, unless otherwise specified, “number” means natural number, and

“function” means function on natural numbers. For the exposition of the material in this section, I drew

mostly from Davis 1958 and Davis et al. 1994.

1.2.2 Recursive Functions

This section introduces the definition of the primitive recursive functions on the basis of three primitive

base functions and two primitive operations. Then, by means of one further primitive operation, the class

of partial recursive functions is defined.

The class of primitive recursive functions is defined inductively as follows.

Base functions:

Null function. n(x) = 0.

Successor function. s(x) = x + 1.

Projection functions. u_i^n(x1, … xn) = xi.

Operations:

6 Computability theory can also be developed directly in terms of string functions (Machtey and Young 1978). This

possibility will be significant in the Chapter on Computation and Content.


Composition. Let f be a function of k variables and let g1, …gk be functions of n variables. Let:

h(x1, … xn) = f(g1(x1, … xn), … gk(x1, … xn)).

Then h is obtained from f and g1, … gk by composition.

Primitive recursion. Let f be a function of n variables and let g be a function of n+2 variables. Let:

h(x1, … xn, 0) = f(x1, … xn)

h(x1, … xn, t+1) = g(t, h(x1, … xn, t), x1, … xn)

Then h is obtained from f and g by primitive recursion.

Definition 1. A function f of n variables is primitive recursive if and only if it can be obtained

from the base functions by finitely many operations of composition and primitive recursion.

Examples of primitive recursive functions include addition, multiplication, exponentiation,

predecessor, and many other useful functions (see Davis et al. 1994, section 3.4).
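To see how the schema works, addition can be obtained by primitive recursion with f = u_1^1 and g = s ∘ u_2^3, and multiplication from addition in turn. The following sketch (illustrative Python, not part of the formal development) mirrors the inductive clauses literally.

```python
def n_(x):            # null function: n(x) = 0
    return 0

def s(x):             # successor function: s(x) = x + 1
    return x + 1

def u(i, *xs):        # projection: u_i^n(x1, ..., xn) = xi
    return xs[i - 1]

def primitive_recursion(f, g):
    """Return h such that h(x..., 0) = f(x...) and
    h(x..., t+1) = g(t, h(x..., t), x...)."""
    def h(*args):
        *xs, t = args
        if t == 0:
            return f(*xs)
        return g(t - 1, h(*xs, t - 1), *xs)
    return h

# add(x, 0) = u_1^1(x) = x;  add(x, t+1) = s(u_2^3(t, add(x, t), x))
add = primitive_recursion(lambda x: x,
                          lambda t, acc, x: s(acc))

# mult(x, 0) = n(x) = 0;  mult(x, t+1) = add(mult(x, t), x)
mult = primitive_recursion(n_,
                           lambda t, acc, x: add(acc, x))
```

The schema is applied only finitely many times, so every function so obtained is total, matching the remark above.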

In the present context, predicates are total functions whose values are 0 or 1 (representing true

and false). Any primitive recursive function whose values are 0 and 1 is called a primitive recursive

predicate. An example of a primitive recursive predicate is equality.

It can be easily shown, by induction on the definition of primitive recursive function, that every

primitive recursive function is total.

Next, we introduce a further operation:

Minimalization (unbounded). Let P be a predicate of n+1 variables. We write min_y P(x1, … xn, y) for the least value of y for which the predicate P is true, if there is one. If there is no such value of y, then min_y P(x1, … xn, y) is undefined.

Unbounded minimalization of a predicate can easily produce a function that is not total. An

example is provided by subtraction:

x – y = min_z(y + z = x),

which is undefined for x < y.
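The operation can be rendered directly in code. In the following sketch (illustrative Python, not part of the formal development), min_y searches y = 0, 1, 2, … for the least witness and, mirroring the definition, never returns when no witness exists; subtraction defined by minimalization accordingly diverges for x < y.

```python
def min_y(predicate, *xs):
    """Unbounded minimalization: the least y such that predicate(*xs, y)
    is true.  Diverges (never returns) when no such y exists."""
    y = 0
    while not predicate(*xs, y):
        y += 1
    return y

def subtract(x, y):
    # x - y = min_z (y + z == x); undefined (the search diverges) for x < y
    return min_y(lambda x, y, z: y + z == x, x, y)
```

Calling subtract(3, 7), for instance, would loop forever, which is how a program exhibits the partiality of the function it computes.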

Definition 2. A function f is partial recursive if and only if it can be obtained from the base

functions by finitely many operations of composition, primitive recursion, and minimalization.


A partial recursive function that is total is called total recursive.

1.2.3 Turing Machines

Turing Machines (TMs) are perhaps the best-known computing mechanisms. They have two main

components. First, there is a two-way potentially infinite tape divided into squares; each square contains at most one symbol (a square without a symbol is blank). Second, there is an active device that can be in one of a

finite number of states. The active device acts on the tape in one of four ways: it reads the symbol on a

square, writes a symbol on a square, moves one square to the left, or moves one square to the right. TMs’

active devices operate in discrete time. At any instant, the active device reads the symbol on one of the

tape’s squares. Then, the symbol on that square and the device’s current state determine what the active

device does: what state it goes into and whether it moves left, or moves right, or writes a symbol on the

current square (and which symbol it writes). When this happens, we say that an active device responds to

its internal state and symbol on the tape. All TMs have this structure in common.

Although strictly speaking it is the active devices of TMs that perform operations (on the tape,

which is passive), for simplicity we follow the standard convention of ascribing activities to TMs tout

court. TMs are distinguished from one another by the alphabet they operate on, by the number of their

internal states, and more importantly by the particular actions they perform in response to their internal

states and the symbols on the tape. A description of the way a particular TM responds to a particular state

and symbol is here called an instruction. A set of instructions, which uniquely identifies a TM, is called a

TM program.

To avoid confusion, TMs should be kept distinct from the TM programs that describe their

behavior. Unlike digital computers, which compute by executing programs, ordinary TMs do not operate

by responding to the TM programs that describe their behavior. Ordinary TMs simply behave in the way

described by their TM programs; in other words, their behavior satisfies the instructions contained in their

TM program. A TM program identifies a computational process uniquely, and a TM that satisfies the

instructions listed in the program is its canonical implementation (i.e., the implementation given by


Turing). But the computations defined by TM programs can also be carried out by humans or machines

other than TMs.

Moreover, in section 1.2.4 we shall see that TM programs can be encoded using the alphabet that

TMs operate on, and then written on TM tapes. There are special TMs, called universal TMs, which can

respond to any TM program written on their tape so as to mimic the behavior of the TMs described by the

program. Since universal TMs do compute by responding to TM programs written on their tape, we say

that they execute TM programs. Needless to say, the behavior of universal TMs is also described by their

own TM programs, called universal TM programs. Universal TMs execute the programs written on their

tape, but not the universal TM programs that describe their behavior.

In formally defining TM programs, we will use the following ingredients:

Symbols denoting internal states of TMs’ active devices: q1, q2, q3, . . .

Symbols denoting symbols that TMs can print on the tape: S0, S1, S2, . . . The set of Si’s is our alphabet.

Symbols denoting primitive operations: R (move to right), L (move to left).

Expressions: finite sequences of symbols.

Instructions: expressions having one of the following forms:

(1) qi Sj Sk ql,

(2) qi Sj R ql,

(3) qi Sj L ql,

(4) qi Sj qk ql.

Quadruples of the first type mean that in state qi reading symbol Sj, the active device will print Sk and go

into state ql. Quadruples of the second type mean that in state qi reading symbol Sj, the active device will

move one square to the right and go into state ql. Finally, quadruples of the third type mean that in state qi

reading symbol Sj, the active device will move one square to the left and go into state ql.7

We are now ready to define (deterministic) TM programs, their alphabets, and their instantaneous

descriptions or snapshots:

7 Instructions of the fourth type serve to define special TMs called oracle TMs and will not be used here.


(Deterministic) TM program: set of instructions that contains no two instructions whose first two symbols

are the same.

Alphabet of a TM program: all symbols Si in the instructions except S0. For convenience, sometimes we

shall write S0 as B (blank), and S1 as 1.

Snapshot: expression that contains exactly one qi, no symbols for primitive operations, and is such that qi

is not the right-most symbol.

A snapshot describes the symbols on a TM tape, the position of the active device along the tape,

and the state of the active device. In any snapshot, the Si’s represent the symbols on the tape, qi

represents the state of the active device, and the position of qi among the Si’s represents the position of the

device on the tape. For any tape and any TM program at any computation step, there is a snapshot

representing the symbols written on the tape, the state of the device, and its position on the tape. At the

next computation step, we can replace the old snapshot by its successor snapshot, whose difference from

its predecessor indicates all the changes (of the tape, position, and state of the device) that occurred at that

step. A snapshot without successors with respect to a TM program M is called a terminal snapshot with

respect to that program.

Using the notion of snapshot, we can rigorously define computations by TM programs:

Computation by a TM program M: finite sequence of snapshots a1, . . ., an such that, for every i with 1 ≤ i < n, ai+1 is the successor of ai, and an is terminal with respect to M. We call an the resultant of a1 with respect to M.

For example, let M consist of the following instructions:

q1 S0 R q1,

q1 S1 R q1.

The following are computations of M, whose last line is the resultant of the first line with respect to M:

(1) q1S0S0S0
S0q1S0S0
S0S0q1S0
S0S0S0q1

(2) q1S1S1S1
S1q1S1S1
S1S1q1S1
S1S1S1q1

(3) q1S1S0
S1q1S0
S1S0q1.
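Such computations can be checked mechanically. The following minimal interpreter (an illustrative Python sketch, not part of the formal development; it adopts the halting convention implicit in the examples above, namely that the computation ends when the device leaves the written portion of the tape) reproduces the three computations of M.

```python
def run_tm(program, tape, state='q1', pos=0, max_steps=10000):
    """Simulate a deterministic TM program, given as a set of quadruples
    (state, scanned symbol, action, next state), where the action is 'R'
    (move right), 'L' (move left), or a symbol to print.  Returns the
    list of successive snapshots."""
    tape = list(tape)
    table = {(q, s): (act, q2) for q, s, act, q2 in program}
    snapshots = []
    for _ in range(max_steps):
        if pos < 0:
            snapshots.append(state + ''.join(tape))
            return snapshots                      # device left the tape
        snapshots.append(''.join(tape[:pos]) + state + ''.join(tape[pos:]))
        if pos >= len(tape):
            return snapshots                      # device left the tape
        key = (state, tape[pos])
        if key not in table:
            return snapshots                      # terminal snapshot
        act, state = table[key]
        if act == 'R':
            pos += 1
        elif act == 'L':
            pos -= 1
        else:
            tape[pos] = act                       # print a symbol
    raise RuntimeError('step bound exceeded; the program may not halt')

# The program M of the example: move right over both S0's and S1's.
M = [('q1', 'S0', 'R', 'q1'),
     ('q1', 'S1', 'R', 'q1')]
```

Running run_tm(M, ['S1', 'S0']) yields exactly the snapshots of computation (3) above.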

With each number n we associate the string n̄ = 1^(n+1), that is, n+1 occurrences of the symbol 1. Thus, for example, 4̄ = 11111. With each k-tuple (n1, n2, … nk) of integers we associate the tape expression (n1, n2, … nk)‾, where:

(n1, n2, … nk)‾ = n̄1 B n̄2 B … B n̄k.

Thus, for example, (1, 3, 2)‾ = 11B1111B111.
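This encoding is trivial to mechanize; the sketch below (illustrative Python, not part of the formal development) produces exactly the tape expressions just described.

```python
def encode_number(n):
    """The string associated with n: n+1 occurrences of the symbol '1'."""
    return '1' * (n + 1)

def encode_tuple(*ns):
    """Tape expression for a k-tuple: the encoded numbers
    separated by single blanks ('B')."""
    return 'B'.join(encode_number(n) for n in ns)
```

For instance, encode_tuple(1, 3, 2) returns the tape expression 11B1111B111 from the text.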

Given an initial snapshot and a program, either there is a computation or there isn’t (if there isn’t, it is because the sequence of successive snapshots is infinite, never reaching a terminal snapshot).

Definition 3. An n-ary function f(x1, … xn) is Turing-computable if and only if there is a Turing Machine M such that: f(x1, … xn) is defined if and only if there is a computation of M whose first snapshot is q1(x1, … xn)‾ and whose resultant contains nj + 1 occurrences of the symbol 1, where f(x1, … xn) = nj. We write:

f(x1, … xn) = Ψ_M^n(x1, … xn).


Turing-computability and partial recursiveness are equivalent notions in the following sense. A

function is partial recursive if and only if it is Turing-computable, and it is total recursive if and only if it

is a total Turing-computable function. In one direction, this is shown by constructing TM programs computing

each of the base functions and by showing that TM programs can be manipulated in ways corresponding

to the three operations (for the details of the construction, see Davis 1958). The other direction is

addressed in the following section.

1.2.4 Gödel Numbers of TM Programs

One way to develop the theory of TM programs is by using recursive functions. I use a method,

developed by Gödel (1931), that allows us to use natural numbers as a code for TM instructions, and

therefore for TM programs. By studying the properties of TM programs in this way, we will demonstrate

the results that we are interested in, namely the existence of universal TMs and the unsolvability of the

halting problem. The method followed here has the great advantage of avoiding long and laborious

mathematical constructions.

The basic symbols used in formulating TM programs are the following:

R, L

S0, S1, S2, …

q1, q2, q3, …

We associate each of these symbols with an odd number ≥ 3, as follows:

3 R

5 L

7 S0

9 q1

11 S1

13 q2

etc.


Hence, for any expression M there is now a finite sequence of odd integers a1, a2, … an associated to M.

Now we’ll associate a single number with each such sequence and hence with each expression.

Definition 4. Let M be an expression consisting of the symbols a1, a2, … an. Let b1, b2, … bn be the corresponding integers associated with these symbols. Then the Gödel number of M is the following integer:

r = ∏_{k=1}^{n} Pr(k)^{b_k}

We write gn(M) = r, and M = Exp(r). If M is the empty expression, we let gn(M) = 1.

Definition 5. Let M1, M2, … Mn be a finite sequence of expressions. Then, the Gödel number of this sequence of expressions is the following integer:

r = ∏_{k=1}^{n} Pr(k)^{gn(M_k)}

It is easy to prove that any expression and any sequence of expressions have a unique Gödel

number. Since TM programs are sets of instructions, not lists of them, any TM program consisting of n

instructions has n! Gödel numbers.
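The construction is easily mechanized. The sketch below (illustrative Python; the helper names are mine) computes Gödel numbers according to Definitions 4 and 5; uniqueness of decoding follows from the uniqueness of prime factorization. Gödel numbers of realistic programs are astronomically large, which is mathematically harmless, so the examples use tiny expressions.

```python
def nth_prime(k):
    """Pr(k): the kth prime in order of magnitude, so Pr(1) = 2."""
    count, n = 0, 1
    while count < k:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

# Odd numbers >= 3 assigned to the basic symbols, as in the text;
# the assignment continues in the same pattern (S2 -> 15, q3 -> 17, ...).
CODE = {'R': 3, 'L': 5, 'S0': 7, 'q1': 9, 'S1': 11, 'q2': 13}

def gn(expression):
    """Goedel number of an expression (a list of basic symbols):
    the product over k of Pr(k) raised to the integer associated with
    the kth symbol; 1 for the empty expression."""
    r = 1
    for k, symbol in enumerate(expression, start=1):
        r *= nth_prime(k) ** CODE[symbol]
    return r

def gn_sequence(expressions):
    """Goedel number of a finite sequence of expressions:
    the product over k of Pr(k) raised to gn of the kth expression."""
    r = 1
    for k, m in enumerate(expressions, start=1):
        r *= nth_prime(k) ** gn(m)
    return r
```

For instance, the one-symbol expression R receives the Gödel number Pr(1)^3 = 8, and the sequence consisting of the expressions R and L receives 2^8 · 3^32.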

Definition 6. For each n > 0, let T_n(z, x1, … xn, y) be the predicate that means, for given z, x1, … xn, y, that z is a Gödel number of a TM program Z, and that y is the Gödel number of a computation, with respect to Z, beginning with snapshot q1(x1, … xn)‾.

These predicates express the essential elements of the theory of TM programs.

Davis 1958 contains the detailed construction proving that for each n > 0, T_n(z, x1, … xn, y) is

primitive recursive, that every Turing-computable function is partial recursive, and that every total

Turing-computable function is total recursive.


1.2.5 Universal TM programs

We are now ready to demonstrate that there are universal TMs, which compute any function computable by a TM. Consider the partial recursive binary function f(z, x) = u_1^1(min_y T(z, x, y)). Since this function is Turing-computable, there is a TM program U such that:

Ψ_U^2(z, x) = f(z, x)

This program is called a universal TM program. It can be employed to compute any partially computable (singulary, but generalizable to n-ary) function as follows: if Z0 is any TM program and z0 is a Gödel number of Z0, then:

Ψ_U^2(z0, x) = Ψ_Z0(x)

Thus, if the number z0 is written on the tape of U, followed by the number x0, U will compute the number Ψ_Z0(x0).

1.2.6 Unsolvability of the Halting Problem

We now discuss the function HALT(x, y), defined as follows. For a given y, let P be the TM program such that gn(P) = y. Then HALT(x, y) = 1 if Ψ_P(x) is defined and HALT(x, y) = 0 otherwise. In other words, HALT(x, y) = 1 if and only if the TM program with Gödel number y eventually halts on input x; otherwise it equals 0. We now prove the unsolvability of the halting problem.

Theorem. HALT(x, y) is not a recursive function.

Proof. Define the total function g(x) = HALT(x, x), and the partial function h(x) = 0 if g(x) = 0, h(x) undefined if g(x) = 1. If h is partial recursive, then there is a TM program P’ with Gödel number i such that for all x, h(x) = Ψ_P’(x). But then:

h(i) = Ψ_P’(i) = 0 if and only if g(i) = 0 if and only if Ψ_P’(i) is undefined,

which is a contradiction. Therefore, h cannot be partial recursive, so that g, and hence HALT, cannot be recursive. QED
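The diagonal argument has a familiar programming rendering. In the sketch below (illustrative Python), halts is a hypothetical total decider for the halting function, introduced only to be refuted: were it computable, the diagonal function h would be too, and running h on its own source would halt if and only if it does not halt.

```python
def halts(program_source, x):
    """HYPOTHETICAL total decider: True iff the program with source
    program_source halts on input x.  The theorem shows that no such
    computable function exists, so no correct body can be supplied;
    any actual definition would be wrong on some input."""
    raise NotImplementedError

def h(program_source):
    """The diagonal function of the proof: h(p) = 0 if p does not halt
    on itself; undefined (here, an infinite loop) if it does."""
    if halts(program_source, program_source):
        while True:          # diverge, mirroring 'h(x) undefined'
            pass
    return 0

# Were halts computable, h would be too; but h applied to its own
# source halts (with value 0) if and only if it does not halt:
# exactly the contradiction derived in the proof above.
```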


This theorem gives us an example of a function that is not computable by a TM program.

Computability theory shows that there are infinitely many such functions. Assuming the truth of the

Church-Turing thesis, which will be discussed in the next section, we conclude that there is no algorithm

computing the halting function. The same holds for any other non-Turing-computable total function.

1.3 The Church-Turing Thesis

Turing (1936-7) introduced his machines as a way to make precise the informal notion of algorithmic computability, or effective calculability, which I introduced in the first section. Church (1936) proposed

a similar thesis using recursive functions, which, as we’ve seen, are computationally equivalent to TMs.

Stephen Kleene (1952) dubbed this, after its proponents, the Church-Turing thesis:

(CT) A function is effectively calculable if and only if it is Turing-computable.

CT is generally accepted among mathematicians and computer scientists on what they consider

overwhelming evidence in its favor.

In summary, this chapter introduced the informal notion of effective calculability and its formal

counterpart—Turing computability. According to the canonical view, CT connects the informal notion of

effective calculability, or computability by effective procedure, with the formal one of Turing-

computability. There can be no rigorous proof of CT, but there is overwhelming evidence in its favor. In

Chapter 7, I will list the evidence for CT, discuss the most important alternatives to this canonical view,

and conclude with a cautious endorsement of a clarified version of the canonical view. I will then

discuss the relevance of CT for computationalism.

Before we get to the detailed discussion of CT, we shall take a close look at how, in the 1940s, the

notions described in this chapter were used to formulate a novel theory of the brain, computationalism,

according to which the brain is a mechanism that performs computations.


2 WARREN MCCULLOCH ON LOGIC, MIND, AND BRAIN, CA. 1920-1936

2.1 Introduction

In the 1940s, Turing’s notions of computability and of a universal computing mechanism—which were

reviewed in Chapter 1—played an important role in the history of computationalism, namely the view that

the brain is a computing mechanism. The central figure in the development of computationalism was

neurophysiologist and psychiatrist Warren McCulloch. Other key figures, besides Turing and

McCulloch, were mathematicians Norbert Wiener and John von Neumann. This chapter begins to tell

how McCulloch and others developed computationalism.

Despite the centrality of computationalism to many disciplines, its early historical development,

which took place during the early 1940s, has received little attention and remains poorly understood.

Most of the existing work focuses on symbolic artificial intelligence (AI) and the cognitivist movement

since the late 1950s (McCorduck 1979, Gardner 1985, Crevier 1993). This research pays little attention

to the origin of computationalism in the 1940s and early 1950s. One partial exception is the work of

Dupuy (2000), who urges historians to look at the cybernetic movement as the source of

computationalism. But Dupuy’s work is more a philosophical argument in favor of a brand of cognitive

science based on the study of “complexity” than a scholarly study of the history of computationalism (cf.

Piccinini 2002).

Outside of the history of cognitive psychology and symbolic AI, there is some work on the

relation between computationalism and logic (Webb 1980), but nothing has been written on

computationalism in neuroscience, even though computationalism was originally proposed and discussed

by a neurophysiologist as a theory of the brain. Another limitation of the existing literature is that it often

addresses the origin of computationalism by listing the ideas of intellectual heroes like Hobbes, Leibniz,

and Babbage, usually culminating in Turing’s contributions (the best example of this kind is Davis 2000).

It’s time to move beyond this genius-centered history and study how the dreams of those thinkers


eventually turned into computationalism as a conceptual framework for several research programs, which

involved the creation of new scientific communities and institutions, and even new disciplines.

Aside from published sources, I have relied on collections of unpublished material pertaining to

Alan Turing, Warren McCulloch, John von Neumann and Norbert Wiener.1 My research builds on recent

scholarship in the history of computability theory (Sieg 1994, 1997; Hallett 1994; Shapiro 1995),

Turing’s early work on mechanical intelligence (Copeland 2000, Copeland and Proudfoot 1996), the

history of biophysical and connectionist modeling (Abraham 2001a, 2001b, 2002; Arbib 2000; Frank

1994; Smalheiser 2000), and the discovery of neural mechanisms (Craver and Darden 2001, Craver

2001). Another resource is the recently published interviews with some early connectionist modelers

(Anderson and Rosenfeld 1998). My research complements this recent work by showing how early

attempts at mathematical modeling of the brain, the development of new mathematical and modeling

tools, and new hypotheses about memory mechanisms came together to form a novel conceptual and

methodological framework for studying the functional organization of the brain.

In this historical part of the dissertation, I will cover the period going roughly from the mid-

1930s, when a number of scientists interested in computation and the brain began to meet one another, to

1946, when these scientists formed a small scientific community, whose members were sharing their

work and organizing conferences.

2.2 Background

McCulloch was an eclectic neurophysiologist and psychiatrist whose main goal was to explain the mind

in terms of neural mechanisms. This was not a new project: for instance, an illustrious antecedent was

1 The Alan Mathison Turing Papers are at Archives of King’s College, Cambridge, UK. The Warren S. McCulloch

Papers are at the Library of the American Philosophical Society, Philadelphia, PA. The Papers of John von

Neumann are at the Library of Congress, Washington, DC. The Norbert Wiener Papers are at the Institute Archives

and Special Collections, MIT Libraries, Cambridge, Massachusetts, under MC 22. Whenever possible, I will

indicate the box and file folder (abbreviated ff.) of the unpublished documents I refer to.


Sigmund Freud’s unpublished Project for a Scientific Psychology (Freud 1895).2 Freud thought that

neuronal activity must embody ideas. Since the nineteenth century, neurophysiologists had known that nerve

fibers carry electrical pulses, and that these pulses can have either excitatory or inhibitory actions on other

nerve fibers. By the end of the nineteenth century, neurophysiologists reached a kernel of consensus that

the nervous system is made out of individual cells called neurons, which are connected together in vast

networks. The pulse trains on nerve fibers were largely interpreted as carrying meaningful messages, but

there was no detailed account of how these messages were processed by the brain. Freud’s theory was

that energy was discharged from neuron to neuron, and that a neuron’s energy level corresponded to the

activation of an idea in the mind.

McCulloch’s views about mind and brain originated during the 1920s and reached maturity at the

beginning of the 1940s, half a century after Freud’s. Unlike Freud, McCulloch did not primarily rely on

the energy levels and energy flow between neurons. Instead, McCulloch thought that the relevant entity

transmitted from neurons to neurons was “information.” McCulloch’s notion of “information” derived

from the power of formal logic to derive conclusions from premises. According to McCulloch, an

important aspect of mind was “formal,” and was constituted by inferences that can be modeled by a

logical calculus. Since logical calculi could be implemented in computing mechanisms, McCulloch

thought the brain must be a computing mechanism embodying a logical calculus that constituted the

formal aspect of the mind. McCulloch didn’t stop at his formulation of a general theory of the brain: he

spent a good portion of his time after 1943 devising testable hypotheses about specific neural mechanisms

and how they might explain some mental function. So, McCulloch was one of the originators not only of

computationalism as a doctrine, but also of the methodology of building models of neural or cognitive

mechanisms, based on the computationalist doctrine, to explain various aspects of the mind.

Understanding McCulloch's project is crucial to understanding both the cybernetic movement and the

subsequent history of AI and cognitive science.

2 For an account of Freud’s Project and its significance for Freud’s psychoanalysis, see Fancher 1973.


The difference between McCulloch’s and Freud’s projects was made possible by two main

historical developments: the rise of mathematical logic and the establishment of the all-or-none law of

neural activity.

In the years 1910-1913, Alfred North Whitehead and Bertrand Russell published their Principia

Mathematica. It contained a powerful formal logical system, whose purpose was to prove all

mathematical theorems on logical grounds. Whether Whitehead and Russell succeeded in deriving

mathematics from logic alone remained controversial, but the unquestionable deductive power of their

formal system popularized mathematical logic in both philosophy and mathematics. Mathematicians

developed and applied Whitehead and Russell’s techniques to study the foundations of mathematics,

whereas philosophers applied those techniques to problems in epistemology, philosophy of language, and

other areas.3

In subsequent work, Russell himself developed a view called logical atomism, according to which

ordinary physical objects should be understood as constructions made out of logical atoms (such as “red

here now”) by means of logical techniques. Also, according to Russell, our knowledge of ordinary

physical objects could be reduced by logical means to our knowledge of sense data, analogously to how

mathematical theorems could be derived from the logical system of the Principia (Russell 1914).4 The

most detailed and rigorous attempt to carry out this epistemological project was made by Rudolf Carnap

(Carnap 1928).5 We shall see that both the Principia and Russell’s epistemological project motivated

McCulloch’s project for reducing the mind to the brain.6

3 For more details on the Principia, see Irvine 2002.

4 For more details on Russell’s philosophy, see Irvine 2001.

5 Clark Glymour has gone as far as ascribing to Carnap the first computational theory of mind:

The first explicitly computational theory of cognitive capacities is Rudolf Carnap’s Der Logische Aufbau

der Welt. Carnap’s book offered an account of how concepts of color, sound, place, and object could be

formed from elements consisting of gestalt experiences and a relation (“recollection of similarity”) between

such experiences. The theory was given as a logical construction, but also as what Carnap called a “fictive

procedure”. The procedural characterization is in fact a series of algorithms that take as input a finite list of

pairs of objects (the “elementary experiences”) such that there is a recollection of similarity between the

first and the second member of each pair. The book was of course written before there were computers or

programming languages, but it would nowadays be an undergraduate effort to put the whole thing into

LISP code (Glymour 1990, p. 67).


In the first quarter of the 20th century, much work in neurophysiology focused on the nature of the

impulses traveling through nerve fibers. In 1926, Edgar Adrian started publishing his groundbreaking

recordings from nerve fibers (Adrian 1926, Adrian and Zotterman 1926). Adrian’s work was reported in

newspapers in London and New York, and Adrian continued to extend his results and publish them

through the late 1920s. In 1930, A.V. Hill proposed Adrian for the Nobel Prize in physiology or

medicine, which was awarded to Adrian (shared with Charles Sherrington) in 1932. Adrian’s work was

being publicly recognized as the definitive experimental demonstration of the all-or-none law, namely

“that the intensities of sensation and response depend simply upon the number of nerve impulses which

travel to or from the nervous system, per unit of time.”7 We shall see that around 1929, the all-or-none

law would have an important impact on McCulloch’s thinking about brain and mind.

2.3 Logic, Epistemology, and the Brain

McCulloch’s interest in logic and epistemology goes as far back as his college years. In 1917, he was a

freshman at Haverford College, where he studied mathematics and medieval philosophy. He was fond of

recalling an exchange he had at Haverford with a philosophy teacher, Rufus Jones. McCulloch told his

teacher that he wanted to know, “What is a number, that a man may know it; and a man, that he may

know a number?” Jones replied: “Friend, thee will be busy as long as thee lives.” McCulloch described

his work of a lifetime as pursuing and accomplishing that project, and more generally of answering “the

general questions of how we can know anything at all.”8

See also Glymour, Ford, and Hayes 1995. Glymour’s point here may be justified conceptually, but there is little

evidence that Carnap’s Aufbau played a very important role in the history of computationalism.

6 Lettvin, a long-time friend of McCulloch and Pitts, wrote:

Strongly in the minds of both McCulloch and Pitts were the notions of Russell as contained in his essays on

mind, the notions of Peirce, and to a great extent, the notions of Whitehead, in particular as regards the

structure of mind and experience (Lettvin 1989a, p. 12).

7 Letter by A.V. Hill to the Nobel Committee, cited by Frank 1994, p. 209. For a detailed history of the

experimental demonstration of the all-or-none law, see Frank 1994.

8 Biographical Sketch of Warren S. McCulloch, ca. 1942. Warren S. McCulloch Papers, ff. Curriculum Vitae.

McCulloch 1961; McCulloch 1974, pp. 21-22. The episode involving his philosophy teacher is also cited by many

friends of McCulloch.


After serving in the Navy during World War I, in 1920 McCulloch transferred to Yale, where he

majored in philosophy and minored in psychology in 1921. Among other things, he recalled reading

Immanuel Kant’s Critique of Pure Reason, whose notion of synthetic a priori knowledge would be a

major influence on his thinking about the brain. He also studied Aristotle, the British Empiricists, Charles

Sanders Peirce, Georg Wilhelm Friedrich Hegel, and Whitehead and Russell’s Principia Mathematica.

He thought that by defining numbers as classes of classes and formally deriving mathematics from logic,

Whitehead and Russell had answered satisfactorily his first question, “What is a number, that a man may

know it?” He would devote his efforts to the other question.9

That other question was, what is a man, that he may know a number? McCulloch’s first step

towards an answer was a sort of mental atomism, according to which there must be atomic mental events

that carry truth values. More complex mental events can be logically constructed out of atomic ones. It is

not known exactly when or how McCulloch developed his mental atomism, but McCulloch’s

retrospective on his philosophical education suggests that even at this early stage of his life, he freely

interpreted previous philosophical work along mental atomistic lines:

I turned to Russell and Whitehead—(Principia Mathematica)—who in the calculus of atomic

propositions were seeking the least event that could be true and [sic] false. Leibnitz’s problem of

the petit perception had become the psychophysical problem of the just noticeable difference,

called JND, which has since been found in the middle range of perception to be roughly

proportional to the size of the stimulus.10

JND is the smallest difference between two stimuli that can be discriminated by a subject. McCulloch

continued his retrospective by arguing that JND entailed that there were atomic mental signals:

In searching for these unit signals of perception I came on the work of René Descartes… For him

the least true or false event would have been his postulated hydraulic pulse in a single tube, now

called an axon.11

These comments are revealing in several ways. They show McCulloch’s keen interest in philosophy as

well as his attempt to recruit past philosophical works to serve his own projects. They also exhibit

9 McCulloch 1974, p. 22. Lettvin 1989a, p. 11.

10 McCulloch 1974, pp. 24-25.

11 McCulloch 1974, p. 26.


McCulloch’s elliptical writing style and some bits of his personal jargon. He did not explain what he took

a “least true or false event” to be.

McCulloch said that in 1920, he tried to construct what he called “a logic for transitive verbs of

action and contra-transitive verbs of perception,” on which he continued to work until February 1923.12

By then, he was completing an M.A. in psychology at Columbia University, which he had started after his

B.A. At Columbia he became very interested in physiological psychology and studied mathematics,

physics, chemistry, and neuroanatomy.13 By the time he took his M.A. from Columbia, “he had become

convinced that to know how we think and know, he must understand the mechanism of the organ whereby

we think and know, namely the brain.”14 After that, he enrolled in medical school at Columbia, with the

goal of learning enough physiology to understand how brains work.15 Both as a student of medicine and

later as an intern, he mostly focused on neurology and psychiatry, with the hope of developing a theory of

neural function.

While pursuing his medical studies, and after he abandoned his project of a logic of verbs of

action and perception, McCulloch allegedly developed a psychological theory of mental atoms. He

postulated atomic mental events, which he called “psychons,” in analogy with atoms and genes:

My object, as a psychologist, was to invent a kind of least psychic event, or “psychon,” that

would have the following properties: First, it was to be so simple an event that it either happened

or else it did not happen. Second, it was to happen only if its bound cause had happened … that

is, it was to imply its temporal antecedent. Third, it was to propose this to subsequent psychons.

Fourth, these were to be compounded to produce the equivalents of more complicated

propositions concerning their antecedents.16

McCulloch said he tried to develop a propositional calculus of psychons. Unfortunately, the only known

records of this work are a few passages in later autobiographical essays by McCulloch himself.17 In the

absence of primary sources, it’s difficult to understand the exact nature of McCulloch’s early project. A

12 He did not say why he dropped the project, though he said he encountered “many pitfalls” (McCulloch 1974, pp.

28-29). Unfortunately, I found no record of this work.

13 McCulloch 1974, p. 30.

14 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

15 McCulloch 1974, p. 30.

16 McCulloch 1961, p. 8.

17 On McCulloch’s early psychological theory, see McCulloch 1961, pp. 8-9; McCulloch 1965, pp. 392-393;

Abraham 2002, p. 7.


key point is that a psychon is “equivalent” to a proposition about its temporal antecedent. In more

modern terminology, McCulloch seemed to think that a psychon had a propositional content, which

contained information about that psychon’s cause. A second key point is that a psychon “proposes”

something to a subsequent psychon. This seems to mean that the content of psychons could be

transmitted from psychon to psychon, generating “the equivalents of” more complex propositions. These

themes would play an important role in McCulloch’s mature theory of the brain.

McCulloch did his internship in organic neurology under Foster Kennedy at Bellevue Hospital in

New York, where he finished in 1928.18 While working as an intern, he “was forever studying anything

that might lead me to a theory of nervous function.”19 He developed a long-term interest in closed loops

of activity in the nervous system, namely activity flowing through neurons arranged in closed circuits.

Since neural activity flowing in circles along closed circuits could feed itself back onto the circuit, thereby

sustaining itself indefinitely, McCulloch called this process “reverberation.” At that time, there was no

evidence of closed anatomical loops within the central nervous system, although McCulloch attributed to

Ramón y Cajal the hypothesis that they existed.

The tremors of Parkinson’s disease, McCulloch thought, could be explained by closed loops of

activity connecting the spinal cord and the contracting muscles. With his fellow intern Samuel Wortis,

McCulloch discussed whether the loops that would explain Parkinson’s were a local “vicious circle”—

namely a closed loop involving only the spine and the muscles but not the brain—or the effect of a closed

loop of activity in the central nervous system, which sent a cyclical signal to the region of the body

affected by the tremor. McCulloch and Wortis wondered whether other diseases, such as epilepsy, could

be explained by closed loops of neural activity. They did not consider that closed loops of activity could

be a normal feature of the nervous system, in part because their discussions were taking place before

Lawrence Kubie published the first theoretical paper postulating closed loops in the central nervous

18 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

19 McCulloch 1974, p. 30.


system to explain memory (Kubie 1930).20 Later in his life, McCulloch would hypothesize closed loops

as explanations for many normal neural functions.

By the end of his internship at Bellevue in 1928, McCulloch “had become convinced that to

understand the workings of the nervous system he needed more physics and chemistry.”21 During the

following couple of years, McCulloch studied those subjects, as well as more mathematics, at New York

University. During the same period, he taught physiological psychology at Columbia Extension in

Brooklyn and he did research with Frank Pike in the Laboratory of Neurosurgery at Columbia and with

Wortis in the Laboratory of Experimental Neurology at Bellevue.22

In 1929, McCulloch met mathematician R. V. Hartley of Bell Labs. Hartley had defined, for

engineering purposes, a quantifiable notion of information divorced from meaning. In Hartley’s words,

his was “a definite quantitative measure of information based on physical considerations alone.”23

Hartley studied how much information can be transmitted using given codes as well as the effect of noise

on the transmission of information. As McCulloch knew, Hartley’s work was an important part of the

background against which Shannon later formulated his mathematical theory of communication (Shannon

1948; Shannon and Weaver 1949).24 This shows that early on, McCulloch was interested in engineering

problems of communication and acquainted with attempts to formulate a mathematical theory of

information.

In 1931, an otherwise unspecified “friend” of McCulloch’s translated for him and others a recently

published paper by Kurt Gödel “on the arithmetization of logic.”25 This shows that McCulloch was

keeping up with important results in the foundations of mathematics.

20 For an account of these events, see McCulloch 1974, pp. 30-31.

21 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

22 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

23 Hartley 1929, p. 538.

24 McCulloch 1974, p. 32. For more on Hartley’s work and its relation to Shannon’s, see Aspray 1985, pp. 120-122.

25 McCulloch 1974, p. 32. The translated paper was probably Gödel’s famous paper on incompleteness (Gödel

1931), as indicated by a deleted phrase in a draft of McCulloch’s 1974 paper (Warren S. McCulloch Papers, Series

V Miscellaneous, Box 3).


McCulloch’s views about a calculus of psychons underwent an important transformation in 1929.

It occurred to him that the all-or-none electric impulses transmitted by each neuron to its neighbors might

correspond to the mental atoms of his psychological theory, where the relations of excitation and

inhibition between neurons would perform logical operations upon electrical signals corresponding to

inferences of his propositional calculus of psychons. His psychological theory of mental atoms turned

into a theory of “information flowing through ranks of neurons.”26

This was McCulloch’s first attempt “to apply Boolean algebra to the behavior of nervous nets.”27

The brain would embody a logical calculus like that of Whitehead and Russell, which would account for

how humans could perceive objects on the basis of sensory signals and how humans could do

mathematics and abstract thinking. This was the beginning of McCulloch’s search for the “logic of the

nervous system,” on which he kept working until his death. A major difficulty to the formulation of his

logical calculus was the treatment of closed loops of neural activity. McCulloch was trying to describe

the causal structure of neural events by assigning temporal indexes to them. But he thought a closed loop

meant that one event was its own ancestor, which did not make sense to him. He wanted “to close the

loop” between chains of neuronal events, but did not know how to conceive of the events in the closed

loops. He would not find a solution to this difficulty until he met Walter Pitts in the early 1940s.28
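The calculus McCulloch was searching for would not be written down until his 1943 work with Pitts, but its core idea, that all-or-none units wired together by excitatory and inhibitory connections can realize Boolean operations, can be sketched in a few lines of modern code. This reconstruction is purely illustrative; the unit design, names, and thresholds are mine, not McCulloch’s:

```python
def mcp_unit(threshold, excitatory=(), inhibitory=()):
    """A McCulloch-Pitts-style unit with all-or-none output (0 or 1).
    Any active inhibitory input vetoes firing; otherwise the unit
    fires iff the count of active excitatory inputs meets threshold."""
    def fire(inputs):
        if any(inputs[i] for i in inhibitory):
            return 0
        return int(sum(inputs[i] for i in excitatory) >= threshold)
    return fire

# Basic logical operations, each realized by a single unit.
AND = mcp_unit(threshold=2, excitatory=(0, 1))
OR  = mcp_unit(threshold=1, excitatory=(0, 1))
NOT = mcp_unit(threshold=0, inhibitory=(0,))
```

Composing such units into networks yields more complex propositional functions; the closed loops that troubled McCulloch correspond to feeding a unit’s output back into the network as a later input.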

In 1932-3, McCulloch was at Rockland State Hospital, still in New York City, to earn money as a

psychiatrist. There, he met the German psychiatrist Eilhard von Domarus, who was earning his

philosophy Ph.D. at Yale under Filmer Northrop, with a dissertation On the Philosophic Foundation of

Psychology and Psychiatry. Von Domarus’s dissertation interpreted psychoses, such as schizophrenia, as

26 McCulloch 1974, p. 32.

27 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae. The

same Biographical Sketch also says that this was the time when McCulloch “attempted to make sense of the logic of

transitive ver[b]s,” which conflicts with what he wrote in his later autobiographical essays. Given the lack of

primary sources and given McCulloch’s inconsistencies in his writings, it is hard to date his early work with

certainty. But in spite of some inconsistencies with dates, in all his relevant writings McCulloch emphasized his

early interest in logic and his attempts to apply logic to psychology and later to a theory of the brain. It is thus hard

to believe Lettvin when he wrote that until McCulloch worked with Pitts in the early 1940s, McCulloch had not

applied “Boolean logic” to the working of the brain (Lettvin 1989a, p. 12). Lettvin gave no evidence for this claim.

Since Lettvin met McCulloch only around 1940, Lettvin may never have discovered McCulloch’s early efforts in

this direction.

28 For an account of these events, see McCulloch 1961; McCulloch 1974, pp. 30-32; Arbib 2000, p. 213.


logical disturbances of thought. But von Domarus could not write well in English, so McCulloch helped

him write his dissertation. Years later, in commenting on von Domarus’s dissertation, McCulloch found

that “no other text so clearly sets forth the notions needed for an understanding of psychology, psychiatry

and finite automata.”29

2.4 Strychnine Neuronography and the Functional Organization of the Brain

Until 1934, McCulloch was an ambitious psychiatrist with original ideas but little scientific track record.

He had only four publications in scientific journals. In 1934, he moved to Yale to work in Joannes

Dusser de Barenne’s Laboratory of Neurophysiology. Dusser de Barenne was a distinguished Dutch

neurophysiologist who had moved from Holland to Yale in 1930.30 McCulloch worked at Yale until

shortly after Dusser de Barenne’s death in 1940. McCulloch’s work during those years launched his

academic career.31

With Dusser de Barenne, McCulloch worked mostly on mapping the connections between brain

areas. To discover those connections, Dusser de Barenne had developed the method of strychnine

neuronography. When strychnine was applied to one brain area, it caused neurons to fire. The pulses

from those neurons would activate whichever areas were connected to the first area. By applying

strychnine to a cortical area and recording the activity of other brain areas, it was thus possible to map the

projections of any area of the cortex. Dusser de Barenne and McCulloch mapped cortico-cortical

connections as well as connections between cortical areas and other areas of the monkey brain.

McCulloch continued working with strychnine neuronography after leaving Yale for Chicago, where he

worked with Percival Bailey, Gerhard von Bonin, and others. He published over forty papers on the

subject between 1934 and 1944. In 1944, McCulloch published two review articles on cortical

connections, one in the journal Physiology (McCulloch 1944a), the other in a reference volume on The

29 Thayer 1967, p. 350, cited by Heims 1991, p. 133; see also McCulloch 1974, pp. 32-3.

30 McCulloch 1940, p. 271.

31 McCulloch’s many publications on neurophysiology are reprinted in his Collected Works (McCulloch 1989).


Precentral Motor Cortex (McCulloch 1944b). His work using strychnine neuronography established him

as a leading expert on what he called “the functional organization of the brain.”32

McCulloch’s work in Dusser de Barenne’s lab explicitly connected him to an intellectual lineage

in neurophysiology that goes from Hermann von Helmholtz to Rudolf Magnus to Dusser de Barenne.

These authors were concerned with the physiological foundations of perception and knowledge, including

the idea that Kant’s synthetic a priori knowledge is grounded in the anatomy and physiology of the brain.

This idea was well expressed in Magnus’s lecture “The Physiological A Priori” (Magnus 1930), which

McCulloch knew and cited. Dusser de Barenne consciously inherited the quest for the physiological a

priori from his mentor Magnus and transmitted it to his pupil McCulloch.33 McCulloch saw himself as

continuing the tradition from Kant to Dusser de Barenne, and would refer to his theory of the brain as

solving the problem of the physiological a priori. Partly because of this, he called his intellectual

enterprise experimental epistemology: in the 1950s, a sign reading “experimental epistemology” hung

from his MIT lab’s door.34

McCulloch said that the work with Dusser de Barenne was important to him, because it made him

deal with brains and their activity: “For me it proved that brains do not secrete thought as the liver

secretes bile, but that they compute thoughts the way computing machines calculate numbers.”35

Unfortunately, he did not explain in what sense or by what means this neurophysiological work “proved”

that brains compute thoughts.

Some clue as to the relationship between McCulloch’s work with Dusser de Barenne and

McCulloch’s view that the brain performs computations was offered by Jerome Lettvin, one of

McCulloch’s life-long collaborators and friends, who gave an explanation he is likely to have heard

32 For more on Dusser de Barenne and McCulloch’s neurophysiological work, see Gershwind 1989; Abraham 2002,

pp. 8-11.

33 According to McCulloch, Dusser de Barenne had worked “intimately” with Magnus (McCulloch 1940, p. 270;

McCulloch 1974, p. 22).

34 Interview with Lettvin, in Anderson and Rosenfeld 1998. McCulloch put it as follows:

The main theme of the work of my group in neurophysiology in the Research Laboratory of Electronics at

the Massachusetts Institute of Technology has been in this tradition, namely, experimental epistemology,

attempting to understand the physiological foundation of perception (McCulloch 1974, pp. 22-23).

35 McCulloch 1974, p. 33.


from McCulloch. According to Lettvin, two aspects of McCulloch’s neurophysiological work were

especially relevant. First, there was the observation of nerve specificity: stimulating specific nerves or

specific brain areas would lead to excitation (or inhibition) of very specific other neural areas, or give rise

to specific movements, or specific sensations. This suggested that there were pre-existing paths between

specific portions of the nervous system, carrying specific pulses from certain areas to others, and that

those pulses gave rise to “all the kinds of perception, thinking, and memory that we enjoy.”36 Second,

synaptic action—i.e. the action occurring between neurons—was irreversible, occurring only in one

preferred direction and not in the reverse direction. Pulses could travel through the pre-existing paths

only in one direction. According to Lettvin:

It would be impossible to devise a logical system in which the connections were reversible; that

is, active informationally in both directions. So, to McCulloch’s mind, the existence of a single

direction in the nervous system for information reinforced the idea of an essentially logical

device.37

McCulloch was presumably interpreting his neurophysiological observations on the basis of his pre-existing

assumptions about information flow through ranks of neurons. By his own account, by the time he went to

work with Dusser de Barenne, he had already reached the conclusion that neural activity can be modeled

by logic. Although his neurophysiological observations were compatible with such a view, McCulloch’s

claim that they “proved” it seems an overstatement.

While working as an experimental neurophysiologist at Yale, McCulloch established connections

with colleagues in the field. An especially important one was with a young Mexican researcher, Arturo

Rosenblueth, who was working with Walter Cannon at Harvard Medical School on homeostasis and other

topics. At least by 1938, McCulloch and Rosenblueth knew each other and had initiated a dialogue over

experimental methods and results in neurophysiology.38 In the summer of 1941, McCulloch visited

36 Lettvin 1989a, p. 14.

37 Ibid.

38 Letter by McCulloch to J. F. Tönnies, dated April 11, 1938. Warren S. McCulloch Papers, ff. Tönnies. Letter by

McCulloch to Rosenblueth, dated December 22, 1939. Warren S. McCulloch Papers, ff. Rosenblueth.


Rosenblueth’s lab and they ran a few experiments together.39 By then, they had also started discussing

more theoretical topics, such as von Domarus’s dissertation on the foundations of psychiatry.40

McCulloch and Rosenblueth developed a friendship that lasted for decades.

Rosenblueth shared McCulloch’s dissatisfaction with the current lack of theory in

neurophysiology, which he expressed to McCulloch as follows:

It is always difficult to strike the right balance between experiment and hypothesis, but it seems

to me, in the main, that a good many of our colleagues—perhaps even ourselves—do not do

enough thinking about the large number of experiments carried out. In other words, I found our

discussions very stimulating.41

Rosenblueth’s laudatory reference to his conversations with McCulloch right after his complaint about the

lack of hypotheses in neurophysiology suggests that in conversation, McCulloch had manifested a more

theoretical bent than many of their colleagues, which Rosenblueth appreciated. This would not be

surprising: McCulloch cared deeply about theory, and throughout his life he manifested and publicly

defended a tendency to formulate hypotheses and theories even in the absence of data to test them.

At Yale, McCulloch also attended a philosophical seminar for research scientists organized by

Filmer Northrop, who was von Domarus’s old advisor. At one of those seminars, Frederic Fitch, a

distinguished logician from Yale’s Philosophy Department, presented the theory of deduction of

Principia Mathematica. McCulloch also attended advanced lectures by Fitch on logical operators and

urged Fitch to work on the logic of neural nets.42

While McCulloch was at Yale, he became acquainted with the work of J. H. Woodger (1937),

who advocated the axiomatic method in biology. In a letter to a colleague written in 1943, McCulloch

wrote:

I personally became acquainted with Woodger because the great interest of the biologists in Yale

had led to his coming thither to tackle some of their problems. When he finally departed, it was

not because they were not convinced of the value of his attempt but because he was convinced

39 Two letters by Rosenblueth to McCulloch, dated June 21 and September 5, 1941. Warren S. McCulloch Papers,

ff. Rosenblueth.

40 Letter by McCulloch to Rosenblueth, dated May 1, 1941. Letter by Rosenblueth to McCulloch, dated December

3, 1941. McCulloch papers, ff. Rosenblueth.

41 Letter by Rosenblueth to McCulloch, dated September 5, 1941. Warren S. McCulloch Papers, ff Rosenblueth.

42 Heims 1991, p. 34ff.


that the ambiguity of their statements prevented logical formulation. It was to discussions with

him and with Fitch that I owe much of my persistence in attempting a logical formulation of

neuronal activity. Until that time I had merely used the nomenclature of the Principia

Mathematica to keep track of the activity of neuronal nets.43

In the same letter, McCulloch suggested that it was only around this time that he started seeing his theory

of the brain as a “theory of knowledge”:

[T]he theory … began originally as a mere calculus for keeping track of observed realities. It was

at work for seven years before it dawned on me that it had those logical implications which

became apparent when one introduces them into the grandest of all feed-back systems, which

runs from the scientist by manipulations through the objects of this world, back to the scientist—

so producing in him what we call theories and in the great world are little artifacts.44

McCulloch had known Northrop, another member of Yale’s Philosophy Department, since 1923,

and continued to be in contact with him through the 1930s. Much of Northrop’s philosophy was about

science and scientific methodology. Northrop believed that scientific disciplines reach maturity when

they start employing logic and mathematics in formulating rigorous, axiomatic theories:

The history of science shows that any empirical science in its normal healthy development begins

with a more purely inductive emphasis, in which the empirical data of its subject matter are

systematically gathered, and then comes to maturity with deductively-formulated theory in which

formal logic and mathematics play a most significant part (Northrop 1940, p. 128; cited by

Abraham 2002, p. 6).

Northrop argued that biology was finally reaching its maturity with the work of Woodger (1937) and

Nicolas Rashevsky (1938), who had imported formalisms and techniques from mathematical physics into

biology.45

While McCulloch was working in Dusser de Barenne’s lab at Yale, Alan Turing published his

famous paper on computability (1936-7), where he drew a clear and rigorous connection between

computing, logic, and machines. By the early 1940s, McCulloch had read Turing’s paper. In 1948, in a

public discussion of his theory of the brain at the Hixon Symposium, McCulloch declared that it was the

reading of Turing’s paper that led him in the “right direction.”46

43 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

44 Ibid.

45 For a more detailed account of Northrop’s philosophy of science, see Abraham 2002, pp. 6-7.

46 Von Neumann 1951, p. 33.


3 TOWARDS A THEORY OF THE BRAIN, 1936-1942

3.1 What Computing Mechanisms Can Do1

The modern mathematical notion of computation, which was developed by Alan Turing in his 1936-7

paper and reviewed in Chapter 1, played a crucial role in the history of computationalism. This section

concerns Turing’s use of “computable” and “machine” in his logic papers, his version of the Church-

Turing Thesis (CT, i.e. that every effectively calculable function is computable by a Turing Machine),

and why his early work on computability was not initially read as an attempt to establish, or even to

imply, that the mind is a machine.

Today, both the term “computable” and formulations of CT are utilized in many contexts,

including discussions of the nature of mental, neural, or physical processes. Some of these uses are

discussed at length in Chapter 7.2 None of these uses existed in Turing’s time, and their superposition

onto Turing’s words yields untenable results. For instance, according to a popular view, Turing’s

argument for CT was already addressing the problem of how to mechanize the human mind, while the

strength of CT, perhaps after some years of experience with computing machines, eventually convinced

Turing that thinking could be reproduced by a computer.3

This reading makes Turing appear incoherent. It conflicts with the fact that he, who reiterated CT

every time he talked about machine intelligence, never said that the mechanizability of the mind was a

consequence of CT. Quite the opposite: in defending his view that machines could think, he felt the need

1 This section is adapted from a section of a larger paper devoted to Turing’s ideas on logical proofs and machine

intelligence (Piccinini 2003a).

2 For a survey of different uses see Odifreddi 1989, I.8. (Odifreddi writes “recursive” instead of “computable.”)

3 See e.g. Hodges 1983, esp. p. 108; also Hodges 1988, 1997; Leiber 1991, pp. 57, 100; Shanker 1995, pp. 64, 73;

Webb 1980, p. 220. Turing himself is alleged to have argued, in his 1947 “Lecture to the London Mathematical

Society,” that “the Mechanist Thesis ... is in fact entailed by his 1936 development of CT” (Shanker 1987, pp. 615,

625). Since Shanker neither says what the Mechanist Thesis is, nor provides textual evidence from Turing’s lecture,

it is difficult to evaluate his claim. If the Mechanist Thesis holds that the mind is a machine or can be reproduced by

a machine, we’ll see that Shanker is mistaken. However, some authors other than Turing do believe CT to entail

that the human mind is mechanizable (e.g., Dennett 1978a, p. 83; Webb 1980, p. 9). Their view is discussed in

Chapter 7.


to respond to many objections.4 To understand the development of Turing’s ideas on mechanical

intelligence, one must place his logical work on computability within its context. The

context would change in the 1940s with the publication of McCulloch and Pitts’s computational theory of

the brain—which will be discussed in the next chapter—and the subsequent rise of computationalism.

This change in context explains why in the second half of the 20th century, many found it so natural to

read Turing’s logical work as defending a form of computationalism.

But in the 1930s there were no working digital computers, nor was cognitive science on the

horizon. There did exist some quite sophisticated computing machines, which at the time were called

differential analyzers and would later be called analog computers. Differential analyzers had mechanical

gears whose motions obeyed certain types of differential equations. By setting up the gears in appropriate ways,

differential analyzers could solve certain systems of differential equations. At least as early as 1937,

Turing knew about the Manchester differential analyzer, which was devoted to the prediction of tides, and

planned to use a version of it to find values of the Riemann zeta function.5

In the 1930s and up through the 1940s, the term “computer” was used to refer to people

reckoning with paper, pencil, and perhaps a mechanical calculator. Given the need for laborious

calculations in industry and government, skilled individuals, usually young women, were hired as

“computers.” In this context, a “computation” was something done by a computing human.

The origins of “Computable Numbers” can be traced to 1935, when Turing graduated in

mathematics from King’s College, Cambridge, and became a fellow of King’s. In that year, he attended

an advanced course on Foundations of Mathematics by topologist Max Newman. Newman, who became

Turing’s lifelong colleague, collaborator, and good friend, witnessed the development of Turing’s work

4 Indeed, in his most famous paper on machine intelligence, Turing admitted: “I have no very convincing arguments

of a positive nature to support my views. If I had I should not have taken such pains to point out the fallacies in

contrary views” (Turing 1950, p. 454).

5 Hodges 1983, pp. 141, 155-8.


on computability, shared his interest in the foundations of mathematics, and read and commented on

Turing’s typescript before anyone else.6

In his Royal Society biographical memoir of Turing, Newman links “Computable

Numbers” to the attempt to prove rigorously that the decision problem for first order logic, formulated by

David Hilbert within his program of formalizing mathematical reasoning (Hilbert and Ackermann 1928),

is unsolvable in an absolute sense. “[T]he breaking down of the Hilbert programme,” said Newman, was

“the application [Turing] had principally in mind.”7 In order to show that there is no effective

procedure, or “decision process,” solving the decision problem, Turing needed:

… to give a definition of ‘decision process’ sufficiently exact to form the basis of a mathematical

proof of impossibility. To the question ‘What is a “mechanical” process?’ Turing returned the

characteristic answer ‘Something that can be done by a machine,’ and embarked in the highly

congenial task of analyzing the general notion of a computing machine.8

Turing was trying to give a precise and adequate definition of the intuitive notion of effective procedure,

as mathematicians understood it, in order to show that no effective procedure could decide first order

logical provability. When he talked about computations, Turing meant sequences of operations on

symbols (mathematical or logical), performed either by humans or by mechanical devices according to a

finite number of rules, which required no intuition, invention, or guesswork, and whose execution

always produced the correct solution.9 For Turing, the term “computation” by no means referred to all

that mathematicians, human minds, or machines could do.

6 Hodges 1983, pp. 90-110.

7 Newman 1955, p. 258.

8 Ibid.

9 See his argument for the adequacy of his definition of computation in Turing, 1936-7, pp. 135-8. The last

qualification, about the computation being guaranteed to generate the correct solution, was dropped after

“Computable Numbers.” In different writings, ranging from technical papers to popular expositions, Turing used

many different terms to explicate the intuitive concept of effective procedure: “computable” as “calculable by finite

means” (1936-7), “effectively calculable” (1936-7, pp. 117, 148; 1937, p. 153), “effectively calculable” as a

function whose “values can be found by a purely mechanical process” (1939, p. 160), “problems which can be

solved by human clerical labour, working to fixed rules, and without understanding” (1945, pp. 38-9), “machine

processes and rule of thumb processes are synonymous” (1947, p. 112), “‘rule of thumb’ or ‘purely mechanical’”

(1948, p. 7), “definite rule of thumb process which could have been done by a human operator working in a

disciplined but unintelligent manner” (1951, p. 1), “calculation” to be done according to instructions explained

“quite unambiguously in English, with the aid of mathematical symbols if required” (1953, p. 289).


Turing rigorously defined “effectively calculable” with his famous machines: a procedure was

effective if and only if a Turing Machine could carry it out. “Machine” requires a gloss. Given the task

of “Computable Numbers,” viz. establishing a limitation to what could be achieved in mathematics by

effective methods of proof, it is clear that Turing Machines represented (at the least) computational

abilities of human beings. As a matter of fact, the steps these machines carried out were determined by a

list of instructions, which had to be unambiguously understandable by human beings.
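The kind of machine Turing defined can be rendered, anachronistically, as a short program. The sketch below is a modern illustration in Python, not anything from Turing’s own text; only the example transition table is historical, following the first machine of “Computable Numbers,” which prints the unending sequence 0 1 0 1 …, leaving a blank square between figures.

```python
# A minimal Turing machine simulator: illustrative only, with modern
# conventions (a dictionary as transition table, a dictionary as tape).
from collections import defaultdict

def run(transitions, state, steps):
    """Run a Turing machine for a fixed number of steps.

    transitions maps (state, scanned_symbol) to
    (symbol_to_write, head_move, next_state); head_move is -1, 0, or +1.
    The tape is unbounded and initially blank (None everywhere).
    """
    tape = defaultdict(lambda: None)
    head = 0
    for _ in range(steps):
        write, move, state = transitions[(state, tape[head])]
        tape[head] = write
        head += move
    return [tape[i] for i in sorted(tape)]

# A rendering of Turing's first example machine: print 0, move right,
# move right, print 1, move right, move right, and repeat forever.
machine = {
    ("b", None): ("0", +1, "c"),
    ("c", None): (None, +1, "e"),
    ("e", None): ("1", +1, "f"),
    ("f", None): (None, +1, "b"),
}

print(run(machine, "b", 8))  # -> ['0', None, '1', None, '0', None, '1', None]
```

Each step is fully determined by the current state and the scanned symbol, which is what makes the machine’s behavior precisely definable and mathematically analyzable.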

But Turing’s machines were not portrayed as understanding instructions, let alone as intelligent.

Even if they were anthropomorphically described as “scanning” the tape, “seeing symbols,” having

“memory” or “mental states,” etc., Turing introduced all these terms in quotation marks, presumably to

underline their metaphorical use.10 If one thinks that carrying out a genuine, “meaningful”

computation, as opposed to a “meaningless” physical process, presupposes understanding the

instructions, one should conclude that only humans carry out genuine computations. Turing Machines, in

so far as they computed, were abstract representations of idealized human beings. These considerations,

among others, led some authors to a restrictive interpretation: Turing’s theory bears on computability by

humans, not by machines, and Turing Machines are “humans who calculate.”11

This interpretation is at odds with Turing’s use of “computation” and “machine,” and with his

depiction of his work. Turing never said his machines should be regarded as idealized human

beings, nor anything similar. We saw that, for him, a computation was a type of physical manipulation

of symbols. His machines were introduced to rigorously define this process of manipulation for

mathematical purposes. As Turing used the term, machines were idealized mechanical devices; they

10 Turing 1936-7, pp. 117-8.

11 This widely cited phrase is in Wittgenstein 1980, sec. 1096. Wittgenstein knew Turing, who in 1939 attended

Wittgenstein’s course on Foundations of Mathematics. Wittgenstein’s lectures, including his dialogues with Turing,

are in Wittgenstein 1976. Discussions of their different points of views can be found in Shanker 1987; Proudfoot

and Copeland 1994. Gandy is more explicit than Wittgenstein: “Turing’s analysis makes no reference whatsoever to

calculating machines. Turing machines appear as a result, as a codification, of his analysis of calculations by

humans” (Gandy 1988, p. 83-4). Sieg quotes and endorses Gandy’s statement (Sieg 1994, p. 92; see also Sieg 1997,

p. 171). Along similar lines is Copeland, 2000, pp. 10ff. According to Gandy and Sieg, “computability by a

machine” is first explicitly analyzed in Gandy, 1980. In the present chapter I am only concerned with the

historiographical merits of the Gandy-Sieg view, and not with its philosophical justification. The latter issue is

addressed in Chapter 7.


could be studied mathematically because their behavior was precisely defined in terms of discrete,

effective steps.

There is evidence that Turing, in 1935, talked about building a physical realization of his

universal machine.12 Twelve years later, to an audience of mathematicians, he cited “Computable

Numbers” as containing a universal digital computer’s design and the theory establishing the limitations

of the new computing machines:

Some years ago I was researching on what might now be described as an investigation of the

theoretical possibilities and limitations of digital computing machines. I considered a type of

machine which had a central mechanism, and an infinite memory which was contained on an

infinite tape. This type of machine appeared to be sufficiently general. One of my conclusions

was that the idea of a ‘rule of thumb’ process and a ‘machine process’ were synonymous …

Machines such as the ACE [Automatic Computing Engine] may be regarded as practical versions

of this same type of machine.13

Therefore, a machine, when Turing talked about logic, was not (only) a mathematical

representation of a computing human, but literally an idealized mechanical device, which had a

12 Newman 1954; Turing 1959, p. 49. Moreover, in 1936 Turing wrote a précis of “Computable Numbers” for the

French Comptes Rendus, containing a succinct description of his theory. The definition of “computable” is given

directly in terms of machines, and the main result is appropriately stated in terms of machines:

On peut appeler ‘computable’ les nombres dont les décimales se laissent écrire par une machine . . . On peut

démontrer qu’il n’y a aucun procédé général pour décider si une machine M n’écrit jamais le symbole 0

(Turing, 1936).

The quote translates as follows: One may call “computable” those numbers whose decimals can be written by a machine …

It can be shown that there is no general procedure for deciding whether a machine M will never write the symbol 0.

Human beings are not mentioned.

13 Turing 1947, pp. 106-7. See also ibid., p. 93. Also, Turing machines “are chiefly of interest when we wish to

consider what a machine could in principle be designed to do” (Turing 1948, p. 6). In this latter paper, far from

describing Turing machines as being humans who calculate, Turing described human beings as being universal

digital computers:

It is possible to produce the effect of a computing machine by writing down a set of rules of procedure and

asking a man to carry them out. Such a combination of a man with written instructions will be called a ‘Paper

Machine.’ A man provided with paper, pencil, and rubber, and subject to strict discipline, is in effect a

universal machine (Turing 1948, p. 9).

Before the actual construction of the ACE, “paper machines” were the only universal machines available, and were

used to test instruction tables designed for the ACE (Hodges 1983, chapt. 6). Finally:

A digital computer is a universal machine in the sense that it can be made to replace . . . any rival design of

calculating machine, that is to say any machine into which one can feed data and which will later print out

results (Turing, ‘Can digital computers think?’ Typescript of talk broadcast in BBC Third Programme 15 May

1951, AMT B.5, Contemporary Scientific Archives Centre, King’s College Library, Cambridge, p. 2).

Here, Turing formulated CT with respect to all calculating machines, without distinguishing between analog and

digital computers. This fits well with other remarks by Turing, which assert that any function computable by analog

machines could also be computed by digital machines (Turing 1950, pp. 451-2). And it strongly suggests that, for

him, any device mathematically defined as giving the values of a non-computable function, that is, a function no

Turing machine could compute, like the “oracle” in Turing 1939, pp. 166-7, could not be physically constructed.


potentially infinite tape and never broke down. Furthermore, he thought his machines could compute any

function computable by machines. This is not to say that, for Turing, every physical system was a

computing machine or could be mimicked by computing machines. The outcome of a random process,

for instance, could not be replicated by any Turing Machine, but only by a machine containing a “random

element.”14

Such was the scope of CT, the thesis that the numbers computable by Turing Machines “include

all numbers which could naturally be regarded as computable.”15 To establish CT, Turing compared “a

man in the process of computing . . . to a machine.”16 He based his argument on limitations affecting

human memory and perception during the process of calculation. At the beginning of “Computable

Numbers,” one reads that “the justification [for CT] lies in the fact that the human memory is necessarily

limited.”17 In the argument, Turing used human sensory limitations to justify his restriction to a finite

number of primitive symbols, as well as human memory limitations to justify his restriction to a finite

number of “states of mind.”18

Turing’s contention was that the operations of a Turing machine “include all those which are used

in the computation of a number” by a human being.19 Since the notion of the human process of

computing, like the notion of effectively calculable, was an intuitive one, Turing asserted that “all

arguments which can be given [for CT] are bound to be, fundamentally, appeals to intuition, and for this

reason rather unsatisfactory mathematically.”20 In other words, CT was not a mathematical theorem.21

From “Computable Numbers,” Turing extracted the moral that effective procedures, “rule of

thumb processes,” or instructions explained “quite unambiguously in English,” could be carried out by his

machines. This applied not only to procedures operating on mathematical symbols, but to any symbolic

14 Turing 1948, p. 9; 1950, p. 438.

15 Turing 1936-7, p. 116.

16 Ibid., p. 117.

17 Ibid., p. 117.

18 Turing 1936-7, pp. 135-6.

19 Ibid., p. 118.

20 Ibid., p. 135.

21 CT is usually regarded as an unprovable thesis for which there is compelling evidence. The issue of the

provability of CT is discussed at length in Chapter 7.


procedure so long as it was effective. A universal machine, if provided with the appropriate instructions,

could carry out all such processes. This was a powerful thesis, but very different from the thesis that

“thinking is an effective procedure.”22 In “Computable Numbers” Turing did not argue, nor did he have

reasons to imply from CT, that all operations of the human mind could be performed by a Turing

Machine.

After Turing published his paper, a number of results were quickly established to connect his

work to other proposed formal definitions of the effectively calculable functions, such as general

recursiveness (Gödel 1934) and λ-definability (Church 1932, Kleene 1935). It was shown that all these

notions were extensionally equivalent in the sense that any function that fell under any one of these

formal notions fell under all of them. Mathematicians took this as further evidence that the informal

notion of effective calculability had been captured.

In the 1930s and 1940s, Turing’s professionally closest colleagues read his paper as providing a

general theory of computability, establishing what could and could not be computed, not only by

humans, but also by mechanical devices. This is how McCulloch and John von Neumann, among others,

read Turing’s work.23

At the time he published “Computable Numbers,” Turing moved to Princeton to work on a Ph.D.

dissertation under Alonzo Church. At Princeton, Turing met von Neumann, who was a member of the

Institute for Advanced Study. Von Neumann was a Hungarian-born mathematician who had worked in

logic and foundations of mathematics but was then working mostly in mathematical physics. In 1938,

von Neumann invited Turing to become his assistant. Turing declined and went back to England in 1939.

22 According to Shanker, this was Turing’s “basic idea” (1995, p. 55). But Turing never made such a claim.

23 E.g., see Church 1937; 1956, p. 52, n.119; Watson 1938, p. 448ff; Newman 1955, p. 258; Kleene said: “Turing’s

formulation comprises the functions computable by machines” (1938, p. 150). When von Neumann placed

“Computable Numbers” at the foundations of the theory of finite automata, he introduced the problem addressed by

Turing as that of giving “a general definition of what is meant by a computing automaton” (von Neumann 1951, p.

313). Most logic textbooks introduce Turing machines without qualifying “machine,” the way Turing did. More

recently, doubts have been raised about the generality of Turing’s analysis of computability by machines (e.g., by

Siegelmann 1999). These doubts are discussed in Chapter 7.


3.2 Teleological Mechanisms

When he met Turing, von Neumann already knew Norbert Wiener. Wiener was an MIT mathematician

who shared von Neumann’s background in logic and current mathematical interests. Wiener was also

trained in philosophy, in which he had published several papers in the 1910s.24 Wiener and von Neumann

met around 1933 and became friends, beginning a scientific dialogue that lasted many years.25 Von

Neumann sometimes referred to their meetings as “mathematical conversations” to which he looked

forward.26

Wiener was interested in designing and building computing mechanisms to help solve problems

in mathematical physics. His correspondence about mechanical computing with Vannevar Bush, the

main designer of the differential analyzer and other analog computing mechanisms, stretches as far back

as 1925.27 From at least the mid-1930s, Wiener was involved in the design of analog computing

devices. According to Bush, Wiener made the original suggestion for the design of an analog computing

mechanism called the optical integraph, and was an expert on analog computing in general.28 In 1935,

Wiener proposed to Bush “a set up of an electrical simultaneous machine,” which was another analog

machine. Wiener briefly described the machine and its proposed use. In the same letter, he also

commented on a paper by Bush on some other analog machine, which might have been the differential

analyzer.29

In 1940, Bush was Chairman of the National Defense Research Committee of the Council of

National Defense. World War II had started and the US would soon enter it. The National Defense

Research Committee was in charge of recruiting talented scientists and assigning them to “national

defense” projects. Wiener proposed to Bush a new design for a computing machine for solving boundary

24 Reprinted in Wiener 1976.

25 Letter by von Neumann to Wiener, dated November 26, 1933. Norbert Wiener Papers, Box 2, ff. 38. Letter by

von Neumann to Wiener, dated March 9, 1953. Norbert Wiener Papers, Box 11, ff. 166. For a comprehensive study

of the relationship between Wiener and von Neumann, see Heims 1980.

26 Letter by von Neumann to Wiener, dated April 9, 1937. Norbert Wiener Papers, Box 3, ff. 47.

27 Letter by Wiener to Bush, dated August 18, 1925. Norbert Wiener Papers, Box 2, ff. 27.

28 Letter by Bush to Ku, dated May 26, 1935. Norbert Wiener Papers, Box 3, ff. 42.

29 Letter by Wiener to Bush, dated September 22, 1935. Norbert Wiener Papers, Box 3, ff. 43.


value problems in partial differential equations. Unlike analog computing machines, where the variables of

differential equations were represented by continuously varying physical quantities, the method proposed

by Wiener consisted in replacing a differential equation with a difference equation asymptotically

equivalent to it, and using the difference equation to generate successive representations of the states of

the system using the “binary system” (i.e., binary notation).30 Wiener’s proposed machine was thus of a

kind that a few years later would be called digital rather than analog.

Bush responded to Wiener’s proposal with interest and asked for clarifications on the design and

range of applicability of the machine.31 Wiener convinced Bush that his proposed machine would be

“valuable” and “could be successfully constructed,” and Bush seriously considered whether to assign

funds for its construction.32 He ultimately declined to fund the project, because Wiener’s machine would

yield mostly long-term advantages to national defense, and the researchers who were qualified to work on

developing Wiener’s machine were needed for more urgent “defense research matters.”33

Wiener also had a long-standing interest in biology. In the late 1930s he met Arturo Rosenblueth

and started discussing with him the possibility of formulating a mathematical characterization of the

possible behaviors of organisms, analogously to how engineers described the possible behaviors of

machines. Wiener explained his project in a letter to an old acquaintance, the British biologist J. B. S.

Haldane:

I am writing to you … [about] some biological work which I am carrying out together with

Arturo Rosenblueth… Fundamentally the matter is this: Behaviorism as we all know is an

established method of biological and psychological study but I have nowhere seen an adequate

attempt to analyze the intrinsic possibilities of types of behavior. This has become necessary to

me in connection with the design of apparatus to accomplish specific purposes in the way of the

repetition and modification of time patterns. … [T]he problem of examining the behavior of an

instrument from this point of view is fundamental in communication engineering and in related

fields where we often have to specify what the apparatus between four terminals in a box is to do

for we take up the actual constitution of the apparatus in the box. We have found this method of

examining possibilities of time pattern behavior, quite independently of their realization, a most

useful way of preparing for their later realization. Now this has left us with a large amount of

30 Letters by Wiener to Bush, dated September 20 and 23, 1940. Norbert Wiener Papers, Box 4, ff. 58.

31 Letters by Bush to Wiener, dated September 24 and 25, 1940. Norbert Wiener Papers, Box 4, ff. 58.

32 Letters by Bush to Wiener, dated October 7 and 19, and December 31, 1940. Norbert Wiener Papers, Box 4, ff

58.

33 Letter by Bush to Wiener, dated December 31, 1940. Norbert Wiener Papers, Box 4, ff 58.


information on a priori possible types of behavior which we find most useful in discussing the

normal action and the disorders of the nervous system.34

Perhaps because he was friends with Rosenblueth, McCulloch found out about Wiener’s

work on control engineering and his interest in biology. Rosenblueth arranged a meeting between

McCulloch and Wiener, which according to McCulloch occurred during the spring of 1940 or 1941. As

McCulloch recalled it, he was:

… amazed at Norbert [Wiener]’s exact knowledge, pointed questions and clear thinking in

neurophysiology. He talked also about various kinds of computation and was happy with my

notion of brains as, to a first guess, digital computers, with the possibility that it was the temporal

succession of impulses that might constitute the signal proper.35

In using the term “digital computers” here, McCulloch was being a bit anachronistic. In 1941 there were

no modern digital computers in operation, and the term “digital computer” had not been used yet.

Nevertheless, it is likely that McCulloch explained to Wiener his ideas about information flow through

ranks of neurons in accordance with Boolean algebra.36

Due to his expertise in computing mechanisms, in 1941 Wiener was appointed as a consultant to

the government on machine computation. He began a strenuous research program on fire control for

antiaircraft artillery with a young Research Associate at MIT, Julian Bigelow.37 Bigelow had graduated

in electrical engineering from MIT in 1939, and then worked as an electronics engineer at IBM’s Endicott

Laboratories in 1939-1941. Wiener developed a mathematical theory for predicting the curved flight

paths of aircraft, and Bigelow designed and built a machine that performed the necessary computations.38

Wiener and Bigelow’s work on mechanical prediction was classified, but they were allowed to

communicate with other researchers interested in mechanical computation. In October 1941, Wiener and

34 Letter by Wiener to Haldane, dated June 22, 1942. Norbert Wiener Papers, Box 4, ff. 62.

35 McCulloch 1974, pp. 38-39.

36 This may be the source of Wiener’s early comparison between digital computing mechanisms and brains, which

Wiener attributes to himself—without crediting McCulloch—in Wiener 1958. If Wiener’s meeting with McCulloch

occurred before September 1940, and if Wiener is correct in attributing his September 1940 proposal of a digital

computer to an attempt to imitate the brain’s method of calculation, then McCulloch’s theory partially inspired one

of the first designs of a digital computer. Cf. Aspray 1985, p. 125.

37 Letter by Wiener to J. Robert Kline, dated April 10, 1941. Norbert Wiener Papers, Box 4, ff. 59.

38 Julian Bigelow’s Vita, undated, Papers of John von Neumann, Box 2, ff. 7.


Bigelow received a visit from John Atanasoff, the designer of the ABC computer.39 The ABC computer,

whose construction began in January 1941, was completed in May 1942 and was the first machine to

perform computations using electronic equipment.40 One day in 1942, Wiener and Bigelow “had a lively

discussion on the stairs in the math building at Harvard” with Howard Aiken, another leader in

mechanical computing.41 Aiken was working on a giant electro-mechanical digital computer, the Harvard

Mark 1, which was completed in 1944.

According to Wiener’s recollection, Bigelow convinced him of the importance of feedback in

control mechanisms like those they were designing, and persuaded him that what mattered to the automatic

control of mechanisms was not any particular physical quantity (such as energy, length, or voltage) but only

“information” (used in an intuitive sense), conveyed by any means.42 Wiener seized on the importance of

feedback and information in control mechanisms, and turned it into a cornerstone of his thinking about

the most complex of all control mechanisms—the nervous system.

Wiener merged his work with Bigelow on feedback and information in control mechanisms with

the project on possible behaviors that he was carrying out with Rosenblueth. Rosenblueth, Wiener, and

Bigelow jointly wrote a famous paper on “teleological” mechanisms (Rosenblueth, Wiener, and Bigelow

1943). The paper classified behaviors as follows. First, it distinguished between purposeful (goal-

seeking) and non-purposeful behavior. Then, it divided purposeful behavior into teleological (involving

negative feedback) and non-teleological behavior. Teleological behavior, in turn, was divided into

predictive and non-predictive behavior. The authors argued that by taxonomizing behaviors in this way,

it was possible to study organisms and machines in the same behavioristic way, i.e. by looking at the

correlation between their inputs and outputs without worrying about their internal structure. These ideas,

and especially the role played by feedback in accounting for teleological behavior, would soon attract a

lot of attention. Their impact started to be felt before the paper was published.

39 Letter by Warren Weaver to Atanasoff, dated October 15, 1941. Norbert Wiener Papers, Box 4, ff. 61.

40 On Atanasoff’s computer and its relation to later electronic computers, see Burks 2002.

41 Letter by Bigelow to Wiener, dated August 7, 1944. Norbert Wiener Papers, Box 4, ff. 66.

42 Wiener 1948, Introduction; see also McCulloch 1974, p. 39.


In April 1942, McCulloch was invited to a conference on Cerebral Inhibition, sponsored by the

Josiah Macy, Jr. Foundation.43 The meeting was organized by Frank Fremont-Smith, M.D., who was

Director of the Foundation’s Medical Division. The discussion at the meeting was to focus on

conditioned reflexes in animals and hypnotic phenomena in humans, both of which were believed to be

related to cerebral inhibition. “It is hoped that by focussing the discussion upon physiological

mechanisms underlying the two groups of phenomena,” wrote Fremont-Smith, “gaps in our knowledge,

as well as correlations, may be more clearly indicated.”44 The format of the conference, which took place

in New York City on May 14th and 15th 1942, included two formal presentations on conditioned reflexes

followed by two days of informal discussion among the participants. One of those participants was

Rosenblueth. During the meeting, Rosenblueth talked about the ideas he had developed with Wiener and

Bigelow about teleological mechanisms. According to McCulloch, Rosenblueth “implanted the feedback

explanation of purposive acts in [Fremont-Smith]’s mind.”45

Fremont-Smith hoped to get the same group to meet again,46 and McCulloch saw this as an

opportunity to foster his ideas about mind and brain.47 As soon as McCulloch found the time, he wrote a

long letter to Fremont-Smith, in which he indicated:

… what I hope we can chew over at great length when we are next together, for they are points

which I feel are very important to the understanding of the problems confronting us; and I feel

sure that your procedure in keeping the group together, discussing long enough to get through the

words to the ideas, is the most profitable form of scientific investigation of such problems.

In his letter, McCulloch outlined his views about neural explanations of mental phenomena, including his

use of symbolic logic to model neural activity, and endorsed Rosenblueth’s point about “the dependence

of ‘goal directed’ behavior upon ‘feed-back’ mechanisms.”48

McCulloch’s connection with Fremont-Smith soon turned into a friendship.49 Their relationship

would bear fruit a few years later, in the form of several grants by the Macy Foundation to

43 Letter by Fremont-Smith to McCulloch, dated April 27, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

44 Memorandum by Fremont-Smith, dated May 11, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

45 Letter by McCulloch to Rosenblueth, dated February 14, 1946. Warren S. McCulloch Papers, ff. Rosenblueth.

46 Letter by Fremont-Smith to McCulloch, dated April 27, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

47 McCulloch 1974, p. 39.

48 Letter by McCulloch to Fremont-Smith, dated June 24, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.


McCulloch’s lab as well as what came to be known as the Macy Meetings on cybernetics. By 1942,

though, McCulloch was about to publish his long-in-the-works theory of the brain, with help from a new

and important character.

3.3 Walter Pitts

Walter Pitts fled his parental home around the age of 15, and never spoke to his family again.50 In

1938—at the age of 15—he attended a lecture by Bertrand Russell at the University of Chicago. During

the lecture, he met an eighteen-year-old fellow in the audience, Jerome Lettvin, who was preparing for

medical school by studying biology at the University of Chicago. Pitts and Lettvin became best friends.51

According to Lettvin, by the time the two met, Pitts “had, for a long time, been convinced that

the only way of understanding nature was by logic and logic alone.”52 Here is Lettvin’s recollection of

the origin of that view, as well as a poignant description of Pitts’s unique personality:

Pitts was married to abstract thought. Once, Pitts told us that when he was twelve years old he

was chased by some bullies into a public library, where he hid in the stacks. There he picked up

Russell and Whitehead’s Principia Mathematica and could not put it down. For the next week he

lived in the library from opening to closing time, going through all three volumes. It seemed to

him then that logic was magic, and if he could master that magic and practice it, the whole world

would be in his hegemony—he would be Merlin. But to do this one had to do away with self.

Ego must never enter, but only Reason. And at that moment of revelation he committed

ontological suicide. That is the peculiar truth about Pitts, whom all of us loved and protected.

We never knew anything about his family or his feelings about us. He died mysterious, sad and

remote, and not once did I find out, or even want to find out more about how he felt or what he

hoped. To be interested in him as a person was to lose him as a friend.53

Other witnesses concurred with Lettvin’s assessment. People who knew him personally described Pitts as

shy, introverted, and socially awkward.54

49 At least by August 1943, the two started addressing their letters “Dear Warren” and “Dear Frank” rather than

“Dear Doctor McCulloch” and “Dear Doctor Fremont-Smith.” Warren S. McCulloch Papers, ff. Fremont-Smith.

50 Lettvin 1989b, p. 514.

51 Heims 1991, p. 40; Smalheiser 2000, p. 219. Letter by Lettvin to Wiener, dated ca. April, 1946. Norbert Wiener

Papers, Box 4, ff. 70. Accounts of Pitts’s life contain fictionalized stories, apparently propagated by McCulloch.

Smalheiser gives a nice summary of Pitts’s life, work, and personality. The most reliable source on Pitts seems to be

his “life-long friend” Lettvin (Lettvin 1989a, 1989b).

52 Lettvin 1989a, p. 12.

53 Lettvin 1989b, p. 515.

54 Smalheiser 2000, pp. 220-221.


Nevertheless, there is consensus that Pitts became knowledgeable and brilliant. The “magic” he

performed at twelve with the Principia Mathematica may have worked, because Lettvin continued his

story as follows:

[I]f a question were asked about anything whatever—history, literature, mathematics, language,

any subject at all, even empirics such as systematic botany or anatomy, out would come an

astonishing torrent, not of disconnected bits and pieces of knowledge, but an integral whole, a

corpus, an organized handbook with footnotes and index. He was the very embodiment of mind,

and could out-think and out-analyze all the rest of us.55

In the late 1930s, Pitts started auditing classes at the University of Chicago, without enrolling as a

student. He studied logic with Carnap and biophysics with Nicolas Rashevsky.56 Rashevsky was a

Russian physicist who had established the Committee on Mathematical Biology, a pioneering research

group in biophysics that included Frank Offner, Herbert Landahl, and Alston Householder. Rashevsky

advocated the development of mathematical models of idealized biological processes, applying to biology

the methodology of theoretical physics (Rashevsky 1936, 1937, 1938).57 One area on which Rashevsky

and his group worked was the nervous system.

Pitts became a member of Rashevsky’s group and quickly began doing original research, though he never

earned a degree. In the early 1940s, Pitts published several papers on neural networks in

Rashevsky’s journal, the Bulletin of Mathematical Biophysics (Pitts 1942a, 1942b, 1943). According to

Lettvin, it was during this time, namely before meeting McCulloch, that Pitts developed the view that the

brain is a “logical machine.”58

55 Lettvin 1989b, p. 515.

56 Glymour notes that Carnap’s role in the history of computationalism includes his teaching both Pitts and another

pioneer, Herbert Simon (Glymour 1990). To them, I should add another student of Carnap who would play an

important role in the history of computationalism, namely Ray Solomonoff. Unfortunately, a search through the

Rudolf Carnap Collection at the Archives of Scientific Philosophy, University of Pittsburgh, uncovered no

information on Carnap’s relationship to Pitts, Simon, or Solomonoff. Aside from Carnap’s teaching these

individuals (and inspiring Solomonoff’s early work), I have found no evidence of a significant impact by Carnap or

his work on the main founders of computationalism.

57 On Rashevsky and his group, see Abraham 2001b; Abraham 2002, pp. 13-18.

58 Lettvin interview, in Anderson and Rosenfeld 1998, p. 3. Lettvin also put it this way (using slightly anachronistic

terminology):

Quite independently, McCulloch and Pitts set about looking at the nervous system itself as a logical

machine in the sense that if, indeed, one could take the firings of a nerve fiber as digital encoding of

information, then the operation of nerve fibers on each other could be looked at in an arithmetical sense as

a computer for combining and transforming sensory information (Lettvin 1989a, p. 10).


3.4 McCulloch Meets Pitts

In the fall of 1941, McCulloch moved to the University of Illinois in Chicago. He was hired by the

Department of Psychiatry to build up a team of specialists and study the biological foundations of mental

diseases.59 At the University of Illinois, the lab’s electrical engineer was Craig Goodwin, who knew the

theory and design of control devices. Goodwin introduced McCulloch to this area of research, which

included topics like automatic volume controls and self-tuning devices. According to McCulloch, he

learned from Goodwin that when the mathematics of the hardware, e.g. coupled nonlinear oscillators, was

intractable, i.e. when the equations representing the system could not be solved analytically, they could

still build a working model and use it to think about the problem.60

In 1939, Lettvin started medical school at the University of Illinois, where his anatomy teacher

was Gerhard von Bonin. After McCulloch moved to Chicago in 1941, von Bonin introduced Lettvin to

McCulloch.61 McCulloch, who once called Lettvin “the brightest medical student I have ever known,”

exerted a strong influence on Lettvin, and later convinced him to do research on the brain.62 Once in

Chicago, McCulloch also made contact with Rashevsky’s group. He started attending the group’s

seminar, where Lettvin introduced him to the then almost-eighteen-year-old Pitts.63 Like Carnap and

Rashevsky before him, McCulloch was “much impressed” by Pitts.64

When McCulloch presented his ideas about information flow through ranks of neurons to

Rashevsky’s seminar, Pitts was in the audience. Pitts showed interest in a problem that McCulloch had

59 McCulloch 1974, p. 35.

60 McCulloch 1974, p. 35. This may be significant in light of McCulloch’s later ideas about building mechanical

models of the brain.

61 Lettvin 1989b, p. 514.

62 Letter by McCulloch to Henry Moe, dated December 30, 1959. Warren S. McCulloch Papers, ff. Gerard. Letter

by Lettvin to Wiener, dated ca. April, 1946. Norbert Wiener Papers, Box 4, ff. 70.

63 Lettvin 1989b, p. 515.

64 McCulloch 1974, pp. 35-36.


struggled with: the problem of how to give a mathematical treatment of regenerative nervous activity in

closed neural loops.65

McCulloch hypothesized that closed neural loops explained neural processes that, once started,

continued on by themselves. Initially, McCulloch was thinking about pathological conditions, such as the

neural activity of epileptic patients, phantom limbs, compulsive

behavior, anxiety, and the effects of shock therapy. But Lawrence Kubie had postulated closed loops of

activity to explain memory (Kubie 1930), and Lorente de Nó had shown the significance of closed loops

in vestibular nystagmus (Lorente de Nó 1938). This convinced McCulloch that closed loops of activity

could fulfill positive neural functions. By the time McCulloch met Pitts, McCulloch thought closed loops

could account for memory and conditioning, but he still didn’t know how to think mathematically about

them.66

McCulloch and Pitts started working together; they worked so closely that Pitts (as well as

Lettvin) moved in with McCulloch and his family for about a year in Chicago. McCulloch and Pitts

became intimate friends and remained so until their deaths in 1969.67 For two years, they worked

largely on the problem of how to treat closed loops of activity mathematically. According to McCulloch,

the solution was worked out mostly by Pitts using techniques that McCulloch didn’t understand. To build

up their formal theory, they adopted what they saw as Carnap’s rigorous terminology, which Pitts knew

from having studied with Carnap. Thus, according to McCulloch, Pitts did all the difficult technical

work.68 The resulting paper was published in Rashevsky’s journal in 1943, with a brief follow-up written

by McCulloch and Pitts with Herbert Landahl, another member of Rashevsky’s group, on a statistical

application of the theory.

65 McCulloch 1974, pp. 35-36.

66 McCulloch 1974, p. 36.

67 Shortly before both of them died, Pitts wrote McCulloch from his hospital bed, commenting in detail on their

conditions and expressing the wish that they meet again and talk about philosophy. Letter by Pitts to McCulloch,

dated April 21, 1969. Warren S. McCulloch Papers, ff. Pitts.

68 McCulloch 1974, p. 36.


4 BRAINS COMPUTE THOUGHTS, 1943

One would assume, I think, that the presence of a theory, however strange, in a field in which no

theory had previously existed, would have been a spur to the imagination of neurobiologists…

But this did not occur at all! The whole field of neurology and neurobiology ignored the

structure, the message, and the form of McCulloch’s and Pitts’s theory. Instead, those who were

inspired by it were those who were destined to become the aficionados of a new venture, now

called Artificial Intelligence, which proposed to realize in a programmatic way the ideas

generated by the theory (Lettvin 1989a, p. 17).

4.1 A Mechanistic Theory of Mind

McCulloch believed that the goal of neurophysiology and psychiatry was to explain the mind, and that

scientists had not seriously tried to construct a neural theory to this effect. A serious obstacle was what

philosophers called the mind-body problem. In a commentary on a paper presented in May 1943 at the

Illinois Psychiatric Society, McCulloch explained:

We have a dichotomy in medicine, which has grown increasingly… Psychiatric approach on one

side, particularly the psychoanalytic approach, has produced one group; the organic approach to

the physiology of particular organs and disease processes has made organicists of another group.

It has grown difficult for us to talk to each other. I am afraid that there is still in the minds of

most of us, and that there probably will be for years, that difficulty which concerned and still

concerns many thinking people—I mean the dichotomy between mind and body.1

McCulloch continued his commentary by saying that there were “two types of terminology”:

“mental terms” were used to describe “psychological processes, for these exhibit ideas and intentions”;

“physical terms” were used to describe “bodily processes, for these exhibit matter and energy.” But:

… it remains our great difficulty that we have not ever managed to conceive how our patient—

our monad—can have a psychological aspect and a physiological aspect so divorced. You may

think that I am exaggerating the difficulty here, but there have appeared within the last few years

two books which tilt at the same windmill. One is Sherrington, called “Man and His Nature,” and

in it Sherrington, the marvelously honest physiologist, attempts to make head and tail of the

mind-body relation, but is frustrated because in that world “Mind goes more ghostly than a

ghost.” The other book, by Wolfgang Koehler (the founder of Gestalt psychology), is entitled

“The Place of Value in a World of Fact,” but in spite of his endless searching, you will be

convinced that he has not found the place of value in the world of fact. Such was the

unsatisfactory state of our theory until very recently.2

1 Discussion by Dr. McCulloch of a paper by Dr. Alexander on Fundamental Concepts of Psychosomatic Research,

Illinois Psychiatric Society, dated May 22, 1943. Warren S. McCulloch Papers.

2 Ibid. The works referenced by McCulloch are Sherrington 1940 and Köhler 1938.


After thus stating the mind-body problem, McCulloch pointed to two recent developments that gave hope

for its solution.

As an answer to the question of “the place of values in a world of fact,” McCulloch cited the

newly published work of Rosenblueth, Wiener, and Bigelow (1943), which used the notion of feedback to

account for teleological behavior. As to what McCulloch called the “formal” aspect of mind, he promised

he was going to have something to contribute soon:

At the present time the other mental aspect of behavior—I mean its ideational or rational, formal

or logical aspect—is coming to the fore. This work … should be coming to fruition in the next

year or two… We do resent the existing hiatus between our mental terminology and our physical

terminology. It is being attacked in a very realistic fashion today. So while we do at the moment

think of it as a “leap from psyche to soma,” we are busy bridging the gap between mental

processes and physical processes. To this audience it is interesting that that bridge is being made

by demonstrating that the properties of systems which are like our nervous system necessarily

show those aspects of behavior that make us call it “mental”—namely, ideas and purposes.3

The explanation for the “formal” aspect of the mind, and hence the solution to that component of the

mind-body problem, was about to be offered by McCulloch in the paper he was writing with Walter Pitts.

Their way of solving the problem was to demonstrate how a system of neuron-like elements embodied

ideas.

In a letter to Frank Fremont-Smith, written a few months before the commentary

cited above, McCulloch was more detailed and explicit about what he hoped to accomplish with his theory

and the role that logic played in it:

As to the “formal” properties [of the mind], it is perfectly possible today (basing the work on the

all-or-none law and the requirement of summation at a synapse and of inhibition either at a

synapse or by preoccupation of a requisite pool of internuncials) to show that neuronal reactions

are related to antecedent neuronal reactions—I mean reactions in parts of the nervous system

afferent to the reaction in question—in a manner best schematized by symbolic logic; in brief,

that the efferent impulses are related to the afferent impulses as logical consequences are related

to logical antecedents, and hence that classes of the latter are so related to classes of the former.

Little consideration is necessary to show that neuronal and all other reactions which

derive their energy metabolically and are triggered off by something else, being reactions of the

zero order with respect to what initiates them, bear to their precipitating causes the same relation

that propositions do to that which they propose. If then, from the sense organ forward, the

reaction of subsequent neurones is dependent upon any selection from the totality of energy

delivered to the system, the response corresponds to an abstraction from that totality, so that

3 Ibid.


neural behavior is not only essentially propositional but abstract with respect to its precipitating

cause.4

Once again, McCulloch was describing the work he was pursuing with Pitts. The all-or-none law of

neural activity—namely, that neurons fire full-sized impulses or none at all, so that the effects of neural

activity depend only on the number of nerve impulses traveling through the nervous system—allowed

McCulloch and Pitts to use symbolic logic to describe

neural activity, so that inferential relations among propositions described causal streams of neural events.

This, for McCulloch, was enough to show that “neural behavior is essentially propositional” in a way that

explained mechanistically the “formal” aspect of the mind.

The sense in which neural behavior was essentially propositional was further clarified by

McCulloch in a letter to a neurophysiologist at the University of Chicago, Ralph Lillie. In February 1943,

he explained how “we might be able to see mechanistically the problem of ideas”:

[W]hat was in my mind was this: that neuronal activity bore to the world external to the

organism the relationship that a proposition bears to that to which it proposes. In this sense,

neuronal activity so reflects the external world as to account for that all-or-none characteristic of

our logic (and of our knowledge) which has been one of the greatest stumbling blocks to

epistemology. I think that for the first time we are in a position to regard scientific theory as the

natural consequence of the neuronal activity of an organism (here the scientist)… And this has

come about because the observed regularity—all-or-none of neurones, bears a one-to-one

correspondence to those peculiar hypothetical psychic atoms called psychons which preserve in

the unity of their occurrence both the all-or-none law and the property of reference characteristic

of propositions.5

Thanks to the all-or-none law, neural events stood in “one-to-one correspondence” to psychons, and just

like psychons and propositions, neuronal activity had “the property of reference.”

To solve the mind-body problem, McCulloch and Pitts formulated what they called a “logical

calculus of the ideas immanent in nervous activity” (McCulloch and Pitts 1943). As Frederic Fitch

pointed out in reviewing the paper for the Journal of Symbolic Logic, this was not quite a logical calculus

in the sense employed by logicians.6

A common misconception is that McCulloch and Pitts demonstrated that neural nets can compute

anything that Turing Machines can:

4 Letter by McCulloch to Fremont-Smith, dated June 24, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

5 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

6 Fitch 1944.


McCulloch and Pitts proved that a sufficiently large number of these simple logical devices,

wired together in an appropriate manner, are capable of universal computation. That is, a

network of such ‘linear threshold’ units with the appropriate synaptic weights can perform any

computation that a digital computer can, though not as rapidly or as conveniently.7

As we shall see, this is incorrect in two respects. First, McCulloch and Pitts did not prove any results

about what their nets can compute, although they claimed that there were results to prove; second,

McCulloch-Pitts nets—as McCulloch and Pitts themselves recognized—are computationally less powerful than

Turing Machines.

The computational power of McCulloch-Pitts nets is only one among many issues raised by their

theory. Although this paper is cited often, it has received little careful attention. The great historical

importance of this paper, and its common misrepresentation, warrant that we study it closely. The rest of

this chapter is devoted to it.

4.2 Motivation

The paper started with the rehearsal of some established neurophysiological facts: the nervous system

was a network of neurons connected through synapses; neurons sent to each other excitatory and

inhibitory pulses8; and each neuron had a threshold determining how many excitatory and inhibitory

inputs were necessary and sufficient to excite it at a given time.9

Then, the authors introduced the main premise of their theory: the identification of neuronal

signals with propositions. This was presumably what justified their title, which mentioned a calculus of

ideas immanent in nervous activity. They introduced this identification in a curious and rather obscure

7 Koch and Segev 2000, p. 1171.

8 According to Lettvin, an important source of the logic gate model of the neuron was the recent discovery by David

Lloyd of direct excitation and inhibition between single neurons: “it was not until David Lloyd’s work in 1939-41

that the direct monosynaptic inhibitory and excitatory actions of nervous pulses were demonstrated. This finding,

more than anything else, led Warren and Walter to conceive of single neurons as doing logical operations (a la

Leibnitz and Boole) and acting as gates” (Lettvin’s 1988, Foreward to the second edition of Embodiments of Mind,

cited by Heims 1991, pp. 233-234). In light of McCulloch’s professions of belief in his logical conception of the

nervous system since the early 1930s, it is unclear how crucial Lloyd’s work was in motivating McCulloch and Pitts,

besides providing experimental validation of some of their ideas.

9 McCulloch and Pitts 1943, pp. 19-21.


way, appealing not to any explicit motivation but to unstated “considerations” made by one of the

authors:

Many years ago one of us, by considerations impertinent to this argument, was led to conceive of

the response of any neuron as factually equivalent to a proposition which proposed its adequate

stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the

symbolic logic of propositions. The “all-or-none” law of nervous activity is sufficient to insure

that the activity of any neuron may be represented as a proposition. Physiological relations

existing among nervous activities correspond, of course, to relations among the propositions; and

the utility of the representation depends upon the identity of these relations with those of the logic

of propositions. To each reaction of any neuron there is a corresponding assertion of a simple

proposition. This, in turn, implies either some other simple proposition or the disjunction or the

conjunction, with or without negation, of similar propositions, according to the configuration of

the synapses upon and the threshold of the neuron in question.10

In light of what was said in Chapters 2 and 3, the author of the “considerations” was McCulloch, and the

considerations were those that led him to formulate first his theory of psychons, and then his theory of

information flow through ranks of neurons. A proposition that “proposes a neuron’s adequate stimulus”

was a proposition that said that the neuron received a certain input at a certain time. The authors did not

explain what they meant by “factual equivalence” between neuronal pulses and propositions, but their

language suggested they meant both that neuronal pulses were represented by propositions, and that

neuronal pulses had propositional content.

The theory was divided into two parts: one dealing with nets without closed loops of neural

activity, which in this paper are referred to as “circles,” the other dealing with nets with circles (cyclic

nets). The authors pointed out that the nervous system contains many circular, “regenerative” paths.11

The term “circle” may have been borrowed from Turing (1936-7), who had called a Turing Machine “circular” if its computation eventually stops producing output, and “circle-free” if its computation goes on producing output forever.

4.3 Assumptions

In formulating the theory, McCulloch and Pitts made the following five assumptions:

10 Ibid., p. 21; emphasis added.

11 Ibid., p. 22.


1. The activity of the neuron is an “all-or-none” process.

2. A certain fixed number of synapses must be excited within the period of latent addition in

order to excite a neuron at any time, and this number is independent of previous activity and

position on the neuron.

3. The only significant delay within the nervous system is synaptic delay.

4. The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that

time.

5. The structure of the net does not change with time.12

These assumptions constituted an idealization of the known properties of neural nets.

Assumption (1) was simply the all-or-none law: neurons were believed to either pulse or be at rest. As to

(2), it was not strictly true, but in many cases it was considered a good approximation. As to (3), this was

probably the least explicit and least physiologically justified assumption of the theory. Under the heading

of “synaptic delay,” McCulloch and Pitts assumed that the timing of the activity of neural nets was

uniformly discrete, such that any neural event in a neural net occurred within one time interval of fixed

duration. This assumption had the effect of discretizing the continuous temporal dynamics of the net, so

that logical functions of discrete states could be used to describe the transitions between neural events.

As to (4) and (5), McCulloch and Pitts admitted that they were false of the nervous system. However, they showed that, under the other assumptions, nets that do not satisfy (4) and (5) are functionally equivalent to nets that do.13
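As a modern illustration (not part of McCulloch and Pitts's own presentation), assumptions (1), (2), and (4) can be captured in a few lines of code; the function name and interface here are hypothetical:

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """One McCulloch-Pitts formal neuron at a single discrete time step.

    excitatory, inhibitory: lists of 0/1 inputs arriving at the neuron's synapses.
    Returns 1 (pulse) or 0 (rest), per the all-or-none law (assumption 1).
    """
    if any(inhibitory):
        # Assumption (4): an active inhibitory synapse absolutely prevents firing.
        return 0
    # Assumption (2): a fixed number of excited synapses (the threshold) must be
    # active, independently of previous activity and position on the neuron.
    return 1 if sum(excitatory) >= threshold else 0
```

For instance, a neuron with threshold 2 fires on two simultaneous excitatory pulses (`mp_neuron([1, 1], [], 2)` returns 1) but not when any inhibitory input is active (`mp_neuron([1, 1], [1], 2)` returns 0). Assumption (3), the uniform synaptic delay, corresponds to stepping such functions forward one fixed time interval at a time.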

McCulloch and Pitts were perfectly aware that the neuron-like elements in their theory were quite

distant from real neurons: “Formal neurons were deliberately as impoverished as possible.”14 In a letter

12 Ibid., p. 22.

13 Ibid., pp. 29-30.

14 McCulloch 1974, p. 36.


written to a colleague asking for clarification after a public presentation of the theory, McCulloch wrote

as follows:

[W]e in our description restricted ourselves to the regular behavior of the nervous system,

knowing full well that irregularities can be and are frequently brought about by physical and

chemical alterations of the nervous system. As a psychiatrist, I am perhaps more interested in

these than in its regular activity, but they lead rather to a theory of error than a theory of

knowledge, and hence were systematically excluded from the description.15

In McCulloch’s eyes, the differences between real neurons and the elements employed in his theory were

inessential. His goal was not to understand neural mechanisms per se, but rather to explain how

something close enough to a neural mechanism could exhibit “knowledge,” the kind of “ideational,”

“rational,” “formal,” or “logical” aspect that was associated with the mind. McCulloch’s goal was to

offer, for the first time, an explanation of the mind in terms of neural-like mechanisms.

4.4 Nets Without Circles

McCulloch and Pitts’s technical language was cumbersome; here their theory is given in a slightly

streamlined form that makes it easier to follow. The neurons of a net N are denoted by c1, c2, … cn. A

primitive expression of the form Ni(t) means that neuron ci fires at time t. Expressions of the form Ni(t)

can be combined by means of logical connectives to form complex expressions that describe the behavior

of different neurons at certain times. For example, N1(t)&N2(t) means that neurons c1 and c2 fire at time t,

N1(t-1)∨N2(t-2) means that either c1 fires at t-1 or c2 fires at t-2 (or both), etc. These complex expressions

can in turn be combined by the same logical connectives. As well-formed combinations, McCulloch and

Pitts allowed only the use of conjunction (A&B), disjunction (A∨B), conjunction and negation (A&~B),

and a special connective S that shifts the temporal index of an expression backwards in time, so that

S(Ni(t)) = Ni(t-1). A complex expression formed from a number of primitive expressions N1(t), … Nn(t)

15 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

Cf. also Lettvin:

The Logical Calculus, McCulloch knew, was not even a caricature of any existing nervous process. Indeed

he made that very clear at the time of writing. But is [sic] was a possible and useful assembly of

axiomatized neurons, and that seemed to him a far greater accomplishment than a true description of any

definitely known neuronal circuit (of which none then existed) (Lettvin 1989b, p. 518).


by means of the above connectives is denoted by Expj(N1(t), … Nn(t)). In any net without circles, there

are some neurons with no axons inputting on them; these are called afferent neurons.
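The notation can be mimicked in code, which may help in following it. In this sketch (the representation is an illustrative choice, not McCulloch and Pitts's), a net's activity is recorded as a set of (neuron, time) pairs, and the allowed connectives combine primitive expressions:

```python
# Firing record: the set of (neuron index, time) pairs at which a pulse occurs.
record = {(1, 5), (2, 5), (1, 4)}

def N(i, t):
    """Primitive expression N_i(t): true iff neuron c_i fires at time t."""
    return (i, t) in record

# The well-formed combinations allowed by the theory, applied to sample primitives:
conj = N(1, 5) and N(2, 5)          # N1(5) & N2(5)
disj = N(1, 4) or N(2, 3)           # N1(4) v N2(3)
conj_neg = N(1, 5) and not N(2, 4)  # N1(5) & ~N2(4)
shifted = N(1, 5 - 1)               # S(N1(5)) = N1(4): the temporal shift
```

With the record above, all four sample expressions come out true; changing the record changes their truth values, just as changing a net's activity changes which expressions describe it.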

The two main technical problems McCulloch and Pitts wanted to solve were “to calculate the

behavior of any net, and to find a net which will behave in a specified way, when such a net exists.”16 In

terms of the theory, the problems can be formulated as follows:

First problem: given a net, find a class of expressions C such that for every neuron ci, in C there

is a true expression of the form

Ni(t) if and only if Expj(Ni-g(t-1), … Ni-2(t-1), Ni-1(t-1)),

where neurons ci-g, … ci-2, and ci-1 have axons inputting ci.

The significance of this expression is that it describes the behavior of any (non-afferent) neuron in terms

of the behavior of the neurons that are afferent to it. If a class C of such expressions is found,

propositional logic can describe the behavior of any non-afferent neuron in the net in terms of the

behavior of the neurons afferent to it.

Second problem: given an expression of the form

Ni(t) if and only if Expj(Ni-g(t-1), … Ni-2(t-1), Ni-1(t-1)),

find a net for which it is true.

McCulloch and Pitts showed that these problems were easily solved. To solve the first problem, they

showed how to write an expression describing the relation between the firing of any neuron in a net and

the inputs it receives from its afferent neurons. To solve the second problem, they showed how to

construct nets that satisfy their four combinatorial schemes (conjunction, disjunction, conjunction-cum-

negation, and temporal predecessor), giving diagrams that show the connections between neurons that

satisfy each scheme (figure 4-1). Then, by induction on the size of the nets, all expressions formed by

those combinatorial schemes are realizable by McCulloch-Pitts nets.17

16 Ibid., p. 24.

17 Their actual proof was not quite a mathematical induction because they didn’t show how to combine nets of

arbitrary size, but the technical details are unimportant here.
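In modern terms, the four schemes are threshold-gate constructions. The following sketch (names and interface hypothetical) realizes each scheme with the idealized neuron of section 4.3:

```python
def fire(excitatory, inhibitory, threshold):
    """Idealized neuron: absolute inhibition, fixed threshold, all-or-none output."""
    return 0 if any(inhibitory) else int(sum(excitatory) >= threshold)

# One-neuron nets realizing each combinatorial scheme:
def conjunction(a, b):    return fire([a, b], [], 2)  # A & B: threshold 2
def disjunction(a, b):    return fire([a, b], [], 1)  # A v B: threshold 1
def conj_negation(a, b):  return fire([a], [b], 1)    # A & ~B: b is inhibitory
def predecessor(a):       return fire([a], [], 1)     # S: one synaptic delay
```

Composing such units yields a net for any expression built from the four schemes, mirroring the inductive construction McCulloch and Pitts sketched.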


Figure 4-1. Diagrams of McCulloch and Pitts nets.

By giving diagrams of nets that satisfy simple logical relations between propositions and by showing how to combine them to satisfy more complex relations, McCulloch and Pitts

developed a powerful technique for designing circuits that satisfy given logical functions by using a few

primitive building blocks.18

McCulloch and Pitts’s goal was to explain mental phenomena. As an example, they offered an

explanation of a well-known heat illusion by constructing an appropriate net. A cold object touching the

skin normally causes a sensation of cold, but if it is held for a very brief time and then removed, it can

cause a sensation of heat. In designing their net, McCulloch and Pitts reasoned as follows. They started

from the known physiological fact that there are different kinds of receptors affected by heat and cold,

and they assumed that there are neurons whose activity “implies a sensation” of heat.19 Then, they

assigned one neuron to each function: heat reception, cold reception, heat sensation, and cold sensation.

Finally, they observed that the heat illusion corresponded to the following relations between three

18 This is the main aspect of their theory used by von Neumann in describing the design of digital computers (see the

next chapter). Today, McCulloch and Pitts’s technique is part of logic design, an important area of computer design

devoted to designing digital circuits for digital computers. The building blocks of contemporary logic design are

called logic gates. In modern terminology, McCulloch and Pitts’s nets are logic gates and combinations of logic

gates. For more on logic and computer design, see Chapter 10.

19 Ibid., p. 27.


neurons: the heat-sensation neuron fires either in response to the heat receptor or to a brief activity of the

cold receptor (figure 4-2).

Figure 4-2. Net explaining heat illusion. Neuron 3 (heat sensation) fires if and only if it receives two inputs, represented by the lines terminating on its body. This happens when either neuron 1 (heat reception) fires or neuron 2 (cold reception) fires once and then immediately stops firing. When neuron 2 fires twice in a row, the intermediate (unnumbered) neurons excite neuron 4 (cold sensation) rather than neuron 3, generating a sensation of cold.
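The behavior the caption describes can be written directly as time-shifted expressions. In the following sketch the particular delays are an illustrative reconstruction, with the intermediate neurons absorbed into the time shifts:

```python
def sensations(heat, cold, t):
    """heat, cold: sequences of 0/1 receptor activity indexed by time step.

    Returns (neuron 3, neuron 4): heat sensation and cold sensation at time t.
    Requires t >= 3 so that all referenced past steps exist.
    """
    # Neuron 3 (heat sensation) fires if the heat receptor fired one step ago,
    # or the cold receptor fired briefly: active at t-3 but silent at t-2.
    n3 = heat[t - 1] or (cold[t - 3] and not cold[t - 2])
    # Neuron 4 (cold sensation) fires if the cold receptor fired persistently.
    n4 = cold[t - 2] and cold[t - 1]
    return int(n3), int(n4)
```

A brief cold touch (`cold = [1, 0, 0, 0]`) thus yields a heat sensation at t = 3, while a sustained one (`cold = [1, 1, 1, 1]`) yields a cold sensation, reproducing the illusion.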

McCulloch and Pitts used this example for a general observation about the relation between

perception and the world:

This illusion makes very clear the dependence of the correspondence between perception and the

“external world” upon the specific structural properties of the intervening nervous net.20

Then, they pointed out that, under other assumptions about the behavior of the heat and cold receptors, the

same illusion could be explained by different nets (ibid., p. 28).

4.5 Nets With Circles, Computation, and the Church-Turing Thesis

The problems for nets with circles are analogous to those for nets without circles: given the behavior of a

neuron’s afferents, find a description of the behavior of the neuron; and find the class of expressions and a

method of construction such that for any expression in the class, a net can be constructed that satisfies the

expression. The authors pointed out that the theory of nets with circles is more difficult than the theory of

nets without circles. This is because activity around a circle of neurons can continue for an indefinite

amount of time, hence expressions of the form Ni(t) may have to refer to times that are indefinitely remote

20 Ibid., p. 28.


in the past. For this reason, the expressions describing nets with circles are more complicated, involving

quantification over times. McCulloch and Pitts offered solutions to the problems of nets with circles, but

their treatment of this part of the theory was very obscure, admittedly sketchy,21 and contained some

errors that make it hard to follow.22
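A toy simulation (illustrative, not from the paper) shows why circles raise this difficulty: in a two-neuron circle, a single external pulse can sustain activity indefinitely, so the truth of Ni(t) may depend on arbitrarily remote past times:

```python
def run_loop(external_input, steps):
    """Two neurons in a circle: each fires iff the other fired one step earlier;
    neuron a also fires on an external pulse. Returns the firing history."""
    a, b = 0, 0
    history = []
    for t in range(steps):
        a, b = int(external_input(t) or b), a
        history.append((a, b))
    return history
```

With a single pulse at t = 0 (`run_loop(lambda t: t == 0, 10)`), activity reverberates around the circle at every subsequent step, so no fixed lookback window suffices to describe the net's state; hence the quantification over times.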

At the end of this section, McCulloch and Pitts drew the connection between their nets and

computation:

It is easily shown: first, that every net, if furnished with a tape, scanners connected to afferents,

and suitable efferents to perform the necessary motor-operations, can compute only such numbers

as can a Turing machine; second, that each of the latter numbers can be computed by such a net; and that nets with circles can compute, without scanners and a tape, some of the numbers the machine can, but no others, and not all of

them. This is of interest as affording a psychological justification of the Turing definition of

computability and its equivalents, Church’s λ-definability and Kleene’s primitive recursiveness:

If any number can be computed by an organism, it is computable by these definitions, and

conversely.23

This brief passage is the only one mentioning computation. By stating that McCulloch-Pitts nets

compute, this passage provided the first known published link between computation and brain theory. It

was a pivotal statement in the history of computationalism.

It is often said that McCulloch and Pitts proved that their nets could compute anything that Turing

Machines can compute (e.g., Koch and Segev 2000). This misconception was initiated and propagated by

McCulloch himself. For instance, in summarizing the significance of their paper, McCulloch wrote to a

colleague:

[T]he original paper with Pitts entitled “A Logical Calculus of Ideas Immanent in Nervous

Activity” … sets up a calculus of propositions subscripted for the time of their appearance for any

net handling all-or-none signals, and shows that such nets can compute any computable number

or, for that matter, do anything any other net can do by the way of pulling consequences out of

premises.24

21 Ibid., p. 34.

22 Every commentator points this out, starting with Fitch 1944, p. 51. See also Arbib 1989. McCulloch and Pitts’s