
COMPUTATIONS AND COMPUTERS IN THE SCIENCES OF MIND AND BRAIN

by

Gualtiero Piccinini

BA, Università di Torino, 1994

Submitted to the Graduate Faculty of

University of Pittsburgh in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

University of Pittsburgh

2003

UNIVERSITY OF PITTSBURGH

FACULTY OF ARTS AND SCIENCES

This dissertation was presented

by

Gualtiero Piccinini

It was defended on

June 20th, 2003

and approved by

John Earman, University Professor of History and Philosophy of Science

G. Bard Ermentrout, Professor of Mathematics

Paul Griffiths, Professor of History and Philosophy of Science

John D. Norton, Professor of History and Philosophy of Science

Peter K. Machamer, Professor of History and Philosophy of Science

Dissertation Director


To my parents and sisters


Copyright by Gualtiero Piccinini

2003

Section 3.1 of the present work is an adapted version of section 2 of Gualtiero Piccinini, “Alan Turing and the Mathematical Objection,” Minds and Machines 13(1), pp. 23-48. Copyright 2003 by Kluwer, reproduced by permission.

The following archives have given permission to use extended quotations from unpublished work. From the Warren S. McCulloch Papers, American Philosophical Society Library. Copyright by the American Philosophical Society Library. From the Norbert Wiener Papers, Institute Archives and Special Collections, MIT Libraries. Copyright by the MIT Libraries.


COMPUTATIONS AND COMPUTERS IN THE SCIENCES OF MIND AND BRAIN

Gualtiero Piccinini, PhD

University of Pittsburgh, 2003

Computationalism says that brains are computing mechanisms, that is, mechanisms that perform computations. At present, there is no consensus on how to formulate computationalism precisely or adjudicate the dispute between computationalism and its foes, or between different versions of computationalism. An important reason for the current impasse is the lack of a satisfactory philosophical account of computing mechanisms. The main goal of this dissertation is to offer such an account.

I also believe that the history of computationalism sheds light on the current debate. By tracing different versions of computationalism to their common historical origin, we can see how the current divisions originated and understand their motivation. Reconstructing debates over computationalism in the context of their own intellectual history can contribute to philosophical progress on the relation between brains and computing mechanisms and help determine how brains and computing mechanisms are alike, and how they differ. Accordingly, my dissertation is divided into a historical part, which traces the early history of computationalism up to 1946, and a philosophical part, which offers an account of computing mechanisms.

The two main ideas developed in this dissertation are that (1) computational states are to be identified functionally, not semantically, and (2) computing mechanisms are to be studied by functional analysis. The resulting account of computing mechanisms, which I call the functional account of computing mechanisms, can be used to identify computing mechanisms and the functions they compute. I use the functional account of computing mechanisms to taxonomize computing mechanisms based on their different computing power, and I use this taxonomy to taxonomize different versions of computationalism based on the functional properties that they ascribe to brains. By doing so, I begin to tease out empirically testable statements about the functional organization of the brain that different versions of computationalism are committed to. I submit that when computationalism is reformulated in the more explicit and precise way I propose, the disputes about computationalism can be adjudicated on the grounds of empirical evidence from neuroscience.


PREFACE

In the 1940s, inspired by the birth and development of modern computers, Alan Turing, Warren McCulloch, Norbert Wiener, John von Neumann, and many others developed a new theory of the brain, here called computationalism. Computationalism says that brains are computing mechanisms, that is, mechanisms that perform computations. Computationalism expands the old idea that reasoning is a form of computation (from Hobbes to formal logic) into the stronger idea that all cognitive processes or even all neural processes are a form of computation. In the past fifty years, computationalism has shaped several fields. There are canonical explications of computationalism in computer science (Newell and Simon 1976), psychology (Pylyshyn 1984, Rumelhart and McClelland 1986), neuroscience (Churchland, Koch, and Sejnowski 1990), and philosophy (Fodor 1975).

Computationalists agree that brains are computing mechanisms, but in calling them computing mechanisms, they mean such radically different things that they often talk past one another. For some, the brain is functionally organized like the hardware of a desktop computer, on which different programs can run (e.g., Newell and Simon, Pylyshyn, and Fodor). For others, the brain is a set of networks of neurons, each of which computes its own function (e.g., Rumelhart and McClelland). Still others think that computations take place in the dendrites of single neurons (e.g., Koch 1999). Some investigators build computer simulations of cognitive phenomena studied by psychologists and argue that the neurological details are irrelevant to understanding the computational organization of the brain (e.g., Newell 1990, Fodor and Pylyshyn 1988). Other investigators model neurological phenomena described by neurophysiologists and maintain that, on the contrary, it is simulating cognitive phenomena that has nothing to do with the computational organization of the brain (e.g., Koch 1999, Dayan and Abbott 2001).

Philosophers interested in the sciences of mind and brain have generally divided into computationalists and anti-computationalists. The former believe computationalism to be the best scientific theory of the brain, or even the only genuinely scientific theory of the brain. They have offered explications of computationalism and defended them on the grounds that they solve, or contribute to solving, important philosophical problems. Anti-computationalists often believe that computationalism is absurd or false on a priori grounds, and have offered a number of objections to it (e.g., Searle 1980, Penrose 1994). But anti-computationalists fiercely disagree on what is misguided about computationalism.

At present, there is no consensus on how to formulate computationalism precisely or adjudicate the dispute between computationalism and its foes, or between different versions of computationalism. An important reason for the current impasse is the lack of a satisfactory philosophical account of computing mechanisms. The main goal of this dissertation is to offer such an account. I also believe that the history of computationalism sheds light on the current debate. By tracing different versions of computationalism to their common historical origin, we can see how the current divisions originated and understand their motivation. Reconstructing debates over computationalism in the context of their own intellectual history can contribute to philosophical progress on the relation between brains and computing mechanisms and help determine how brains and computing mechanisms are alike, and how they differ. Accordingly, my dissertation is divided into a historical part, which traces the early history of computationalism, and a philosophical part, which offers an account of computing mechanisms.

A good account of computing mechanisms can be used to express more explicitly and precisely the content of different versions of computationalism, thereby allowing a more rigorous assessment of the evidence for or against different versions of computationalism. Besides grounding discussions of computationalism, there is independent motivation for an account of computing mechanisms. Given the importance of computers within contemporary society, philosophical attention has been directed at them in recent years (Floridi 1999). An account of computing mechanisms contributes to the emerging field of philosophy of computation.

The first step towards an account of computing mechanisms is to put on the table the relevant notion of computation. There is consensus that the notion of computation that is relevant to computationalism, as well as to modern computer science and technology, is the one analyzed by Alan Turing in terms of Turing Machines. Turing Machines are a mathematical formalism for manipulating symbols in accordance with fixed instructions so as to generate certain output strings of symbols from input strings of symbols. Turing’s work on Turing Machines, together with work by other authors in the 1930s on the same notion of computability, led to the development of the classical mathematical theory of computability. The notion of Turing Machine, together with some important results of computability theory, is briefly reviewed in Chapter 1.
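The operation of this formalism can be sketched concretely. Below is a minimal Turing machine simulator; the transition-table encoding, the blank symbol `'_'`, and the example program that rewrites a string of 1s as 0s are illustrative choices of mine, not machinery drawn from the dissertation.

```python
# Minimal Turing machine simulator (an illustrative sketch, not from the text).
# A program is a finite table mapping (state, scanned symbol) to
# (symbol to write, head move, next state); moves are -1 (left) or +1 (right).

def run_tm(program, tape, state="q0", halt_state="qH", max_steps=10_000):
    """Run a Turing machine program on an input string; return the
    non-blank contents of the tape when the machine halts."""
    cells = dict(enumerate(tape))  # sparse tape; missing cells read as blank
    head = 0
    for _ in range(max_steps):
        if state == halt_state:
            break
        scanned = cells.get(head, "_")
        write, move, state = program[(state, scanned)]
        cells[head] = write
        head += move
    used = sorted(k for k, v in cells.items() if v != "_")
    return "".join(cells[k] for k in used)

# Example program: replace every '1' with '0', halting at the first blank.
flip = {
    ("q0", "1"): ("0", +1, "q0"),
    ("q0", "_"): ("_", +1, "qH"),
}
```

For instance, `run_tm(flip, "111")` halts with `"000"` on the tape. However simple, this is the shape of the formalism: a fixed, finite instruction table generating output strings from input strings.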

By building on Turing’s notion of computability, Warren McCulloch and others developed computationalism in the 1940s. Chapters 2 through 6 describe how this happened. Chapter 2 describes the pre-Turing background to McCulloch’s work. Chapter 3 describes McCulloch’s efforts to formulate a mechanistic theory of mind and the impact of Turing’s work on those efforts. Chapter 4 is a detailed analysis of the first—and in my view the most influential—formulation of contemporary computationalism, McCulloch and Pitts’s “A Logical Calculus of the Ideas Immanent in Nervous Activity.” Chapters 5 and 6 describe early discussions (1943-1946) between McCulloch and others pertaining to brains, computers, and their mutual relations. This historical background paves the way for the philosophical part of this dissertation, which is contained in the remaining four chapters.

The foundation of the classical theory of computability is the thesis that any function that is effectively calculable (in an intuitive sense) is computable by some Turing Machine. This thesis, called the Church-Turing thesis (CT), has given rise to three relevant streams of philosophical literature. First, some philosophers of mind have assumed CT to be true and used it as a priori evidence for computationalism. Second, some logicians and philosophers of mathematics have debated whether CT is true and what its proper scope is. And third, some physicists and philosophers of physics have debated whether CT applies to the physical world. These three streams of literature have proceeded largely independently of one another, with regrettable effects. In particular, arguments from CT to computationalism have not taken into account the best scholarship on CT’s proper scope and the way CT applies to the physical world.

To remedy this situation, Chapter 7 begins to bring these three streams of literature together. It clarifies the proper scope of CT and the sense in which CT applies to the physical world, and then assesses arguments from CT to computationalism. It concludes that all such arguments are unsound, because CT—when properly understood—does not establish whether cognitive or neural processes (or any other processes) are computations.

After this assessment of CT’s relevance to computationalism, the road is clear to discuss two philosophical topics that have heavily affected both discussions of computationalism and the way computing mechanisms are understood in the philosophical literature. These topics are the mind-body problem and the problem of giving a naturalistic explanation of intentionality. The next two chapters are devoted to how computationalism and computing mechanisms relate to these topics.

First, I discuss the relevance of the mind-body problem to computationalism. Computational functionalism is the thesis that the mind is the software of the brain, which entails that the brain is a computing mechanism (for running mental programs). Computational functionalism was proposed as a solution to the mind-body problem (Putnam 1967b), and became very influential. As a result, many philosophers became convinced that computationalism is a consequence of this popular solution to the mind-body problem.

In Chapter 8, I argue that this is not the case. First, I employ the language of functional analysis to explicate the idea that the mind is a program running on the brain. Then I distinguish functionalism, namely the thesis that the mind is the functional organization of the brain given by some functional analysis of the brain (which may or may not ascribe computations to it), from the stronger computational functionalism, namely the thesis that the functional organization of the brain is given by a (mental) computer program. I argue that none of the arguments given for functionalist (including computational functionalist) solutions to the mind-body problem offer any support for the conclusion that the mind is a program running on the brain. Accordingly, I argue that this latter view should be considered as an empirical hypothesis about the kind of functional analysis that applies to minds and brains. As a consequence, computational functionalism should be considered as the conjunction of a functionalist solution to the mind-body problem and the empirical hypothesis that the brain is functionally organized in accordance with a mental program.


After the mind-body problem, I discuss the relevance to computationalism of the naturalistic explanation of intentionality, i.e., the problem of finding a naturalistic explanation for the content of mental states. The semantic view of computational states is the view that inputs, internal states, and outputs of computing mechanisms have their content essentially, i.e., computational inputs, outputs, and internal states can be identified only by reference to their semantic properties. Almost all computationalist philosophers accept the semantic view of computational states. If the semantic view of computational states is correct, it gives some hope for a naturalistic explanation of intentionality. For on the one hand, no one doubts that computing mechanisms are natural objects that can be given naturalistic explanations. On the other hand, the semantic view of computational states entails that computationalism ascribes to the brain states that are essentially endowed with content. If the essentially contentful states ascribed to the brain by computationalism coincide with the mental states whose intentionality many philosophers would like to explain naturalistically, then computationalism offers the basis for a naturalistic explanation of intentionality. This has acted as a powerful motivation for computationalism.

In Chapter 9, I reject the semantic view of computational states in favor of the view that computational inputs, outputs, and internal states are identified by their functional properties, as described by a specific kind of functional analysis. This view fits better than the semantic view with the practices of computability theorists and computer designers, but it undercuts one of the main traditional philosophical motivations for computationalism.

The two main ideas developed in Chapters 8 and 9 are that (1) computational states are to be identified functionally, not semantically, and (2) computing mechanisms are to be studied by functional analysis. These ideas come to fruition in Chapter 10, where the relevant kind of functional analysis is spelled out. The resulting account of computing mechanisms, which I call the functional account of computing mechanisms, can be used to identify computing mechanisms and the functions they compute. I use the functional account of computing mechanisms to taxonomize computing mechanisms based on their different computing power, and I use this taxonomy to taxonomize different versions of computationalism based on the functional properties that they ascribe to brains. By doing so, I begin to tease out empirically testable statements about the functional organization of the brain that different versions of computationalism are committed to. I submit that when computationalism is reformulated in the more explicit and precise way I propose, the disputes about computationalism can be adjudicated on the grounds of empirical evidence from neuroscience.


ACKNOWLEDGEMENTS

My greatest debt is to Peter Machamer, my advisor. The ways in which he helped me are too numerous to enumerate. My committee members, John Earman, Bard Ermentrout, Paul Griffiths, and John Norton, gave me the right balance of constructive criticism and support. John Norton also guided me through my initial search for a dissertation project.

I wrote my first paper on this topic for a class I took with Ken Manders. He kindly advised me to pursue my investigations further. At the time, Carl Craver was writing his dissertation on mechanisms in neuroscience. I jokingly asked Carl to do a good job so I could use his conclusions in my research. I’m pleased to say that I was influenced by Carl’s dissertation as well as his subsequent work (e.g., Machamer, Darden and Craver 2000, Craver 2001a). Another important early influence on my dissertation was Wilfried Sieg’s work on the history and philosophy of computation (e.g., Sieg 1994). In the last decade, Sieg and Jack Copeland have done more than anyone before them to clarify the Church-Turing thesis and fight the misconceptions about it that pervade the philosophy of mind literature. My work, especially in Chapter 7, builds on theirs. While working on Chapters 8 through 10, I gained the most insight into computational theories of mind and brain by reading the works of Jerry Fodor. If I have succeeded at all in moving the debate forward, I owe it to a large extent to how I responded to Fodor’s writings.

A number of people have given me feedback on parts of my dissertation or related material: Bob Brandom, Jack Copeland, Carl Craver, Reinaldo Elugardo, Uljana Feest, Clark Glymour, Rick Grush, Graham Hubbs, Ken Manders, Diego Marconi, Jim Moor, Bob Olby, Anastasia Panagopoulos, Elizabeth Paris, Merrilee Salmon, Andrea Scarantino, Susan Schneider, Oron Shagrir, Wilfried Sieg, Susan Sterrett, Julie Yoo, and Julie Zahle. If I have forgotten anyone, I apologize.

Other people helped me by conversing or corresponding with me on topics related to my dissertation. They include Erik Angner, Robert S. Cox, John Haugeland, Lance Lugar, Valerie-Anne Lutz, Jay McClelland, Wendy Parker, and Martha Pollack. Again, I apologize to anyone I inadvertently omitted.

As a graduate student unknown to them, I wrote to or approached Brian Davies, Daniel Dennett, Jerry Fodor, Gilbert Harman, Hilary Putnam, Dave Touretzky, and Bernard Widrow concerning my research. They generously responded with helpful remarks for which I am grateful.

I have presented parts of my dissertation to various philosophical audiences. Thanks to those present for their attention and responses.

I’d like to thank all my philosophy teachers from high school to college to graduate school, as well as all others who taught me the little I know about the ways of philosophy and of life. Without their example and encouragement, I would not be where I am.

Finally, I am grateful to Becka Skloot for her help and support during all these years.

My research was supported in part by the National Science Foundation under Grant No. SES-0216981, by an Adelle and Erwin Tomash Fellowship, by an Andrew Mellon Predoctoral Fellowship, and by a Regione Sardegna Doctoral Scholarship. I am grateful to those institutions, the administrators who run them, the politicians who back them, and the taxpayers who ultimately fund them. Any opinions, findings, conclusions, and recommendations expressed in this dissertation are those of the author and do not necessarily reflect the views of these funding institutions.


TABLE OF CONTENTS

PREFACE
ACKNOWLEDGEMENTS
1 COMPUTABILITY
1.1 Effective Calculability
1.2 Computability Theory
1.2.1 Notation
1.2.2 Recursive Functions
1.2.3 Turing Machines
1.2.4 Gödel Numbers of TM Programs
1.2.5 Universal TM Programs
1.2.6 Unsolvability of the Halting Problem
1.3 The Church-Turing Thesis
2 WARREN MCCULLOCH ON LOGIC, MIND, AND BRAIN, CA. 1920-1936
2.1 Introduction
2.2 Background
2.3 Logic, Epistemology, and the Brain
2.4 Strychnine Neuronography and the Functional Organization of the Brain
3 TOWARDS A THEORY OF THE BRAIN, 1936-1942
3.1 What Computing Mechanisms Can Do
3.2 Teleological Mechanisms
3.3 Walter Pitts
3.4 McCulloch Meets Pitts
4 BRAINS COMPUTE THOUGHTS, 1943
4.1 A Mechanistic Theory of Mind
4.2 Motivation
4.3 Assumptions
4.4 Nets Without Circles
4.5 Nets With Circles, Computation, and the Church-Turing Thesis
4.6 “Consequences”
4.7 The Historical Significance of McCulloch-Pitts Nets
5 FROM BRAINS TO COMPUTERS AND BACK AGAIN, 1943-1945
5.1 Migrations
5.2 Brains and Computers
5.3 Electronic Brains
5.4 A Research Program
5.5 Preparing for the End of the War
6 THE NEW SCIENCE OF BRAINS AND MACHINES, 1946
6.1 The First Macy Meeting
6.2 The Next Generation
6.3 More Meetings
6.4 Von Neumann’s New Thoughts on Automata
6.5 Importance of the Computationalist Network
7 COMPUTATIONALISM AND THE CHURCH-TURING THESIS
7.1 Introduction
7.2 The Church-Turing Thesis
7.2.1 The Canonical View: CT is True but Unprovable (Kleene 1952, § 62, § 67)
7.2.2 Optimistic View 1: CT is True and Provable (Mendelson 1990)
7.2.3 Optimistic View 2: CT is True Because Entailed by Physical Facts (Deutsch 1985)
7.2.4 The Gandy-Sieg View (Gandy 1980; Sieg 2000)
7.2.5 Pessimistic View 1: CT is False Because Contradicted by Non-uniform Effective Procedures (Kalmár 1959)
7.2.6 Pessimistic View 2: CT is False Because Contradicted by Non-mechanical Effective Procedures (Gödel 1965, 1972)
7.2.7 Pessimistic View 3: CT is False Because Contradicted by Physical Facts (Hogarth 1994)
7.2.8 Other Objections to CT
7.3 Physical CT
7.3.1 Modest Physical CT
7.3.2 Hypercomputation
7.3.3 Bold Physical CT
7.3.4 Between Modest and Bold Physical CT
7.3.4.1 Mathematical Tractability
7.3.4.2 Computational Approximation
7.4 Computationalism and CT
7.4.1 By Physical CT
7.4.1.1 By Modest Physical CT
7.4.1.2 By Bold Physical CT
7.4.2 Cognition as an Effective Procedure
7.4.3 Effective Procedures as a Methodological Constraint on Psychological Theories
7.5 Conclusion
8 COMPUTATIONAL FUNCTIONALISM
8.1 Introduction
8.2 Multiple Realizability and Computational Functionalism
8.3 Multiple Realizability, Functional Analysis, and Program Execution
8.3.1 Multiple Realizability and Functional Analysis
8.3.2 Multiple Realizability and Explanation by Program Execution
8.3.3 Functional Analysis and Program Execution
8.3.4 Computational Functionalism Revisited
8.4 Origin of Computational Functionalism
8.4.1 The Brain as a Turing Machine
8.4.2 The Analogy between Minds and Turing Machines
8.4.3 Psychological Theories, Functional Analysis, and Programs
8.4.4 Functionalism
8.4.5 Computational Functionalism
8.4.6 Functional Analysis and Explanation by Program Execution
8.5 Later Developments of Functionalism
8.6 Is Everything a TM?
8.7 How Did This Happen?
8.8 Functionalism and Computationalism
9 COMPUTATION AND CONTENT
9.1 Introduction
9.2 The Functional View of Computational States
9.3 Against the Semantic View of Computational States
9.4 Origins of the Semantic View of Computational States
9.4.1 Content in Early Computationalism
9.4.2 Conceptual Role Semantics
9.4.3 Computationalism and the Philosophy of Mind
9.4.4 The Semantic View of Computational States in the Philosophy of Mind
9.5 Computationalism and Theories of Content
9.5.1 CTM meets Conceptual Role Semantics
9.5.2 CTM meets Interpretational Semantics
9.5.3 CTM meets Informational and Teleological Semantics
9.5.4 CTM meets Intentional Eliminativism
9.6 CTM With or Without Semantics
9.7 Two Consequences
10 COMPUTING MECHANISMS
10.1 Introduction
10.1.1 Desiderata for an Account of Computing Mechanisms
10.2 The Functional Account of Computing Mechanisms
10.2.1 Primitive Computing Components
10.2.1.1 Comparison with Cummins’s Account
10.2.2 Primitive Non-computing Components
10.2.3 Complex Computing Components
10.2.3.1 Combinational Computing Components
10.2.3.2 Arithmetic Logic Units
10.2.3.3 Sequential Computing Components

10.2.3.4 Multiplication and Division Components................................................... 268

10.2.3.5 The Computation Power of Complex Computing Components................. 269

10.2.4 Complex Non-computing Components .............................................................. 271

10.2.4.1 Memory Units............................................................................................. 271

10.2.4.2 Datapaths..................................................................................................... 272

10.2.4.3 Control Units............................................................................................... 274

10.2.4.4 Input and Output Devices ........................................................................... 276

10.2.4.5 Internal Semantics....................................................................................... 276

10.2.5 Calculators .......................................................................................................... 278

10.2.6 Computers........................................................................................................... 280

10.2.6.1 Programmability ......................................................................................... 281

10.2.6.2 Stored-Program Computers ........................................................................ 283

10.2.6.3 Special-Purpose, General-Purpose, or Universal........................................ 285

10.2.6.4 Functional Hierarchies................................................................................ 286


10.2.6.5 Digital vs. Analog ....................................................................................... 289

10.2.6.6 Serial vs. Parallel ........................................................................................ 294

10.3 Comparison With Previous Accounts of Computing Mechanisms ............................ 296

10.3.1 Putnam ................................................................................................................ 297

10.3.2 Cummins............................................................................................................. 299

10.3.3 Fodor................................................................................................................... 301

10.3.4 The Functional Account and the Six Desiderata................................................. 302

10.4 An Application: Are Turing Machines Computers?................................................... 305

10.5 A Taxonomy of Computationalist Theses .................................................................. 306

10.6 Questions of Hardware ............................................................................................... 308

10.7 Conclusion .................................................................................................................. 310

BIBLIOGRAPHY....................................................................................................................... 311


LIST OF FIGURES

Figure 4-1. Diagrams of McCulloch and Pitts nets...................................................................... 54

Figure 4-2. Net explaining heat illusion ...................................................................................... 55

Figure 10-1. A NOT gate, an AND gate, and an OR gate. ........................................................ 255

Figure 10-2. Half (two-bit) adder............................................................................................... 264

Figure 10-3. Full (two-bit) adder. .............................................................................................. 265

Figure 10-4. The main components of a computer and their functional relations..................... 280


1 COMPUTABILITY

1.1 Effective Calculability

This chapter introduces some fundamental notions related to computation, which will be used throughout

the dissertation. This first section is devoted to the pre-theoretical notion of effective calculability, or

computability by an effective procedure (computability for short). This informal notion motivates the

formally defined notion of Turing-computability, which I introduce in the following section. In the last

section, I briefly introduce the Church-Turing thesis, which says that the formal notion is an adequate

formalization of the informal one.

During the first decades of the 20th century, mathematicians’ interest in computable functions lay

in the foundations of mathematics. Different philosophical approaches were proposed. L. E. J. Brouwer

was the main supporter of intuitionism, according to which an existence proof for a mathematical object

was admissible only if constructive (Brouwer 1975). David Hilbert proposed his proof theory to

formalize in axiomatic fashion mathematical reasoning in an attempt to establish the foundations of

mathematics without endorsing Brouwer’s restrictions (Hilbert 1925, 1927, reprinted in van Heijenoort

1967; Hilbert and Ackermann 1928). This formalization allowed Hilbert to formulate rigorously the

decision problem for first-order logic. A decision problem requests a method for answering a yes-or-no

question concerning a domain of objects: “Given any sequence of symbols, is it a formula?” or “Given

any formula, is it provable?” A solution to a decision problem is an effective procedure: a uniform

method or algorithm specified by a finite set of instructions, by which any instance of the question can be

answered in a finite number of steps. “Effective procedure” is a term used by some mathematicians in

place of “algorithm”; an effective procedure cannot appeal to non-extensionally definable capacities like

intuition, creativity, or guesswork, and it always generates the correct answer.1

1 After the work of Turing and others, “effective procedure” was also used for procedures not guaranteed to generate all the values of a total function, that is, procedures that calculate only the values of a partial function (cf. Wang 1974, p. 84).


Lacking a rigorous definition of “effective procedure,” mathematicians called it an “intuitive

concept” to distinguish it from formally defined mathematical concepts.2 Kurt Gödel proposed replacing

“effective procedures” with a rigorously defined concept, that of “recursive functions,” but he didn’t rule

out that some effective procedures might not be included within recursive functions (1931, 1934).

Alonzo Church (1936) and Alan Turing (1936-7) strengthened Gödel’s tentative identification of effective

procedures and recursive functions to a general thesis, now called the Church-Turing thesis. Based on the

Church-Turing thesis, Church and Turing proved that some functions are not computable. For example,

Turing pointed out that he and Church used different definitions but reached “similar conclusions,” i.e.,

that “the Hilbertian Entscheidungsproblem [i.e., the decision problem for first-order logic] can have no

solution” (Turing 1936-7, 116, 117, 145).3

The notion of effective procedure can be informally defined as a procedure with the following

properties4:

1) It uses a finite number of primitive operations, each applied a finite number of times, as specified by a finite number of deterministic instructions (i.e., instructions whose execution yields a unique next step in the procedure).

2) Instructions are non-ambiguous and finite in length.

3) The procedure requires no intuitions about the subject matter (e.g. intuitions about properties of

numbers), no ingenuity, no invention, no guesses.

4) For any argument of the function computed by the procedure, the procedure is the same (uniformity).

5) For any argument of the function computed by the procedure, if the procedure terminates, it generates the correct value.
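To illustrate, Euclid's algorithm for the greatest common divisor is a paradigmatic effective procedure. The following rendering (in Python, an illustration of mine rather than part of the mathematical material) notes how it satisfies properties (1)-(5).

```python
def gcd(x, y):
    """Euclid's algorithm: a paradigmatic effective procedure.

    (1) It uses finitely many primitive operations (comparison, remainder),
        each step determined uniquely by deterministic instructions.
    (2) The instructions are finite in length and unambiguous.
    (3) No intuition, ingenuity, or guessing is required.
    (4) The same procedure is applied uniformly to every argument pair.
    (5) On termination, it yields the correct value for every argument.
    """
    while y != 0:
        x, y = y, x % y   # replace (x, y) by (y, x mod y); y strictly decreases
    return x
```

Since the second argument strictly decreases at every step, the procedure terminates on every pair of natural numbers, so gcd is a total computable function.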

About the role of effective procedures in foundations of mathematics, especially Hilbert’s approach, see Hallett

1994, Sieg 1994, Shapiro 1983, 1995. For a history of foundations of mathematics, see Webb 1980, Mancosu 1998.

2 To refer to the intuitive notion of effective procedure, different authors used different terms. Instead of

“procedure,” some used “process,” “method,” or “rule.” Instead of “effective,” some used “finite,” “finite

combinatorial,” “mechanical,” “definite,” “constructively defined,” or “algorithmic.” Some of the terms used as

synonyms of “effectively calculable” are listed by Gödel 1965, p. 72; Kleene 1987a, pp. 55-56.

3 On the origin of CT and recursive function theory, see Davis 1982; Gandy 1988; Kleene 1979, 1987a; Sieg 1994,

1997; Piccinini 2003a.

4 The properties are somewhat redundant, but are kept separate for explicitness.


When we have a formalized language in which both a domain and operations over objects in that

domain are formally defined, we can talk about lists of formalized instructions and call them programs.

Programs are the formal replacement of algorithms or effective procedures. Because of this, programs are

said to implement algorithms (or procedures).

Not all mathematical procedures are effective procedures or algorithms. It may be possible to

specify sequences of operations that are not guaranteed to find all the values of a function for every

argument. These procedures, called heuristics, generate a search for the desired value: the search may

find the value of the function being computed or it may find an output that only approximates that value.
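The contrast can be made concrete. Relative to the task of computing the integer square root, the first sketch below is an algorithm: it is guaranteed to find the exact value for every argument. The second, a fixed number of Newton iterations, is a heuristic relative to that same task: it may return only an approximation. (The code is an illustration of mine, in Python, not part of the mathematical material.)

```python
def isqrt_algorithm(n):
    """Algorithm for the integer square root: exhaustive search,
    guaranteed to return the exact value for every natural number n."""
    k = 0
    while (k + 1) * (k + 1) <= n:
        k += 1
    return k

def sqrt_heuristic(n, steps=3):
    """Heuristic for the square root: a fixed number of Newton
    iterations, which may return only an approximation to the value."""
    if n == 0:
        return 0.0
    x = float(n)
    for _ in range(steps):
        x = (x + n / x) / 2   # Newton step for f(x) = x^2 - n
    return x
```

Note that, relative to the input-output function it actually computes, sqrt_heuristic still implements an algorithm; it is a heuristic only relative to the independently defined square-root function.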

A caveat needs to be added about the relationship between programs (and procedures) and the

functions they compute. A program is a definition of a function, from its inputs to its outputs; when the

program doesn’t halt, the function isn’t defined. So, by definition any program computes the values of

the function that it defines: it implements an algorithm or effective procedure relative to that function.

However, typically a program is written to find values for a function that is defined independently of the

program’s existence. In such a case, the program may or may not be implementing an algorithm that

finds the values of that function. Many programs do not always find the values of the independently

defined function they are designed to compute, but rather they find approximations to those values. In

such cases, relative to the independently defined functions, those programs are said to implement not

algorithms but heuristics.5

5 The philosophical literature is not always clear on this point. For example:

The possibility of heuristic procedures on computers is sometimes confusing. In one sense, every digital

computation (that does not consult a randomizer) is algorithmic; so how can any of them be heuristic? The

answer is again a matter of perspective. Whether any given procedure is algorithmic or heuristic depends

on how you describe the task (Haugeland 1997, p. 14).

But whether a procedure (or program) is algorithmic or heuristic does not depend on how one describes its task.

Relative to its task, a procedure is algorithmic or heuristic depending on whether or not it is guaranteed to solve each

instance of the task. Instead, a program is always algorithmic with respect to the generation of its outputs given its

inputs.

Another example of confusion about this point is manifested by Dennett’s statement (1975, p. 83) that

human beings may not be Turing Machines (TMs), because humans may be implementing heuristics rather than

algorithms. This presupposes that TMs implement only algorithms and not heuristics. Now, it is true that every TM

implements an algorithm that generates its outputs given its inputs. But relative to the problem TMs are designed to

solve, TMs—like any other computing mechanisms—may well be implementing heuristics.


1.2 Computability Theory

This section reviews some notions and results of classical computability theory. Computability theory

studies what functions are computable, what mathematical properties they have, and the mathematical

properties of computing mechanisms. Computable functions are identified with the class of recursive

functions, inductively defined. As a specific example of a computing mechanism, I will present Turing Machines. Using Turing Machines and recursive functions, I will introduce the notion of a universal

computing mechanism and the unsolvability of the halting problem.

1.2.1 Notation

{a1, a2, ... an} Set of n objects a1, ... an

(a1, a2, … an) List (or n-tuple) of n objects a1, … an

a ∈ A a is an element of set A

N Set of natural numbers 0, 1, 2, …

f: A1 → A2 f is a function from A1 to A2

Domain of f Set of all a such that (a, b) ∈ f for some b

Range of f Set of all of f(a) for a in the domain of f

Partial function on A Function whose domain is a subset of A

Total function on A Function whose domain is A

Alphabet Nonempty set Σ of objects called symbols

Word or string on Σ List of symbols on Σ (instead of (a1, a2, … an), we write a1a2…an)

|u| = n, where u = a1a2… an n is the length of u

a1^n Concatenation of n occurrences of the symbol a1

Σ* Set of all words on alphabet Σ

Language on Σ Any subset of Σ*

uv, where u, v ∈ Σ* Concatenation of u and v


Predicate on a set A A total function P: A → N such that for each a ∈ A, either P(a) = 1 or

P(a) = 0, where 1 and 0 represent truth values

R = {a ∈ A|P(a)}, P a predicate on A R is the set of all a ∈ A such that P(a) = 1; P is called the

characteristic function of R

Pr (k) kth prime in order of magnitude

Ψ_M^n(x1, … xn) n-ary function computed by TM program M; when n = 1 we omit n

Computability theory applies to general word functions f: Σ* → Σ’*, where Σ* is the set of all

words on alphabet Σ. Since words can be effectively encoded as natural numbers and vice versa (see

section 1.2.4 below for an example of such an encoding), in this section we follow the standard

convention of developing the theory with respect to number-theoretic functions f: N → N, without loss of

generality.6 Hence in this section, unless otherwise specified, “number” means natural number, and

“function” means function on natural numbers. For the exposition of the material in this section, I drew

mostly from Davis 1958 and Davis et al. 1994.

1.2.2 Recursive Functions

This section introduces the definition of the primitive recursive functions on the basis of three primitive

base functions and two primitive operations. Then, by means of one further primitive operation, the class

of partial recursive functions is defined.

The class of primitive recursive functions is defined inductively as follows.

Base functions:

Null function. n(x) = 0.

Successor function. s(x) = x + 1.

Projection functions. u_i^n(x1, … xn) = xi.

Operations:

6 Computability theory can also be developed directly in terms of string functions (Machtey and Young 1978). This

possibility will be significant in the Chapter on Computation and Content.


Composition. Let f be a function of k variables and let g1, …gk be functions of n variables. Let:

h(x1, … xn) = f(g1(x1, … xn), … gk(x1, … xn)).

Then h is obtained from f and g1, … gk by composition.

Primitive recursion. Let f be a function of n variables and let g be a function of n+2 variables. Let:

h(x1, … xn, 0) = f(x1, … xn)

h(x1, … xn, t+1) = g(t, h(x1, … xn, t), x1, … xn)

Then h is obtained from f and g by primitive recursion.

Definition 1. A function f of n variables is primitive recursive if and only if it can be obtained

from the base functions by finitely many operations of composition and primitive recursion.

Examples of primitive recursive functions include addition, multiplication, exponentiation,

predecessor, and many other useful functions (see Davis et al. 1994, section 3.4).
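To see how the schema works, addition can be obtained by primitive recursion with f = u_1^1 and g = s ∘ u_2^3, and multiplication from addition in turn. The following sketch (illustrative Python, not part of the formal development) mirrors the inductive clauses literally.

```python
def n_(x):            # null function: n(x) = 0
    return 0

def s(x):             # successor function: s(x) = x + 1
    return x + 1

def u(i, *xs):        # projection: u_i^n(x1, ..., xn) = xi
    return xs[i - 1]

def primitive_recursion(f, g):
    """Return h such that h(x..., 0) = f(x...) and
    h(x..., t+1) = g(t, h(x..., t), x...)."""
    def h(*args):
        *xs, t = args
        if t == 0:
            return f(*xs)
        return g(t - 1, h(*xs, t - 1), *xs)
    return h

# add(x, 0) = u_1^1(x) = x;  add(x, t+1) = s(u_2^3(t, add(x, t), x))
add = primitive_recursion(lambda x: x,
                          lambda t, acc, x: s(acc))

# mult(x, 0) = n(x) = 0;  mult(x, t+1) = add(mult(x, t), x)
mult = primitive_recursion(n_,
                           lambda t, acc, x: add(acc, x))
```

The schema is applied only finitely many times, so every function so obtained is total, matching the remark above.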

In the present context, predicates are total functions whose values are 0 or 1 (representing true

and false). Any primitive recursive function whose values are 0 and 1 is called a primitive recursive

predicate. An example of a primitive recursive predicate is equality.

It can be easily shown, by induction on the definition of primitive recursive function, that every

primitive recursive function is total.

Next, we introduce a further operation:

Minimalization (unbounded). Let P be a predicate of n+1 variables. We write min_y P(x1, … xn, y) for the least value of y for which the predicate P is true, if there is one. If there is no such value of y, then min_y P(x1, … xn, y) is undefined.

Unbounded minimalization of a predicate can easily produce a function that is not total. An

example is provided by subtraction:

x – y = min_z(y + z = x),

which is undefined for x < y.
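The operation can be rendered directly in code. In the following sketch (illustrative Python, not part of the formal development), min_y searches y = 0, 1, 2, … for the least witness and, mirroring the definition, never returns when no witness exists; subtraction defined by minimalization accordingly diverges for x < y.

```python
def min_y(predicate, *xs):
    """Unbounded minimalization: the least y such that predicate(*xs, y)
    is true.  Diverges (never returns) when no such y exists."""
    y = 0
    while not predicate(*xs, y):
        y += 1
    return y

def subtract(x, y):
    # x - y = min_z (y + z == x); undefined (the search diverges) for x < y
    return min_y(lambda x, y, z: y + z == x, x, y)
```

Calling subtract(3, 7), for instance, would loop forever, which is how a program exhibits the partiality of the function it computes.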

Definition 2. A function f is partial recursive if and only if it can be obtained from the base

functions by finitely many operations of composition, primitive recursion, and minimalization.


A partial recursive function that is total is called total recursive.

1.2.3 Turing Machines

Turing Machines (TMs) are perhaps the best-known computing mechanisms. They have two main

components. First, there is a two-way potentially infinite tape divided into squares; each square contains at most one symbol (a square without a symbol is blank). Second, there is an active device that can be in one of a

finite number of states. The active device acts on the tape in one of four ways: it reads the symbol on a

square, writes a symbol on a square, moves one square to the left, or moves one square to the right. TMs’

active devices operate in discrete time. At any instant, the active device reads the symbol on one of the

tape’s squares. Then, the symbol on that square and the device’s current state determine what the active

device does: what state it goes into and whether it moves left, or moves right, or writes a symbol on the

current square (and which symbol it writes). When this happens, we say that an active device responds to

its internal state and symbol on the tape. All TMs have this structure in common.

Although strictly speaking it is the active devices of TMs that perform operations (on the tape,

which is passive), for simplicity we follow the standard convention of ascribing activities to TMs tout

court. TMs are distinguished from one another by the alphabet they operate on, by the number of their

internal states, and more importantly by the particular actions they perform in response to their internal

states and the symbols on the tape. A description of the way a particular TM responds to a particular state

and symbol is here called an instruction. A set of instructions, which uniquely identifies a TM, is called a

TM program.

To avoid confusion, TMs should be kept distinct from the TM programs that describe their

behavior. Unlike digital computers, which compute by executing programs, ordinary TMs do not operate

by responding to the TM programs that describe their behavior. Ordinary TMs simply behave in the way

described by their TM programs; in other words, their behavior satisfies the instructions contained in their

TM program. A TM program identifies a computational process uniquely, and a TM that satisfies the

instructions listed in the program is its canonical implementation (i.e., the implementation given by


Turing). But the computations defined by TM programs can also be carried out by humans or machines

other than TMs.

Moreover, in section 1.2.4 we shall see that TM programs can be encoded using the alphabet that

TMs operate on, and then written on TM tapes. There are special TMs, called universal TMs, which can

respond to any TM program written on their tape so as to mimic the behavior of the TMs described by the

program. Since universal TMs do compute by responding to TM programs written on their tape, we say

that they execute TM programs. Needless to say, the behavior of universal TMs is also described by their

own TM programs, called universal TM programs. Universal TMs execute the programs written on their

tape, but not the universal TM programs that describe their behavior.

In formally defining TM programs, we will use the following ingredients:

Symbols denoting internal states of TMs’ active devices: q1, q2, q3, . . .

Symbols denoting symbols that TMs can print on the tape: S0, S1, S2, . . . The set of Si’s is our alphabet.

Symbols denoting primitive operations: R (move to right), L (move to left).

Expressions: finite sequences of symbols.

Instructions: expressions having one of the following forms:

(1) qi Sj Sk ql,

(2) qi Sj R ql,

(3) qi Sj L ql,

(4) qi Sj qk ql.

Quadruples of the first type mean that in state qi reading symbol Sj, the active device will print Sk and go

into state ql. Quadruples of the second type mean that in state qi reading symbol Sj, the active device will

move one square to the right and go into state ql. Finally, quadruples of the third type mean that in state qi

reading symbol Sj, the active device will move one square to the left and go into state ql.7

We are now ready to define (deterministic) TM programs, their alphabets, and their instantaneous

descriptions or snapshots:

7 Instructions of the fourth type serve to define special TMs called oracle TMs and will not be used here.


(Deterministic) TM program: set of instructions that contains no two instructions whose first two symbols

are the same.

Alphabet of a TM program: all symbols Si in the instructions except S0. For convenience, sometimes we

shall write S0 as B (blank), and S1 as 1.

Snapshot: expression that contains exactly one qi, no symbols for primitive operations, and is such that qi

is not the right-most symbol.

A snapshot describes the symbols on a TM tape, the position of the active device along the tape,

and the state of the active device. In any snapshot, the Si’s represent the symbols on the tape, qi

represents the state of the active device, and the position of qi among the Si’s represents the position of the

device on the tape. For any tape and any TM program at any computation step, there is a snapshot

representing the symbols written on the tape, the state of the device, and its position on the tape. At the

next computation step, we can replace the old snapshot by its successor snapshot, whose difference from

its predecessor indicates all the changes (of the tape, position, and state of the device) that occurred at that

step. A snapshot without successors with respect to a TM program M is called a terminal snapshot with

respect to that program.

Using the notion of snapshot, we can rigorously define computations by TM programs:

Computation by a TM program M: finite sequence of snapshots a1, . . ., an such that, for every i with 1 ≤ i < n, ai+1 is the successor of ai, and an is terminal with respect to M. We call an the resultant of a1 with respect to M.

For example, let M consist of the following instructions:

q1 S0 R q1,

q1 S1 R q1.

The following are computations of M, whose last line is the resultant of the first line with respect to M:

(1) q1S0S0S0
S0q1S0S0
S0S0q1S0
S0S0S0q1

(2) q1S1S1S1
S1q1S1S1
S1S1q1S1
S1S1S1q1

(3) q1S1S0
S1q1S0
S1S0q1.
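Such computations can be checked mechanically. The following minimal interpreter (an illustrative Python sketch, not part of the formal development; it adopts the halting convention implicit in the examples above, namely that the computation ends when the device leaves the written portion of the tape) reproduces the three computations of M.

```python
def run_tm(program, tape, state='q1', pos=0, max_steps=10000):
    """Simulate a deterministic TM program, given as a set of quadruples
    (state, scanned symbol, action, next state), where the action is 'R'
    (move right), 'L' (move left), or a symbol to print.  Returns the
    list of successive snapshots."""
    tape = list(tape)
    table = {(q, s): (act, q2) for q, s, act, q2 in program}
    snapshots = []
    for _ in range(max_steps):
        if pos < 0:
            snapshots.append(state + ''.join(tape))
            return snapshots                      # device left the tape
        snapshots.append(''.join(tape[:pos]) + state + ''.join(tape[pos:]))
        if pos >= len(tape):
            return snapshots                      # device left the tape
        key = (state, tape[pos])
        if key not in table:
            return snapshots                      # terminal snapshot
        act, state = table[key]
        if act == 'R':
            pos += 1
        elif act == 'L':
            pos -= 1
        else:
            tape[pos] = act                       # print a symbol
    raise RuntimeError('step bound exceeded; the program may not halt')

# The program M of the example: move right over both S0's and S1's.
M = [('q1', 'S0', 'R', 'q1'),
     ('q1', 'S1', 'R', 'q1')]
```

Running run_tm(M, ['S1', 'S0']) yields exactly the snapshots of computation (3) above.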

With each number n we associate the string n̄ = 1^(n+1), that is, n+1 occurrences of the symbol 1. Thus, for example, 4̄ = 11111. With each k-tuple (n1, n2, … nk) of integers we associate the tape expression (n1, n2, … nk)‾, where:

(n1, n2, … nk)‾ = n̄1 B n̄2 B … B n̄k.

Thus, for example, (1, 3, 2)‾ = 11B1111B111.
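This encoding is trivial to mechanize; the sketch below (illustrative Python, not part of the formal development) produces exactly the tape expressions just described.

```python
def encode_number(n):
    """The string associated with n: n+1 occurrences of the symbol '1'."""
    return '1' * (n + 1)

def encode_tuple(*ns):
    """Tape expression for a k-tuple: the encoded numbers
    separated by single blanks ('B')."""
    return 'B'.join(encode_number(n) for n in ns)
```

For instance, encode_tuple(1, 3, 2) returns the tape expression 11B1111B111 from the text.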

Given an initial snapshot and a program, either there is a computation or there isn’t (if there isn’t, it is because the sequence of successive snapshots is infinite, never reaching a terminal snapshot).

Definition 3. An n-ary function f(x1, … xn) is Turing-computable if and only if there is a Turing Machine M such that: f(x1, … xn) is defined if and only if there is a computation of M whose first snapshot is q1(x1, … xn)‾ and whose resultant contains nj + 1 occurrences of the symbol 1, where f(x1, … xn) = nj. We write:

f(x1, … xn) = Ψ_M^n(x1, … xn).


Turing-computability and partial recursiveness are equivalent notions in the following sense. A

function is partial recursive if and only if it is Turing-computable, and it is total recursive if and only if it

is a total Turing-computable function. In one direction, this is shown by constructing TM programs computing

each of the base functions and by showing that TM programs can be manipulated in ways corresponding

to the three operations (for the details of the construction, see Davis 1958). The other direction is

addressed in the following section.

1.2.4 Gödel Numbers of TM Programs

One way to develop the theory of TM programs is by using recursive functions. I use a method,

developed by Gödel (1931), that allows us to use natural numbers as a code for TM instructions, and

therefore for TM programs. By studying the properties of TM programs in this way, we will demonstrate

the results that we are interested in, namely the existence of universal TMs and the unsolvability of the

halting problem. The method followed here has the great advantage of avoiding long and laborious

mathematical constructions.

The basic symbols used in formulating TM programs are the following:

R, L

S0, S1, S2, …

q1, q2, q3, …

We associate each of these symbols with an odd number ≥ 3, as follows:

3 R

5 L

7 S0

9 q1

11 S1

13 q2

etc.


Hence, for any expression M there is now a finite sequence of odd integers a1, a2, … an associated to M.

Now we’ll associate a single number with each such sequence and hence with each expression.

Definition 4. Let M be an expression consisting of the symbols a1, a2, … an. Let b1, b2, … bn be the corresponding integers associated with these symbols. Then the Gödel number of M is the following integer:

r = ∏_{k=1}^{n} Pr(k)^{b_k}

We write gn(M) = r, and M = Exp(r). If M is the empty expression, we let gn(M) = 1.

Definition 5. Let M1, M2, … Mn be a finite sequence of expressions. Then, the Gödel number of this sequence of expressions is the following integer:

r = ∏_{k=1}^{n} Pr(k)^{gn(M_k)}

It is easy to prove that any expression and any sequence of expressions have a unique Gödel

number. Since TM programs are sets of instructions, not lists of them, any TM program consisting of n

instructions has n! Gödel numbers.
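The construction is easily mechanized. The sketch below (illustrative Python; the helper names are mine) computes Gödel numbers according to Definitions 4 and 5; uniqueness of decoding follows from the uniqueness of prime factorization. Gödel numbers of realistic programs are astronomically large, which is mathematically harmless, so the examples use tiny expressions.

```python
def nth_prime(k):
    """Pr(k): the kth prime in order of magnitude, so Pr(1) = 2."""
    count, n = 0, 1
    while count < k:
        n += 1
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return n

# Odd numbers >= 3 assigned to the basic symbols, as in the text;
# the assignment continues in the same pattern (S2 -> 15, q3 -> 17, ...).
CODE = {'R': 3, 'L': 5, 'S0': 7, 'q1': 9, 'S1': 11, 'q2': 13}

def gn(expression):
    """Goedel number of an expression (a list of basic symbols):
    the product over k of Pr(k) raised to the integer associated with
    the kth symbol; 1 for the empty expression."""
    r = 1
    for k, symbol in enumerate(expression, start=1):
        r *= nth_prime(k) ** CODE[symbol]
    return r

def gn_sequence(expressions):
    """Goedel number of a finite sequence of expressions:
    the product over k of Pr(k) raised to gn of the kth expression."""
    r = 1
    for k, m in enumerate(expressions, start=1):
        r *= nth_prime(k) ** gn(m)
    return r
```

For instance, the one-symbol expression R receives the Gödel number Pr(1)^3 = 8, and the sequence consisting of the expressions R and L receives 2^8 · 3^32.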

Definition 6. For each n > 0, let T_n(z, x1, … xn, y) be the predicate that means, for given z, x1, … xn, y, that z is a Gödel number of a TM program Z, and that y is the Gödel number of a computation, with respect to Z, beginning with snapshot q1(x1, … xn)‾.

These predicates express the essential elements of the theory of TM programs.

Davis 1958 contains the detailed construction proving that for each n > 0, T_n(z, x1, … xn, y) is

primitive recursive, that every Turing-computable function is partial recursive, and that every total

Turing-computable function is total recursive.


1.2.5 Universal TM programs

We are now ready to demonstrate that there are universal TMs, which compute any function computable by a TM. Consider the partial recursive binary function f(z, x) = u_1^1(min_y T(z, x, y)). Since this function is Turing-computable, there is a TM program U such that:

Ψ_U^2(z, x) = f(z, x)

This program is called a universal TM program. It can be employed to compute any partially computable (singulary, but generalizable to n-ary) function as follows: if Z0 is any TM program and z0 is a Gödel number of Z0, then:

Ψ_U^2(z0, x) = Ψ_Z0(x)

Thus, if the number z0 is written on the tape of U, followed by the number x0, U will compute the number Ψ_Z0(x0).

1.2.6 Unsolvability of the Halting Problem

We now discuss the function HALT(x, y), defined as follows. For a given y, let P be the TM program such that gn(P) = y. Then HALT(x, y) = 1 if Ψ_P(x) is defined and HALT(x, y) = 0 otherwise. In other words, HALT(x, y) = 1 if and only if the TM program with Gödel number y eventually halts on input x; otherwise it equals 0. We now prove the unsolvability of the halting problem.

Theorem. HALT(x, y) is not a recursive function.

Proof. Define the total function g(x) = HALT(x, x), and the partial function h(x) = 0 if g(x) = 0, h(x) undefined if g(x) = 1. If h is partial recursive, then there is a TM program P’ with Gödel number i such that for all x, h(x) = Ψ_P’(x). But then:

h(i) = Ψ_P’(i) = 0 if and only if g(i) = 0 if and only if Ψ_P’(i) is undefined,

which is a contradiction. Therefore, h cannot be partial recursive, so that g, and hence HALT, cannot be recursive. QED
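The diagonal argument has a familiar programming rendering. In the sketch below (illustrative Python), halts is a hypothetical total decider for the halting function, introduced only to be refuted: were it computable, the diagonal function h would be too, and running h on its own source would halt if and only if it does not halt.

```python
def halts(program_source, x):
    """HYPOTHETICAL total decider: True iff the program with source
    program_source halts on input x.  The theorem shows that no such
    computable function exists, so no correct body can be supplied;
    any actual definition would be wrong on some input."""
    raise NotImplementedError

def h(program_source):
    """The diagonal function of the proof: h(p) = 0 if p does not halt
    on itself; undefined (here, an infinite loop) if it does."""
    if halts(program_source, program_source):
        while True:          # diverge, mirroring 'h(x) undefined'
            pass
    return 0

# Were halts computable, h would be too; but h applied to its own
# source halts (with value 0) if and only if it does not halt:
# exactly the contradiction derived in the proof above.
```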


This theorem gives us an example of a function that is not computable by a TM program.

Computability theory shows that there are infinitely many such functions. Assuming the truth of the

Church-Turing thesis, which will be discussed in the next section, we conclude that there is no algorithm

computing the halting function. The same holds for any other non-Turing-computable total function.

1.3 The Church-Turing Thesis

Turing (1936-7) introduced his machines as a way to make precise the informal notion of algorithmic computability, or effective calculability, which I introduced in the first section. Church (1936) proposed

a similar thesis using recursive functions, which, as we’ve seen, are computationally equivalent to TMs.

Stephen Kleene (1952) dubbed this, after its proponents, the Church-Turing thesis:

(CT) A function is effectively calculable if and only if it is Turing-computable.

CT is generally accepted among mathematicians and computer scientists on what they consider

overwhelming evidence in its favor.

In summary, this chapter introduced the informal notion of effective calculability and its formal

counterpart—Turing computability. According to the canonical view, CT connects the informal notion of

effective calculability, or computability by effective procedure, with the formal one of Turing-

computability. There can be no rigorous proof of CT, but there is overwhelming evidence in its favor. In

Chapter 7, I will list the evidence for CT, discuss the most important alternatives to this canonical view,

and conclude with a cautious endorsement of a clarified version of the canonical view. I will then

discuss the relevance of CT for computationalism.

Before we get to the detailed discussion of CT, we shall take a close look at how, in the 1940s, the

notions described in this chapter were used to formulate a novel theory of the brain, computationalism,

according to which the brain is a mechanism that performs computations.


2 WARREN MCCULLOCH ON LOGIC, MIND, AND BRAIN, CA. 1920-1936

2.1 Introduction

In the 1940s, Turing’s notions of computability and of a universal computing mechanism—which were

reviewed in Chapter 1—played an important role in the history of computationalism, namely the view that

the brain is a computing mechanism. The central figure in the development of computationalism was

neurophysiologist and psychiatrist Warren McCulloch. Other key figures, besides Turing and

McCulloch, were mathematicians Norbert Wiener and John von Neumann. This chapter begins to tell

how McCulloch and others developed computationalism.

Despite the centrality of computationalism to many disciplines, its early historical development,

which took place during the early 1940s, has received little attention and remains poorly understood.

Most of the existing work focuses on symbolic artificial intelligence (AI) and the cognitivist movement

since the late 1950s (McCorduck 1979, Gardner 1985, Crevier 1993). This research pays little attention

to the origin of computationalism in the 1940s and early 1950s. One partial exception is the work of

Dupuy (2000), who urges historians to look at the cybernetic movement as the source of

computationalism. But Dupuy’s work is more a philosophical argument in favor of a brand of cognitive

science based on the study of “complexity” than a scholarly study of the history of computationalism (cf.

Piccinini 2002).

Outside of the history of cognitive psychology and symbolic AI, there is some work on the

relation between computationalism and logic (Webb 1980), but nothing has been written on

computationalism in neuroscience, even though computationalism was originally proposed and discussed

by a neurophysiologist as a theory of the brain. Another limitation of the existing literature is that it often

addresses the origin of computationalism by listing the ideas of intellectual heroes like Hobbes, Leibniz,

and Babbage, usually culminating in Turing’s contributions (the best example of this kind is Davis 2000).

It’s time to move beyond this genius-centered history and study how the dreams of those thinkers


eventually turned into computationalism as a conceptual framework for several research programs, which

involved the creation of new scientific communities and institutions, and even new disciplines.

Aside from published sources, I have relied on collections of unpublished material pertaining to

Alan Turing, Warren McCulloch, John von Neumann and Norbert Wiener.1 My research builds on recent

scholarship in the history of computability theory (Sieg 1994, 1997; Hallett 1994; Shapiro 1995),

Turing’s early work on mechanical intelligence (Copeland 2000, Copeland and Proudfoot 1996), the

history of biophysical and connectionist modeling (Abraham 2001a, 2001b, 2002; Arbib 2000; Frank

1994; Smalheiser 2000), and the discovery of neural mechanisms (Craver and Darden 2001, Craver

2001). Another resource is the recently published interviews with some early connectionist modelers

(Anderson and Rosenfeld 1998). My research complements this recent work by showing how early

attempts at mathematical modeling of the brain, the development of new mathematical and modeling

tools, and new hypotheses about memory mechanisms came together to form a novel conceptual and

methodological framework for studying the functional organization of the brain.

In this historical part of the dissertation, I will cover the period going roughly from the mid-

1930s, when a number of scientists interested in computation and the brain began to meet one another, to

1946, when these scientists formed a small scientific community, whose members were sharing their

work and organizing conferences.

2.2 Background

McCulloch was an eclectic neurophysiologist and psychiatrist whose main goal was to explain the mind

in terms of neural mechanisms. This was not a new project: for instance, an illustrious antecedent was

1 The Alan Mathison Turing Papers are at Archives of King’s College, Cambridge, UK. The Warren S. McCulloch

Papers are at the Library of the American Philosophical Society, Philadelphia, PA. The Papers of John von

Neumann are at the Library of Congress, Washington, DC. The Norbert Wiener Papers are at the Institute Archives

and Special Collections, MIT Libraries, Cambridge, Massachusetts, under MC 22. Whenever possible, I will

indicate the box and file folder (abbreviated ff.) of the unpublished documents I refer to.


Sigmund Freud’s unpublished Project for a Scientific Psychology (Freud 1895).2 Freud thought that

neuronal activity must embody ideas. Since the nineteenth century, neurophysiologists had known that nerve

fibers carry electrical pulses, and that these pulses can have either excitatory or inhibitory actions on other

nerve fibers. By the end of the nineteenth century, neurophysiologists reached a kernel of consensus that

the nervous system is made out of individual cells called neurons, which are connected together in vast

networks. The pulse trains on nerve fibers were largely interpreted as carrying meaningful messages, but

there was no detailed account of how these messages were processed by the brain. Freud’s theory was

that energy was discharged from neuron to neuron, and that a neuron’s energy level corresponded to the

activation of an idea in the mind.

McCulloch’s views about mind and brain originated during the 1920s and reached maturity at the

beginning of the 1940s, half a century after Freud’s. Unlike Freud, McCulloch did not primarily rely on

the energy levels and energy flow between neurons. Instead, McCulloch thought that the relevant entity

transmitted from neurons to neurons was “information.” McCulloch’s notion of “information” derived

from the power of formal logic to derive conclusions from premises. According to McCulloch, an

important aspect of mind was “formal,” and was constituted by inferences that can be modeled by a

logical calculus. Since logical calculi could be implemented in computing mechanisms, McCulloch

thought the brain must be a computing mechanism embodying a logical calculus that constituted the

formal aspect of the mind. McCulloch didn’t stop at his formulation of a general theory of the brain: he

spent a good portion of his time after 1943 devising testable hypotheses about specific neural mechanisms

and how they might explain some mental function. So, McCulloch was one of the originators not only of

computationalism as a doctrine, but also of the methodology of building models of neural or cognitive

mechanisms, based on the computationalist doctrine, to explain various aspects of the mind.

Understanding McCulloch's project is crucial to understanding both the cybernetic movement and the

subsequent history of AI and cognitive science.

2 For an account of Freud’s Project and its significance for Freud’s psychoanalysis, see Fancher 1973.


The difference between McCulloch’s and Freud’s projects was made possible by two main

historical developments: the rise of mathematical logic and the establishment of the all-or-none law of

neural activity.

In the years 1910-1913, Alfred North Whitehead and Bertrand Russell published their Principia

Mathematica. It contained a powerful formal logical system, whose purpose was to prove all

mathematical theorems on logical grounds. Whether Whitehead and Russell succeeded in deriving

mathematics from logic alone remained controversial, but the unquestionable deductive power of their

formal system popularized mathematical logic in both philosophy and mathematics. Mathematicians

developed and applied Whitehead and Russell’s techniques to study the foundations of mathematics,

whereas philosophers applied those techniques to problems in epistemology, philosophy of language, and

other areas.3

In subsequent work, Russell himself developed a view called logical atomism, according to which

ordinary physical objects should be understood as constructions made out of logical atoms (such as “red

here now”) by means of logical techniques. Also, according to Russell, our knowledge of ordinary

physical objects could be reduced by logical means to our knowledge of sense data, analogously to how

mathematical theorems could be derived from the logical system of the Principia (Russell 1914).4 The

most detailed and rigorous attempt to carry out this epistemological project was made by Rudolf Carnap

(Carnap 1928).5 We shall see that both the Principia and Russell’s epistemological project motivated

McCulloch’s project for reducing the mind to the brain.6

3 For more details on the Principia, see Irvine 2002.

4 For more details on Russell’s philosophy, see Irvine 2001.

5 Clark Glymour has gone as far as ascribing to Carnap the first computational theory of mind:

The first explicitly computational theory of cognitive capacities is Rudolf Carnap’s Der Logische Aufbau

der Welt. Carnap’s book offered an account of how concepts of color, sound, place, and object could be

formed from elements consisting of gestalt experiences and a relation (“recollection of similarity”) between

such experiences. The theory was given as a logical construction, but also as what Carnap called a “fictive

procedure”. The procedural characterization is in fact a series of algorithms that take as input a finite list of

pairs of objects (the “elementary experiences”) such that there is a recollection of similarity between the

first and the second member of each pair. The book was of course written before there were computers or

programming languages, but it would nowadays be an undergraduate effort to put the whole thing into

LISP code (Glymour 1990, p. 67).


In the first quarter of the 20th century, much work in neurophysiology focused on the nature of the

impulses traveling through nerve fibers. In 1926, Edgar Adrian started publishing his groundbreaking

recordings from nerve fibers (Adrian 1926, Adrian and Zotterman 1926). Adrian’s work was reported in

newspapers in London and New York, and Adrian continued to extend his results and publish them

through the late 1920s. In 1930, A.V. Hill proposed Adrian for the Nobel Prize in physiology or

medicine, which was awarded to Adrian (shared with Charles Sherrington) in 1932. Adrian’s work was

being publicly recognized as the definitive experimental demonstration of the all-or-none law, namely

“that the intensities of sensation and response depend simply upon the number of nerve impulses which

travel to or from the nervous system, per unit of time.”7 We shall see that around 1929, the all-or-none

law would have an important impact on McCulloch’s thinking about brain and mind.

2.3 Logic, Epistemology, and the Brain

McCulloch’s interest in logic and epistemology goes as far back as his college years. In 1917, he was a

freshman at Haverford College, where he studied mathematics and medieval philosophy. He was fond of

recalling an exchange he had at Haverford with a philosophy teacher, Rufus Jones. McCulloch told his

teacher that he wanted to know, “What is a number, that a man may know it; and a man, that he may

know a number?” Jones replied: “Friend, thee will be busy as long as thee lives.” McCulloch described

his work of a lifetime as pursuing and accomplishing that project, and more generally of answering “the

general questions of how we can know anything at all.”8

See also Glymour, Ford, and Hayes 1995. Glymour’s point here may be justified conceptually, but there is little

evidence that Carnap’s Aufbau played a very important role in the history of computationalism.

6 Lettvin, a long-time friend of McCulloch and Pitts, wrote:

Strongly in the minds of both McCulloch and Pitts were the notions of Russell as contained in his essays on

mind, the notions of Peirce, and to a great extent, the notions of Whitehead, in particular as regards the

structure of mind and experience (Lettvin 1989a, p. 12).

7 Letter by A.V. Hill to the Nobel Committee, cited by Frank 1994, p. 209. For a detailed history of the

experimental demonstration of the all-or-none law, see Frank 1994.

8 Biographical Sketch of Warren S. McCulloch, ca. 1942. Warren S. McCulloch Papers, ff. Curriculum Vitae.

McCulloch 1961; McCulloch 1974, pp. 21-22. The episode involving his philosophy teacher is also cited by many

friends of McCulloch.


After serving in the Navy during World War I, in 1920 McCulloch transferred to Yale, where he

majored in philosophy and minored in psychology in 1921. Among other things, he recalled reading

Immanuel Kant’s Critique of Pure Reason, whose notion of synthetic a priori knowledge would be a

major influence on his thinking about the brain. He also studied Aristotle, the British Empiricists, Charles

Sanders Peirce, Georg Wilhelm Friedrich Hegel, and Whitehead and Russell’s Principia Mathematica.

He thought that by defining numbers as classes of classes and formally deriving mathematics from logic,

Whitehead and Russell had answered satisfactorily his first question, “What is a number, that a man may

know it?” He would devote his efforts to the other question.9

That other question was, what is a man, that he may know a number? McCulloch’s first step

towards an answer was a sort of mental atomism, according to which there must be atomic mental events

that carry truth values. More complex mental events can be logically constructed out of atomic ones. It is

not known exactly when or how McCulloch developed his mental atomism, but McCulloch’s

retrospective on his philosophical education suggests that even at this early stage of his life, he freely

interpreted previous philosophical work along mental atomistic lines:

I turned to Russell and Whitehead—(Principia Mathematica)—who in the calculus of atomic

propositions were seeking the least event that could be true and [sic] false. Leibnitz’s problem of

the petit perception had become the psychophysical problem of the just noticeable difference,

called JND, which has since been found in the middle range of perception to be roughly

proportional to the size of the stimulus.10

JND is the smallest difference between two stimuli that can be discriminated by a subject. McCulloch

continued his retrospective by arguing that JND entailed that there were atomic mental signals:

In searching for these unit signals of perception I came on the work of René Descartes… For him

the least true or false event would have been his postulated hydraulic pulse in a single tube, now

called an axon.11

These comments are revealing in several ways. They show McCulloch’s keen interest in philosophy as

well as his attempt to recruit past philosophical works to serve his own projects. They also exhibit

9 McCulloch 1974, p. 22. Lettvin 1989a, p. 11.

10 McCulloch 1974, pp. 24-25.

11 McCulloch 1974, p. 26.


McCulloch’s elliptical writing style and some bits of his personal jargon. He did not explain what he took

a “least true or false event” to be.

McCulloch said that in 1920, he tried to construct what he called “a logic for transitive verbs of

action and contra-transitive verbs of perception,” on which he continued to work until February 1923.12

By then, he was completing an M.A. in psychology at Columbia University, which he had started after his

B.A. At Columbia he became very interested in physiological psychology and studied mathematics,

physics, chemistry, and neuroanatomy.13 By the time he took his M.A. from Columbia, “he had become

convinced that to know how we think and know, he must understand the mechanism of the organ whereby

we think and know, namely the brain.”14 After that, he enrolled in medical school at Columbia, with the

goal of learning enough physiology to understand how brains work.15 Both as a student of medicine and

later as an intern, he mostly focused on neurology and psychiatry, with the hope of developing a theory of

neural function.

While pursuing his medical studies, and after he abandoned his project of a logic of verbs of

action and perception, McCulloch allegedly developed a psychological theory of mental atoms. He

postulated atomic mental events, which he called “psychons,” in analogy with atoms and genes:

My object, as a psychologist, was to invent a kind of least psychic event, or “psychon,” that

would have the following properties: First, it was to be so simple an event that it either happened

or else it did not happen. Second, it was to happen only if its bound cause had happened … that

is, it was to imply its temporal antecedent. Third, it was to propose this to subsequent psychons.

Fourth, these were to be compounded to produce the equivalents of more complicated

propositions concerning their antecedents.16

McCulloch said he tried to develop a propositional calculus of psychons. Unfortunately, the only known

records of this work are a few passages in later autobiographical essays by McCulloch himself.17 In the

absence of primary sources, it’s difficult to understand the exact nature of McCulloch’s early project. A

12 He did not say why he dropped the project, though he said he encountered “many pitfalls” (McCulloch 1974, pp.

28-29). Unfortunately, I found no record of this work.

13 McCulloch 1974, p. 30.

14 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

15 McCulloch 1974, p. 30.

16 McCulloch 1961, p. 8.

17 On McCulloch’s early psychological theory, see McCulloch 1961, pp. 8-9; McCulloch 1965, pp. 392-393;

Abraham 2002, p. 7.


key point is that a psychon is “equivalent” to a proposition about its temporal antecedent. In more

modern terminology, McCulloch seemed to think that a psychon had a propositional content, which

contained information about that psychon’s cause. A second key point is that a psychon “proposes”

something to a subsequent psychon. This seems to mean that the content of psychons could be

transmitted from psychon to psychon, generating “the equivalents of” more complex propositions. These

themes would play an important role in McCulloch’s mature theory of the brain.

McCulloch did his internship in organic neurology under Foster Kennedy at Bellevue Hospital in

New York, where he finished in 1928.18 While working as an intern, he “was forever studying anything

that might lead me to a theory of nervous function.”19 He developed a long-term interest in closed loops

of activity in the nervous system, namely activity flowing through neurons arranged in closed circuits.

Since neural activity flowing in circles along closed circuits could feed itself back onto the circuit, thereby

sustaining itself indefinitely, McCulloch called this process “reverberation.” At that time, there was no

evidence of closed anatomical loops within the central nervous system, although McCulloch attributed to

Ramón y Cajal the hypothesis that they existed.

The tremors of Parkinson’s disease, McCulloch thought, could be explained by closed loops of

activity connecting the spinal cord and the contracting muscles. With his fellow intern Samuel Wortis,

McCulloch discussed whether the loops that would explain Parkinson’s were a local “vicious circle”—

namely a closed loop involving only the spine and the muscles but not the brain—or the effect of a closed

loop of activity in the central nervous system, which sent a cyclical signal to the region of the body

affected by the tremor. McCulloch and Wortis wondered whether other diseases, such as epilepsy, could

be explained by closed loops of neural activity. They did not consider that closed loops of activity could

be a normal feature of the nervous system, in part because their discussions were taking place before

Lawrence Kubie published the first theoretical paper postulating closed loops in the central nervous

18 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

19 McCulloch 1974, p. 30.


system to explain memory (Kubie 1930).20 Later in his life, McCulloch would hypothesize closed loops

as explanations for many normal neural functions.

By the end of his internship at Bellevue in 1928, McCulloch “had become convinced that to

understand the workings of the nervous system he needed more physics and chemistry.”21 During the

following couple of years, McCulloch studied those subjects, as well as more mathematics, at New York

University. During the same period, he taught physiological psychology at Columbia Extension in

Brooklyn and he did research with Frank Pike in the Laboratory of Neurosurgery at Columbia and with

Wortis in the Laboratory of Experimental Neurology at Bellevue.22

In 1929, McCulloch met mathematician R. V. Hartley of Bell Labs. Hartley had defined, for

engineering purposes, a quantifiable notion of information divorced from meaning. In Hartley’s words,

his was “a definite quantitative measure of information based on physical considerations alone.”23

Hartley studied how much information can be transmitted using given codes as well as the effect of noise

on the transmission of information. As McCulloch knew, Hartley’s work was an important part of the

background against which Shannon later formulated his mathematical theory of communication (Shannon

1948; Shannon and Weaver 1949).24 This shows that early on, McCulloch was interested in engineering

problems of communication and acquainted with attempts to formulate a mathematical theory of

information.

In 1931, an otherwise unspecified “friend” of McCulloch’s translated for him and others a recently

published paper by Kurt Gödel “on the arithmetization of logic.”25 This shows that McCulloch was

keeping up with important results in the foundations of mathematics.

20 For an account of these events, see McCulloch 1974, pp. 30-31.

21 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

22 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae.

23 Hartley 1929, p. 538.

24 McCulloch 1974, p. 32. For more on Hartley’s work and its relation to Shannon’s, see Aspray 1985, pp. 120-122.

25 McCulloch 1974, p. 32. The translated paper was probably Gödel’s famous paper on incompleteness (Gödel

1931), as indicated by a deleted phrase in a draft of McCulloch’s 1974 paper (Warren S. McCulloch Papers, Series

V Miscellaneous, Box 3).


McCulloch’s views about a calculus of psychons underwent an important transformation in 1929.

It occurred to him that the all-or-none electric impulses transmitted by each neuron to its neighbors might

correspond to the mental atoms of his psychological theory, where the relations of excitation and

inhibition between neurons would perform logical operations upon electrical signals corresponding to

inferences of his propositional calculus of psychons. His psychological theory of mental atoms turned

into a theory of “information flowing through ranks of neurons.”26

This was McCulloch’s first attempt “to apply Boolean algebra to the behavior of nervous nets.”27

The brain would embody a logical calculus like that of Whitehead and Russell, which would account for

how humans could perceive objects on the basis of sensory signals and how humans could do

mathematics and abstract thinking. This was the beginning of McCulloch’s search for the “logic of the

nervous system,” on which he kept working until his death. A major difficulty to the formulation of his

logical calculus was the treatment of closed loops of neural activity. McCulloch was trying to describe

the causal structure of neural events by assigning temporal indexes to them. But he thought a closed loop

meant that one event was its own ancestor, which did not make sense to him. He wanted “to close the

loop” between chains of neuronal events, but did not know how to conceive of the events in the closed

loops. He would not find a solution to this difficulty until he met Walter Pitts in the early 1940s.28
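The calculus McCulloch was searching for would not be written down until his 1943 work with Pitts, but its core idea, that all-or-none units wired together by excitatory and inhibitory connections can realize Boolean operations, can be sketched in a few lines of modern code. This reconstruction is purely illustrative; the unit design, names, and thresholds are mine, not McCulloch’s:

```python
def mcp_unit(threshold, excitatory=(), inhibitory=()):
    """A McCulloch-Pitts-style unit with all-or-none output (0 or 1).
    Any active inhibitory input vetoes firing; otherwise the unit
    fires iff the count of active excitatory inputs meets threshold."""
    def fire(inputs):
        if any(inputs[i] for i in inhibitory):
            return 0
        return int(sum(inputs[i] for i in excitatory) >= threshold)
    return fire

# Basic logical operations, each realized by a single unit.
AND = mcp_unit(threshold=2, excitatory=(0, 1))
OR  = mcp_unit(threshold=1, excitatory=(0, 1))
NOT = mcp_unit(threshold=0, inhibitory=(0,))
```

Composing such units into networks yields more complex propositional functions; the closed loops that troubled McCulloch correspond to feeding a unit’s output back into the network as a later input.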

In 1932-3, McCulloch was at Rockland State Hospital, still in New York City, to earn money as a

psychiatrist. There, he met the German psychiatrist Eilhard von Domarus, who was earning his

philosophy Ph.D. at Yale under Filmer Northrop, with a dissertation On the Philosophic Foundation of

Psychology and Psychiatry. Von Domarus’s dissertation interpreted psychoses, such as schizophrenia, as

26 McCulloch 1974, p. 32.

27 Biographical Sketch of Warren S. McCulloch, ca. 1948. Warren S. McCulloch Papers, ff. Curriculum Vitae. The

same Biographical Sketch also says that this was the time when McCulloch “attempted to make sense of the logic of

transitive ver[b]s,” which conflicts with what he wrote in his later autobiographical essays. Given the lack of

primary sources and given McCulloch’s inconsistencies in his writings, it is hard to date his early work with

certainty. But in spite of some inconsistencies with dates, in all his relevant writings McCulloch emphasized his

early interest in logic and his attempts to apply logic to psychology and later to a theory of the brain. It is thus hard

to believe Lettvin when he wrote that until McCulloch worked with Pitts in the early 1940s, McCulloch had not

applied “Boolean logic” to the working of the brain (Lettvin 1989a, p. 12). Lettvin gave no evidence for this claim.

Since Lettvin met McCulloch only around 1940, Lettvin may never have discovered McCulloch’s early efforts in

this direction.

28 For an account of these events, see McCulloch 1961; McCulloch 1974, pp. 30-32; Arbib 2000, p. 213.


logical disturbances of thought. But von Domarus could not write well in English, so McCulloch helped

him write his dissertation. Years later, in commenting on von Domarus’s dissertation, McCulloch found

that “no other text so clearly sets forth the notions needed for an understanding of psychology, psychiatry

and finite automata.”29

2.4 Strychnine Neuronography and the Functional Organization of the Brain

Until 1934, McCulloch was an ambitious psychiatrist with original ideas but little scientific track record.

He had only four publications in scientific journals. In 1934, he moved to Yale to work in Joannes

Dusser de Barenne’s Laboratory of Neurophysiology. Dusser de Barenne was a distinguished Dutch

neurophysiologist who had moved from Holland to Yale in 1930.30 McCulloch worked at Yale until

shortly after Dusser de Barenne’s death in 1940. McCulloch’s work during those years launched his

academic career.31

With Dusser de Barenne, McCulloch worked mostly on mapping the connections between brain

areas. To discover those connections, Dusser de Barenne had developed the method of strychnine

neuronography. When strychnine was applied to one brain area, it caused neurons to fire. The pulses

from those neurons would activate whichever areas were connected to the first area. By applying

strychnine to a cortical area and recording the activity of other brain areas, it was thus possible to map the

projections of any area of the cortex. Dusser de Barenne and McCulloch mapped cortico-cortical

connections as well as connections between cortical areas and other areas of the monkey brain.

McCulloch continued working with strychnine neuronography after leaving Yale for Chicago, where he

worked with Percival Bailey, Gerhard von Bonin, and others. He published over forty papers on the

subject between 1934 and 1944. In 1944, McCulloch published two review articles on cortical

connections, one in the journal Physiology (McCulloch 1944a), the other in a reference volume on The

29 Thayer 1967, p. 350, cited by Heims 1991, p. 133; see also McCulloch 1974, pp. 32-3.

30 McCulloch 1940, p. 271.

31 McCulloch’s many publications on neurophysiology are reprinted in his Collected Works (McCulloch 1989).


Precentral Motor Cortex (McCulloch 1944b). His work using strychnine neuronography established him

as a leading expert on what he called “the functional organization of the brain.”32

McCulloch’s work in Dusser de Barenne’s lab explicitly connected him to an intellectual lineage

in neurophysiology that goes from Hermann von Helmholtz to Rudolf Magnus to Dusser de Barenne.

These authors were concerned with the physiological foundations of perception and knowledge, including

the idea that Kant’s synthetic a priori knowledge is grounded in the anatomy and physiology of the brain.

This idea was well expressed in Magnus’s lecture “The Physiological A Priori” (Magnus 1930), which

McCulloch knew and cited. Dusser de Barenne consciously inherited the quest for the physiological a

priori from his mentor Magnus and transmitted it to his pupil McCulloch.33 McCulloch saw himself as

continuing the tradition from Kant to Dusser de Barenne, and would refer to his theory of the brain as

solving the problem of the physiological a priori. Partly because of this, he called his intellectual

enterprise experimental epistemology: in the 1950s, a sign reading “experimental epistemology” hung

from his MIT lab’s door.34

McCulloch said that the work with Dusser de Barenne was important to him, because it made him

deal with brains and their activity: “For me it proved that brains do not secrete thought as the liver

secretes bile, but that they compute thoughts the way computing machines calculate numbers.”35

Unfortunately, he did not explain in what sense or by what means this neurophysiological work “proved”

that brains compute thoughts.

Some clue as to the relationship between McCulloch’s work with Dusser de Barenne and

McCulloch’s view that the brain performs computations was offered by Jerome Lettvin, one of

McCulloch’s life-long collaborators and friends, who gave an explanation he is likely to have heard

32 For more on Dusser de Barenne and McCulloch’s neurophysiological work, see Gershwind 1989; Abraham 2002,

pp. 8-11.

33 According to McCulloch, Dusser de Barenne had worked “intimately” with Magnus (McCulloch 1940, p. 270;

McCulloch 1974, p. 22).

34 Interview with Lettvin, in Anderson and Rosenfeld 1998. McCulloch put it as follows:

The main theme of the work of my group in neurophysiology in the Research Laboratory of Electronics at

the Massachusetts Institute of Technology has been in this tradition, namely, experimental epistemology,

attempting to understand the physiological foundation of perception (McCulloch 1974, pp. 22-23).

35 McCulloch 1974, p. 33.


from McCulloch. According to Lettvin, two aspects of McCulloch’s neurophysiological work were

especially relevant. First, there was the observation of nerve specificity: stimulating specific nerves or

specific brain areas would lead to excitation (or inhibition) of very specific other neural areas, or give rise

to specific movements, or specific sensations. This suggested that there were pre-existing paths between

specific portions of the nervous system, carrying specific pulses from certain areas to others, and that

those pulses gave rise to “all the kinds of perception, thinking, and memory that we enjoy.”36 Second,

synaptic action—i.e. the action occurring between neurons—was irreversible, occurring only in one

preferred direction and not in the reverse direction. Pulses could travel through the pre-existing paths

only in one direction. According to Lettvin:

It would be impossible to devise a logical system in which the connections were reversible; that

is, active informationally in both directions. So, to McCulloch’s mind, the existence of a single

direction in the nervous system for information reinforced the idea of an essentially logical

device.37

McCulloch was presumably interpreting his neurophysiological observations on the basis of his pre-existing

assumptions about information flow through ranks of neurons. By his own account, by the time he went to

work with Dusser de Barenne, he had already reached the conclusion that neural activity can be modeled

by logic. Although his neurophysiological observations were compatible with such a view, McCulloch’s

claim that they “proved” it seems an overstatement.

While working as an experimental neurophysiologist at Yale, McCulloch established connections

with colleagues in the field. An especially important one was with a young Mexican researcher, Arturo

Rosenblueth, who was working with Walter Cannon at Harvard Medical School on homeostasis and other

topics. At least by 1938, McCulloch and Rosenblueth knew each other and had initiated a dialogue over

experimental methods and results in neurophysiology.38 In the summer of 1941, McCulloch visited

36 Lettvin 1989a, p. 14.

37 Ibid.

38 Letter by McCulloch to J. F. Tönnies, dated April 11, 1938. Warren S. McCulloch Papers, ff. Tönnies. Letter by

McCulloch to Rosenblueth, dated December 22, 1939. Warren S. McCulloch Papers, ff. Rosenblueth.


Rosenblueth’s lab and they ran a few experiments together.39 By then, they had also started discussing

more theoretical topics, such as von Domarus’s dissertation on the foundations of psychiatry.40

McCulloch and Rosenblueth developed a friendship that lasted for decades.

Rosenblueth shared McCulloch’s dissatisfaction with the current lack of theory in

neurophysiology, which he expressed to McCulloch as follows:

It is always difficult to strike the right balance between experiment and hypothesis, but it seems

to me, in the main, that a good many of our colleagues—perhaps even ourselves—do not do

enough thinking about the large number of experiments carried out. In other words, I found our

discussions very stimulating.41

Rosenblueth’s laudatory reference to his conversations with McCulloch right after his complaint about the

lack of hypotheses in neurophysiology suggests that in conversation, McCulloch had manifested a more

theoretical bent than many of their colleagues, which Rosenblueth appreciated. This would not be

surprising: McCulloch cared deeply about theory, and throughout his life he manifested and publicly

defended a tendency to formulate hypotheses and theories even in the absence of data to test them.

At Yale, McCulloch also attended a philosophical seminar for research scientists organized by

Filmer Northrop, who was von Domarus’s old advisor. At one of those seminars, Frederic Fitch, a

distinguished logician from Yale’s Philosophy Department, presented the theory of deduction of

Principia Mathematica. McCulloch also attended advanced lectures by Fitch on logical operators and

urged Fitch to work on the logic of neural nets.42

While McCulloch was at Yale, he became acquainted with the work of J. H. Woodger (1937),

who advocated the axiomatic method in biology. In a letter to a colleague written in 1943, McCulloch

wrote:

I personally became acquainted with Woodger because the great interest of the biologists in Yale

had led to his coming thither to tackle some of their problems. When he finally departed, it was

not because they were not convinced of the value of his attempt but because he was convinced

39 Two letters by Rosenblueth to McCulloch, dated June 21 and September 5, 1941. Warren S. McCulloch Papers,

ff. Rosenblueth.

40 Letter by McCulloch to Rosenblueth, dated May 1, 1941. Letter by Rosenblueth to McCulloch, dated December

3, 1941. McCulloch papers, ff. Rosenblueth.

41 Letter by Rosenblueth to McCulloch, dated September 5, 1941. Warren S. McCulloch Papers, ff Rosenblueth.

42 Heims 1991, p. 34ff.


that the ambiguity of their statements prevented logical formulation. It was to discussions with

him and with Fitch that I owe much of my persistence in attempting a logical formulation of

neuronal activity. Until that time I had merely used the nomenclature of the Principia

Mathematica to keep track of the activity of neuronal nets.43

In the same letter, McCulloch suggested that it was only around this time that he started seeing his theory

of the brain as a “theory of knowledge”:

[T]he theory … began originally as a mere calculus for keeping track of observed realities. It was

at work for seven years before it dawned on me that it had those logical implications which

became apparent when one introduces them into the grandest of all feed-back systems, which

runs from the scientist by manipulations through the objects of this world, back to the scientist—

so producing in him what we call theories and in the great world are little artifacts.44

McCulloch had known Northrop, another member of Yale’s Philosophy Department, since 1923,

and continued to be in contact with him through the 1930s. Much of Northrop’s philosophy was about

science and scientific methodology. Northrop believed that scientific disciplines reach maturity when

they start employing logic and mathematics in formulating rigorous, axiomatic theories:

The history of science shows that any empirical science in its normal healthy development begins

with a more purely inductive emphasis, in which the empirical data of its subject matter are

systematically gathered, and then comes to maturity with deductively-formulated theory in which

formal logic and mathematics play a most significant part (Northrop 1940, p. 128; cited by

Abraham 2002, p. 6).

Northrop argued that biology was finally reaching its maturity with the work of Woodger (1937) and

Nicolas Rashevsky (1938), who had imported formalisms and techniques from mathematical physics into

biology.45

While McCulloch was working in Dusser de Barenne’s lab at Yale, Alan Turing published his

famous paper on computability (1936-7), where he drew a clear and rigorous connection between

computing, logic, and machines. By the early 1940s, McCulloch had read Turing’s paper. In 1948, in a

public discussion of his theory of the brain at the Hixon Symposium, McCulloch declared that it was the

reading of Turing’s paper that led him in the “right direction.”46

43 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

44 Ibid.

45 For a more detailed account of Northrop’s philosophy of science, see Abraham 2002, pp. 6-7.

46 Von Neumann 1951, p. 33.


3 TOWARDS A THEORY OF THE BRAIN, 1936-1942

3.1 What Computing Mechanisms Can Do1

The modern mathematical notion of computation, which was developed by Alan Turing in his 1936-7

paper and reviewed in Chapter 1, played a crucial role in the history of computationalism. This section

concerns Turing’s use of “computable” and “machine” in his logic papers, his version of the Church-

Turing Thesis (CT, i.e. that every effectively calculable function is computable by a Turing Machine),

and why his early work on computability was not initially read as an attempt to establish, or even to

imply, that the mind is a machine.

Today, both the term “computable” and formulations of CT are utilized in many contexts,

including discussions of the nature of mental, neural, or physical processes. Some of these uses are

discussed at length in Chapter 7.2 None of these uses existed in Turing’s time, and their superposition

onto Turing’s words yields untenable results. For instance, according to a popular view, Turing’s

argument for CT was already addressing the problem of how to mechanize the human mind, while the

strength of CT, perhaps after some years of experience with computing machines, eventually convinced

Turing that thinking could be reproduced by a computer.3

This reading makes Turing appear incoherent. It conflicts with the fact that he, who reiterated CT

every time he talked about machine intelligence, never said that the mechanizability of the mind was a

consequence of CT. Quite the opposite: in defending his view that machines could think, he felt the need

1 This section is adapted from a section of a larger paper devoted to Turing’s ideas on logical proofs and machine

intelligence (Piccinini 2003a).

2 For a survey of different uses see Odifreddi 1989, I.8. (Odifreddi writes “recursive” instead of “computable.”)

3 See e.g. Hodges 1983, esp. p. 108; also Hodges 1988, 1997; Leiber 1991, pp. 57, 100; Shanker 1995, pp. 64, 73;

Webb 1980, p. 220. Turing himself is alleged to have argued, in his 1947 “Lecture to the London Mathematical

Society,” that “the Mechanist Thesis ... is in fact entailed by his 1936 development of CT” (Shanker 1987, pp. 615,

625). Since Shanker neither says what the Mechanist Thesis is, nor provides textual evidence from Turing’s lecture,

it is difficult to evaluate his claim. If the Mechanist Thesis holds that the mind is a machine or can be reproduced by

a machine, we’ll see that Shanker is mistaken. However, some authors other than Turing do believe CT to entail

that the human mind is mechanizable (e.g., Dennett 1978a, p. 83; Webb 1980, p. 9). Their view is discussed in

Chapter 7.


to respond to many objections.4 To understand the development of Turing’s ideas on mechanical

intelligence, one must place his logical work on computability within its context. The

context would change in the 1940s with the publication of McCulloch and Pitts’s computational theory of

the brain—which will be discussed in the next chapter—and the subsequent rise of computationalism.

This change in context explains why in the second half of the 20th century, many found it so natural to

read Turing’s logical work as defending a form of computationalism.

But in the 1930s there were no working digital computers, nor was cognitive science on the

horizon. There did exist some quite sophisticated computing machines, which at the time were called

differential analyzers and would later be called analog computers. Differential analyzers had mechanical

gears whose motions obeyed certain types of differential equations. By setting up the gears in appropriate ways,

differential analyzers could solve certain systems of differential equations. At least as early as 1937,

Turing knew about the Manchester differential analyzer, which was devoted to the prediction of tides, and

planned to use a version of it to find values of the Riemann zeta function.5

In the 1930s and up through the 1940s, the term “computer” was used to refer to people

reckoning with paper, pencil, and perhaps a mechanical calculator. Given the need for laborious

calculations in industry and government, skilled individuals, usually young women, were hired as

“computers.” In this context, a “computation” was something done by a computing human.

The origins of “Computable Numbers” can be traced to 1935, when Turing graduated in

mathematics from King’s College, Cambridge, and became a fellow of King’s. In that year, he attended

an advanced course on Foundations of Mathematics by topologist Max Newman. Newman, who became

Turing’s lifelong colleague, collaborator, and good friend, witnessed the development of Turing’s work

4 Indeed, in his most famous paper on machine intelligence, Turing admitted: “I have no very convincing arguments

of a positive nature to support my views. If I had I should not have taken such pains to point out the fallacies in

contrary views” (Turing 1950, p. 454).

5 Hodges 1983, pp. 141, 155-8.


on computability, shared his interest in the foundations of mathematics, and read and commented on

Turing’s typescript before anyone else.6

In his Royal Society biographical memoir of Turing, Newman links “Computable

Numbers” to the attempt to prove rigorously that the decision problem for first order logic, formulated by

David Hilbert within his program of formalizing mathematical reasoning (Hilbert and Ackermann 1928),

is unsolvable in an absolute sense. “[T]he breaking down of the Hilbert programme,” said Newman, was

“the application [Turing] had principally in mind.”7 In order to show that there is no effective

procedure, or “decision process,” solving the decision problem, Turing needed:

… to give a definition of ‘decision process’ sufficiently exact to form the basis of a mathematical

proof of impossibility. To the question ‘What is a “mechanical” process?’ Turing returned the

characteristic answer ‘Something that can be done by a machine,’ and embarked in the highly

congenial task of analyzing the general notion of a computing machine.8

Turing was trying to give a precise and adequate definition of the intuitive notion of effective procedure,

as mathematicians understood it, in order to show that no effective procedure could decide first order

logical provability. When he talked about computations, Turing meant sequences of operations on

symbols (mathematical or logical), performed either by humans or by mechanical devices according to a

finite number of rules, which required no intuition, invention, or guesswork, and whose execution

always produced the correct solution.9 For Turing, the term “computation” by no means referred to all

that mathematicians, human minds, or machines could do.

6 Hodges 1983, pp. 90-110.

7 Newman 1955, p. 258.

8 Ibid.

9 See his argument for the adequacy of his definition of computation in Turing, 1936-7, pp. 135-8. The last

qualification, about the computation being guaranteed to generate the correct solution, was dropped after

“Computable Numbers.” In different writings, ranging from technical papers to popular expositions, Turing used

many different terms to explicate the intuitive concept of effective procedure: “computable” as “calculable by finite

means” (1936-7), “effectively calculable” (1936-7, pp. 117, 148; 1937, p. 153), “effectively calculable” as a

function whose “values can be found by a purely mechanical process” (1939, p. 160), “problems which can be

solved by human clerical labour, working to fixed rules, and without understanding” (1945, pp. 38-9), “machine

processes and rule of thumb processes are synonymous” (1947, p. 112), “‘rule of thumb’ or ‘purely mechanical’”

(1948, p. 7), “definite rule of thumb process which could have been done by a human operator working in a

disciplined but unintelligent manner” (1951, p. 1), “calculation” to be done according to instructions explained

“quite unambiguously in English, with the aid of mathematical symbols if required” (1953, p. 289).


Turing rigorously defined “effectively calculable” with his famous machines: a procedure was

effective if and only if a Turing Machine could carry it out. “Machine” requires a gloss. Given the task

of “Computable Numbers,” viz. establishing a limitation to what could be achieved in mathematics by

effective methods of proof, it is clear that Turing Machines represented (at the least) computational

abilities of human beings. As a matter of fact, the steps these machines carried out were determined by a

list of instructions, which had to be unambiguously understandable by human beings.
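The kind of machine Turing defined can be rendered, anachronistically, as a short program. The sketch below is a modern illustration in Python, not anything from Turing’s own text; only the example transition table is historical, following the first machine of “Computable Numbers,” which prints the unending sequence 0 1 0 1 …, leaving a blank square between figures.

```python
# A minimal Turing machine simulator: illustrative only, with modern
# conventions (a dictionary as transition table, a dictionary as tape).
from collections import defaultdict

def run(transitions, state, steps):
    """Run a Turing machine for a fixed number of steps.

    transitions maps (state, scanned_symbol) to
    (symbol_to_write, head_move, next_state); head_move is -1, 0, or +1.
    The tape is unbounded and initially blank (None everywhere).
    """
    tape = defaultdict(lambda: None)
    head = 0
    for _ in range(steps):
        write, move, state = transitions[(state, tape[head])]
        tape[head] = write
        head += move
    return [tape[i] for i in sorted(tape)]

# A rendering of Turing's first example machine: print 0, move right,
# move right, print 1, move right, move right, and repeat forever.
machine = {
    ("b", None): ("0", +1, "c"),
    ("c", None): (None, +1, "e"),
    ("e", None): ("1", +1, "f"),
    ("f", None): (None, +1, "b"),
}

print(run(machine, "b", 8))  # -> ['0', None, '1', None, '0', None, '1', None]
```

Each step is fully determined by the current state and the scanned symbol, which is what makes the machine’s behavior precisely definable and mathematically analyzable.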

But Turing’s machines were not portrayed as understanding instructions, let alone as intelligent.

Even if they were anthropomorphically described as “scanning” the tape, “seeing symbols,” having

“memory” or “mental states,” etc., Turing introduced all these terms in quotation marks, presumably to

underline their metaphorical use.10 If one thinks that carrying out a genuine, “meaningful”

computation, as opposed to a “meaningless” physical process, presupposes understanding the

instructions, one should conclude that only humans carry out genuine computations. Turing Machines, in

so far as they computed, were abstract representations of idealized human beings. These considerations,

among others, led some authors to a restrictive interpretation: Turing’s theory bears on computability by

humans, not by machines, and Turing Machines are “humans who calculate.”11

This interpretation is at odds with Turing’s use of “computation” and “machine,” and with his

depiction of his work. Turing never said his machines should be regarded as idealized human

beings, nor anything similar. We saw that, for him, a computation was a type of physical manipulation

of symbols. His machines were introduced to rigorously define this process of manipulation for

mathematical purposes. As Turing used the term, machines were idealized mechanical devices; they

10 Turing 1936-7, pp. 117-8.

11 This widely cited phrase is in Wittgenstein 1980, sec. 1096. Wittgenstein knew Turing, who in 1939 attended

Wittgenstein’s course on Foundations of Mathematics. Wittgenstein’s lectures, including his dialogues with Turing,

are in Wittgenstein 1976. Discussions of their different points of views can be found in Shanker 1987; Proudfoot

and Copeland 1994. Gandy is more explicit than Wittgenstein: “Turing’s analysis makes no reference whatsoever to

calculating machines. Turing machines appear as a result, as a codification, of his analysis of calculations by

humans” (Gandy 1988, p. 83-4). Sieg quotes and endorses Gandy’s statement (Sieg 1994, p. 92; see also Sieg 1997,

p. 171). Along similar lines is Copeland, 2000, pp. 10ff. According to Gandy and Sieg, “computability by a

machine” is first explicitly analyzed in Gandy, 1980. In the present chapter I am only concerned with the

historiographical merits of the Gandy-Sieg view, and not with its philosophical justification. The latter issue is

addressed in Chapter 7.


could be studied mathematically because their behavior was precisely defined in terms of discrete,

effective steps.

There is evidence that Turing, in 1935, talked about building a physical realization of his

universal machine.12 Twelve years later, to an audience of mathematicians, he cited “Computable

Numbers” as containing a universal digital computer’s design and the theory establishing the limitations

of the new computing machines:

Some years ago I was researching on what might now be described as an investigation of the

theoretical possibilities and limitations of digital computing machines. I considered a type of

machine which had a central mechanism, and an infinite memory which was contained on an

infinite tape. This type of machine appeared to be sufficiently general. One of my conclusions

was that the idea of a ‘rule of thumb’ process and a ‘machine process’ were synonymous …

Machines such as the ACE [Automatic Computing Engine] may be regarded as practical versions

of this same type of machine.13

Therefore, a machine, when Turing talked about logic, was not (only) a mathematical

representation of a computing human, but literally an idealized mechanical device, which had a

12 Newman 1954; Turing 1959, p. 49. Moreover, in 1936 Turing wrote a précis of “Computable Numbers” for the

French Comptes Rendus, containing a succinct description of his theory. The definition of “computable” is given

directly in terms of machines, and the main result is appropriately stated in terms of machines:

On peut appeler ‘computable’ les nombres dont les décimales se laissent écrire par une machine . . . On peut

démontrer qu’il n’y a aucun procédé général pour décider si une machine M n’écrit jamais le symbole 0

(Turing, 1936).

The quote translates as follows: One may call “computable” those numbers whose decimals can be written by a machine …

It can be shown that there is no general procedure for deciding whether a machine M will never write the symbol 0.

Human beings are not mentioned.

13 Turing 1947, pp. 106-7. See also ibid., p. 93. Also, Turing machines “are chiefly of interest when we wish to

consider what a machine could in principle be designed to do” (Turing 1948, p. 6). In this latter paper, far from

describing Turing machines as being humans who calculate, Turing described human beings as being universal

digital computers:

It is possible to produce the effect of a computing machine by writing down a set of rules of procedure and

asking a man to carry them out. Such a combination of a man with written instructions will be called a ‘Paper

Machine.’ A man provided with paper, pencil, and rubber, and subject to strict discipline, is in effect a

universal machine (Turing 1948, p. 9).

Before the actual construction of the ACE, “paper machines” were the only universal machines available, and were

used to test instruction tables designed for the ACE (Hodges 1983, chapt. 6). Finally:

A digital computer is a universal machine in the sense that it can be made to replace . . . any rival design of

calculating machine, that is to say any machine into which one can feed data and which will later print out

results (Turing, ‘Can digital computers think?’ Typescript of talk broadcast in BBC Third Programme 15 May

1951, AMT B.5, Contemporary Scientific Archives Centre, King’s College Library, Cambridge, p. 2).

Here, Turing formulated CT with respect to all calculating machines, without distinguishing between analog and

digital computers. This fits well with other remarks by Turing, which assert that any function computable by analog

machines could also be computed by digital machines (Turing 1950, pp. 451-2). And it strongly suggests that, for

him, any device mathematically defined as giving the values of a non-computable function, that is, a function no

Turing machine could compute, like the “oracle” in Turing 1939, pp. 166-7, could not be physically constructed.


potentially infinite tape and never broke down. Furthermore, he thought his machines could compute any

function computable by machines. This is not to say that, for Turing, every physical system was a

computing machine or could be mimicked by computing machines. The outcome of a random process,

for instance, could not be replicated by any Turing Machine, but only by a machine containing a “random

element.”14

Such was the scope of CT, the thesis that the numbers computable by Turing Machines “include

all numbers which could naturally be regarded as computable.”15 To establish CT, Turing compared “a

man in the process of computing . . . to a machine.”16 He based his argument on limitations affecting

human memory and perception during the process of calculation. At the beginning of “Computable

Numbers,” one reads that “the justification [for CT] lies in the fact that the human memory is necessarily

limited.”17 In the argument, Turing used human sensory limitations to justify his restriction to a finite

number of primitive symbols, as well as human memory limitations to justify his restriction to a finite

number of “states of mind.”18

Turing’s contention was that the operations of a Turing machine “include all those which are used

in the computation of a number” by a human being.19 Since the notion of the human process of

computing, like the notion of effectively calculable, was an intuitive one, Turing asserted that “all

arguments which can be given [for CT] are bound to be, fundamentally, appeals to intuition, and for this

reason rather unsatisfactory mathematically.”20 In other words, CT was not a mathematical theorem.21

From “Computable Numbers,” Turing extracted the moral that effective procedures, “rule of

thumb processes,” or instructions explained “quite unambiguously in English,” could be carried out by his

machines. This applied not only to procedures operating on mathematical symbols, but to any symbolic

14 Turing 1948, p. 9; 1950, p. 438.

15 Turing 1936-7, p. 116.

16 Ibid., p. 117.

17 Ibid., p. 117.

18 Turing 1936-7, pp. 135-6.

19 Ibid., p. 118.

20 Ibid., p. 135.

21 CT is usually regarded as an unprovable thesis for which there is compelling evidence. The issue of the

provability of CT is discussed at length in Chapter 7.


procedure so long as it was effective. A universal machine, if provided with the appropriate instructions,

could carry out all such processes. This was a powerful thesis, but very different from the thesis that

“thinking is an effective procedure.”22 In “Computable Numbers” Turing did not argue, nor did he have

reasons to imply from CT, that all operations of the human mind could be performed by a Turing

Machine.

After Turing published his paper, a number of results were quickly established to connect his

work to other proposed formal definitions of the effectively calculable functions, such as general

recursiveness (Gödel 1934) and λ-definability (Church 1932, Kleene 1935). It was shown that all these

notions were extensionally equivalent in the sense that any function that fell under any one of these

formal notions fell under all of them. Mathematicians took this as further evidence that the informal

notion of effective calculability had been captured.

In the 1930s and 1940s, Turing’s professionally closest colleagues read his paper as providing a

general theory of computability, establishing what could and could not be computed, not only by

humans, but also by mechanical devices. This is how McCulloch and John von Neumann, among others,

read Turing’s work.23

At the time he published “Computable Numbers,” Turing moved to Princeton to work on a Ph.D.

dissertation under Alonzo Church. At Princeton, Turing met von Neumann, who was a member of the

Institute for Advanced Study. Von Neumann was a Hungarian-born mathematician who had worked in

logic and foundations of mathematics but was then working mostly in mathematical physics. In 1938,

von Neumann invited Turing to become his assistant. Turing declined and went back to England in 1939.

22 According to Shanker, this was Turing’s “basic idea” (1995, p. 55). But Turing never made such a claim.

23 E.g., see Church 1937; 1956, p. 52, n.119; Watson 1938, p. 448ff; Newman 1955, p. 258; Kleene said: “Turing’s

formulation comprises the functions computable by machines” (1938, p. 150). When von Neumann placed

“Computable Numbers” at the foundations of the theory of finite automata, he introduced the problem addressed by

Turing as that of giving “a general definition of what is meant by a computing automaton” (von Neumann 1951, p.

313). Most logic textbooks introduce Turing machines without qualifying “machine,” the way Turing did. More

recently, doubts have been raised about the generality of Turing’s analysis of computability by machines (e.g., by

Siegelmann 1999). These doubts are discussed in Chapter 7.


3.2 Teleological Mechanisms

When he met Turing, von Neumann already knew Norbert Wiener. Wiener was an MIT mathematician

who shared von Neumann’s background in logic and current mathematical interests. Wiener was also

trained in philosophy, in which he had published several papers in the 1910s.24 Wiener and von Neumann

met around 1933 and became friends, beginning a scientific dialogue that lasted many years.25 Von

Neumann sometimes referred to their meetings as “mathematical conversations” to which he looked

forward.26

Wiener was interested in designing and building computing mechanisms to help solve problems

in mathematical physics. His correspondence about mechanical computing with Vannevar Bush, the

main designer of the differential analyzer and other analog computing mechanisms, stretches as far back

as 1925.27 From at least the mid-1930s, Wiener was involved in the design of analog computing

devices. According to Bush, Wiener made the original suggestion for the design of an analog computing

mechanism called the optical integraph, and was an expert on analog computing in general.28 In 1935,

Wiener proposed to Bush “a set up of an electrical simultaneous machine,” which was another analog

machine. Wiener briefly described the machine and its proposed use. In the same letter, he also

commented on a paper by Bush on some other analog machine, which might have been the differential

analyzer.29

In 1940, Bush was Chairman of the National Defense Research Committee of the Council of

National Defense. World War II had started and the US would soon enter it. The National Defense

Research Committee was in charge of recruiting talented scientists and assigning them to “national

defense” projects. Wiener proposed to Bush a new design for a computing machine for solving boundary

24 Reprinted in Wiener 1976.

25 Letter by von Neumann to Wiener, dated November 26, 1933. Norbert Wiener Papers, Box 2, ff. 38. Letter by

von Neumann to Wiener, dated March 9, 1953. Norbert Wiener Papers, Box 11, ff. 166. For a comprehensive study

of the relationship between Wiener and von Neumann, see Heims 1980.

26 Letter by von Neumann to Wiener, dated April 9, 1937. Norbert Wiener Papers, Box 3, ff. 47.

27 Letter by Wiener to Bush, dated August 18, 1925. Norbert Wiener Papers, Box 2, ff. 27.

28 Letter by Bush to Ku, dated May 26, 1935. Norbert Wiener Papers, Box 3, ff. 42.

29 Letter by Wiener to Bush, dated September 22, 1935. Norbert Wiener Papers, Box 3, ff. 43.


value problems in partial differential equations. Unlike analog computing machines, where the variables of

differential equations were represented by continuously varying physical quantities, the method proposed

by Wiener consisted in replacing a differential equation with a difference equation asymptotically

equivalent to it, and using the difference equation to generate successive representations of the states of

the system using the “binary system” (i.e., binary notation).30 Wiener’s proposed machine was thus of a

kind that a few years later would be called digital rather than analog.

Bush responded to Wiener’s proposal with interest and asked for clarifications on the design and

range of applicability of the machine.31 Wiener convinced Bush that his proposed machine would be

“valuable” and “could be successfully constructed,” and Bush seriously considered whether to assign

funds for its construction.32 He ultimately declined to fund the project, because Wiener’s machine would

yield mostly long-term advantages to national defense, and the researchers who were qualified to work on

developing Wiener’s machine were needed for more urgent “defense research matters.”33

Wiener also had a long-standing interest in biology. In the late 1930s he met Arturo Rosenblueth

and started discussing with him the possibility of formulating a mathematical characterization of the

possible behaviors of organisms, analogously to how engineers described the possible behaviors of

machines. Wiener explained his project in a letter to an old acquaintance, the British biologist J. B. S.

Haldane:

I am writing to you … [about] some biological work which I am carrying out together with

Arturo Rosenblueth… Fundamentally the matter is this: Behaviorism as we all know is an

established method of biological and psychological study but I have nowhere seen an adequate

attempt to analyze the intrinsic possibilities of types of behavior. This has become necessary to

me in connection with the design of apparatus to accomplish specific purposes in the way of the

repetition and modification of time patterns. … [T]he problem of examining the behavior of an

instrument from this point of view is fundamental in communication engineering and in related

fields where we often have to specify what the apparatus between four terminals in a box is to do

for we take up the actual constitution of the apparatus in the box. We have found this method of

examining possibilities of time pattern behavior, quite independently of their realization, a most

useful way of preparing for their later realization. Now this has left us with a large amount of

30 Letters by Wiener to Bush, dated September 20 and 23, 1940. Norbert Wiener Papers, Box 4, ff. 58.

31 Letters by Bush to Wiener, dated September 24 and 25, 1940. Norbert Wiener Papers, Box 4, ff. 58.

32 Letters by Bush to Wiener, dated October 7 and 19, and December 31, 1940. Norbert Wiener Papers, Box 4, ff

58.

33 Letter by Bush to Wiener, dated December 31, 1940. Norbert Wiener Papers, Box 4, ff 58.


information on a priori possible types of behavior which we find most useful in discussing the

normal action and the disorders of the nervous system.34

Perhaps because he was friends with Rosenblueth, McCulloch found out about Wiener’s

work on control engineering and his interest in biology. Rosenblueth arranged a meeting between

McCulloch and Wiener, which according to McCulloch occurred during the spring of 1940 or 1941. As

McCulloch recalled it, he was:

… amazed at Norbert [Wiener]’s exact knowledge, pointed questions and clear thinking in

neurophysiology. He talked also about various kinds of computation and was happy with my

notion of brains as, to a first guess, digital computers, with the possibility that it was the temporal

succession of impulses that might constitute the signal proper.35

In using the term “digital computers” here, McCulloch was being a bit anachronistic. In 1941 there were

no modern digital computers in operation, and the term “digital computer” had not been used yet.

Nevertheless, it is likely that McCulloch explained to Wiener his ideas about information flow through

ranks of neurons in accordance with Boolean algebra.36

Due to his expertise in computing mechanisms, in 1941 Wiener was appointed as a consultant to

the government on machine computation. He began a strenuous research program on fire control for

antiaircraft artillery with a young Research Associate at MIT, Julian Bigelow.37 Bigelow had graduated

in electrical engineering from MIT in 1939, and then worked as an electronics engineer at IBM’s Endicott

Laboratories in 1939-1941. Wiener developed a mathematical theory for predicting the curved flight

paths of aircraft, and Bigelow designed and built a machine that performed the necessary computations.38

Wiener and Bigelow’s work on mechanical prediction was classified, but they were allowed to

communicate with other researchers interested in mechanical computation. In October 1941, Wiener and

34 Letter by Wiener to Haldane, dated June 22, 1942. Norbert Wiener Papers, Box 4, ff. 62.

35 McCulloch 1974, pp. 38-39.

36 This may be the source of Wiener’s early comparison between digital computing mechanisms and brains, which

Wiener attributes to himself—without crediting McCulloch—in Wiener 1958. If Wiener’s meeting with McCulloch

occurred before September 1940, and if Wiener is correct in attributing his September 1940 proposal of a digital

computer to an attempt to imitate the brain’s method of calculation, then McCulloch’s theory partially inspired one

of the first designs of a digital computer. Cf. Aspray 1985, p. 125.

37 Letter by Wiener to J. Robert Kline, dated April 10, 1941. Norbert Wiener Papers, Box 4, ff. 59.

38 Julian Bigelow’s Vita, undated, Papers of John von Neumann, Box 2, ff. 7.


Bigelow received a visit from John Atanasoff, the designer of the ABC computer.39 The ABC computer,

whose construction began in January 1941, was completed in May 1942 and was the first machine to

perform computations using electronic equipment.40 One day in 1942, Wiener and Bigelow “had a lively

discussion on the stairs in the math building at Harvard” with Howard Aiken, another leader in

mechanical computing.41 Aiken was working on a giant electro-mechanical digital computer, the Harvard

Mark 1, which was completed in 1944.

According to Wiener’s recollection, Bigelow convinced him of the importance of feedback in

control mechanisms like those they were designing, and persuaded him that what mattered to the automatic

control of mechanisms was not any particular physical quantity (such as energy, length, or voltage) but only

“information” (used in an intuitive sense), conveyed by any means.42 Wiener seized on the importance of

feedback and information in control mechanisms, and turned it into a cornerstone of his thinking about

the most complex of all control mechanisms—the nervous system.

Wiener merged his work with Bigelow on feedback and information in control mechanisms with

the project on possible behaviors that he was carrying out with Rosenblueth. Rosenblueth, Wiener, and

Bigelow jointly wrote a famous paper on “teleological” mechanisms (Rosenblueth, Wiener, and Bigelow

1943). The paper classified behaviors as follows. First, it distinguished between purposeful (goal-

seeking) and non-purposeful behavior. Then, it divided purposeful behavior into teleological (involving

negative feedback) and non-teleological behavior. Teleological behavior, in turn, was divided into

predictive and non-predictive behavior. The authors argued that by taxonomizing behaviors in this way,

it was possible to study organisms and machines in the same behavioristic way, i.e. by looking at the

correlation between their inputs and outputs without worrying about their internal structure. These ideas,

and especially the role played by feedback in accounting for teleological behavior, would soon attract a

lot of attention. Their impact started to be felt before the paper was published.

39 Letter by Warren Weaver to Atanasoff, dated October 15, 1941. Norbert Wiener Papers, Box 4, ff. 61.

40 On Atanasoff’s computer and its relation to later electronic computers, see Burks 2002.

41 Letter by Bigelow to Wiener, dated August 7, 1944. Norbert Wiener Papers, Box 4, ff. 66.

42 Wiener 1948, Introduction; see also McCulloch 1974, p. 39.


In April 1942, McCulloch was invited to a conference on Cerebral Inhibition, sponsored by the

Josiah Macy, Jr. Foundation.43 The meeting was organized by Frank Fremont-Smith, M.D., who was

Director of the Foundation’s Medical Division. The discussion at the meeting was to focus on

conditioned reflexes in animals and hypnotic phenomena in humans, both of which were believed to be

related to cerebral inhibition. “It is hoped that by focussing the discussion upon physiological

mechanisms underlying the two groups of phenomena,” wrote Fremont-Smith, “gaps in our knowledge,

as well as correlations, may be more clearly indicated.”44 The format of the conference, which took place

in New York City on May 14th and 15th 1942, included two formal presentations on conditioned reflexes

followed by two days of informal discussion among the participants. One of those participants was

Rosenblueth. During the meeting, Rosenblueth talked about the ideas he had developed with Wiener and

Bigelow about teleological mechanisms. According to McCulloch, Rosenblueth “implanted the feedback

explanation of purposive acts in [Fremont-Smith]’s mind.”45

Fremont-Smith hoped to get the same group to meet again,46 and McCulloch saw this as an

opportunity to foster his ideas about mind and brain.47 As soon as McCulloch found the time, he wrote a

long letter to Fremont-Smith, in which he indicated:

… what I hope we can chew over at great length when we are next together, for they are points

which I feel are very important to the understanding of the problems confronting us; and I feel

sure that your procedure in keeping the group together, discussing long enough to get through the

words to the ideas, is the most profitable form of scientific investigation of such problems.

In his letter, McCulloch outlined his views about neural explanations of mental phenomena, including his

use of symbolic logic to model neural activity, and endorsed Rosenblueth’s point about “the dependence

of ‘goal directed’ behavior upon ‘feed-back’ mechanisms.”48

McCulloch’s connection with Fremont-Smith soon turned into a friendship.49 Their relationship

would bear fruit a few years later, in the form of several grants by the Macy Foundation to

43 Letter by Fremont-Smith to McCulloch, dated April 27, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

44 Memorandum by Fremont-Smith, dated May 11, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

45 Letter by McCulloch to Rosenblueth, dated February 14, 1946. Warren S. McCulloch Papers, ff. Rosenblueth.

46 Letter by Fremont-Smith to McCulloch, dated April 27, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

47 McCulloch 1974, p. 39.

48 Letter by McCulloch to Fremont-Smith, dated June 24, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.


McCulloch’s lab as well as what came to be known as the Macy Meetings on cybernetics. By 1942,

though, McCulloch was about to publish his long-in-the-works theory of the brain, with help from a new

and important character.

3.3 Walter Pitts

Walter Pitts fled his parental home around the age of 15, and never spoke to his family again.50 In

1938—at the age of 15—he attended a lecture by Bertrand Russell at the University of Chicago. During

the lecture, he met an eighteen-year-old fellow in the audience, Jerome Lettvin, who was preparing for

medical school by studying biology at the University of Chicago. Pitts and Lettvin became best friends.51

According to Lettvin, by the time the two met, Pitts “had, for a long time, been convinced that

the only way of understanding nature was by logic and logic alone.”52 Here is Lettvin’s recollection of

the origin of that view, as well as a poignant description of Pitts’s unique personality:

Pitts was married to abstract thought. Once, Pitts told us that when he was twelve years old he

was chased by some bullies into a public library, where he hid in the stacks. There he picked up

Russell and Whitehead’s Principia Mathematica and could not put it down. For the next week he

lived in the library from opening to closing time, going through all three volumes. It seemed to

him then that logic was magic, and if he could master that magic and practice it, the whole world

would be in his hegemony—he would be Merlin. But to do this one had to do away with self.

Ego must never enter, but only Reason. And at that moment of revelation he committed

ontological suicide. That is the peculiar truth about Pitts, whom all of us loved and protected.

We never knew anything about his family or his feelings about us. He died mysterious, sad and

remote, and not once did I find out, or even want to find out more about how he felt or what he

hoped. To be interested in him as a person was to lose him as a friend.53

Other witnesses concurred with Lettvin’s assessment. People who knew him personally described Pitts as

shy, introverted, and socially awkward.54

49 At least by August 1943, the two started addressing their letters “Dear Warren” and “Dear Frank” rather than

“Dear Doctor McCulloch” and “Dear Doctor Fremont-Smith.” Warren S. McCulloch Papers, ff. Fremont-Smith.

50 Lettvin 1989b, p. 514.

51 Heims 1991, p. 40; Smalheiser 2000, p. 219. Letter by Lettvin to Wiener, dated ca. April, 1946. Norbert Wiener

Papers, Box 4, ff. 70. Accounts of Pitts’s life contain fictionalized stories, apparently propagated by McCulloch.

Smalheiser gives a nice summary of Pitts’s life, work, and personality. The most reliable source on Pitts seems to be

his “life-long friend” Lettvin (Lettvin 1989a, 1989b).

52 Lettvin 1989a, p. 12.

53 Lettvin 1989b, p. 515.

54 Smalheiser 2000, pp. 220-221.


Nevertheless, there is consensus that Pitts became knowledgeable and brilliant. The “magic” he

performed at twelve with the Principia Mathematica may have worked, because Lettvin continued his

story as follows:

[I]f a question were asked about anything whatever—history, literature, mathematics, language,

any subject at all, even empirics such as systematic botany or anatomy, out would come an

astonishing torrent, not of disconnected bits and pieces of knowledge, but an integral whole, a

corpus, an organized handbook with footnotes and index. He was the very embodiment of mind,

and could out-think and out-analyze all the rest of us.55

In the late 1930s, Pitts started auditing classes at the University of Chicago, without enrolling as a

student. He studied logic with Carnap and biophysics with Nicolas Rashevsky.56 Rashevsky was a

Russian physicist who had established the Committee on Mathematical Biology, a pioneering research

group in biophysics that included Frank Offner, Herbert Landahl, and Alston Householder. Rashevsky

advocated the development of mathematical models of idealized biological processes, applying to biology

the methodology of theoretical physics (Rashevsky 1936, 1937, 1938).57 One area on which Rashevsky

and his group worked was the nervous system.

Pitts became a member of Rashevsky’s group and quickly began doing original research, though he never

earned a degree. In the early 1940s, Pitts published several papers on neural networks in

Rashevsky’s journal, the Bulletin of Mathematical Biophysics (Pitts 1942a, 1942b, 1943). According to

Lettvin, it was during this time, namely before meeting McCulloch, that Pitts developed the view that the

brain is a “logical machine.”58

55 Lettvin 1989b, p. 515.

56 Glymour notes that Carnap’s role in the history of computationalism includes his teaching both Pitts and another

pioneer, Herbert Simon (Glymour 1990). To them, I should add another student of Carnap who would play an

important role in the history of computationalism, namely Ray Solomonoff. Unfortunately, a search through the

Rudolf Carnap Collection at the Archives of Scientific Philosophy, University of Pittsburgh, uncovered no

information on Carnap’s relationship to Pitts, Simon, or Solomonoff. Aside from Carnap’s teaching these

individuals (and inspiring Solomonoff’s early work), I have found no evidence of a significant impact by Carnap or

his work on the main founders of computationalism.

57 On Rashevsky and his group, see Abraham 2001b; Abraham 2002, pp. 13-18.

58 Lettvin interview, in Anderson and Rosenfeld 1998, p. 3. Lettvin also put it this way (using slightly anachronistic

terminology):

Quite independently, McCulloch and Pitts set about looking at the nervous system itself as a logical

machine in the sense that if, indeed, one could take the firings of a nerve fiber as digital encoding of

information, then the operation of nerve fibers on each other could be looked at in an arithmetical sense as

a computer for combining and transforming sensory information (Lettvin 1989a, p. 10).


3.4 McCulloch Meets Pitts

In the fall of 1941, McCulloch moved to the University of Illinois in Chicago. He was hired by the

Department of Psychiatry to build up a team of specialists and study the biological foundations of mental

diseases.59 At the University of Illinois, the lab’s electrical engineer was Craig Goodwin, who knew the

theory and design of control devices. Goodwin introduced McCulloch to this area of research, which

included topics like automatic volume controls and self-tuning devices. According to McCulloch, he

learned from Goodwin that when the mathematics of the hardware, e.g. coupled nonlinear oscillators, was

intractable, i.e. when the equations representing the system could not be solved analytically, they could

still build a working model and use it to think about the problem.60

In 1939, Lettvin started medical school at the University of Illinois, where his anatomy teacher

was Gerhard von Bonin. After McCulloch moved to Chicago in 1941, von Bonin introduced Lettvin to

McCulloch.61 McCulloch, who once called Lettvin “the brightest medical student I have ever known,”

exerted a strong influence on Lettvin, and later convinced him to do research on the brain.62 Once in

Chicago, McCulloch also made contact with Rashevsky’s group. He started attending the group’s

seminar, where Lettvin introduced him to the then almost-eighteen-year-old Pitts.63 Like Carnap and

Rashevsky before him, McCulloch was “much impressed” by Pitts.64

When McCulloch presented his ideas about information flow through ranks of neurons to

Rashevsky’s seminar, Pitts was in the audience. Pitts showed interest in a problem that McCulloch had

59 McCulloch 1974, p. 35.

60 McCulloch 1974, p. 35. This may be significant in light of McCulloch’s later ideas about building mechanical

models of the brain.

61 Lettvin 1989b, p. 514.

62 Letter by McCulloch to Henry Moe, dated December 30, 1959. Warren S. McCulloch Papers, ff. Gerard. Letter

by Lettvin to Wiener, dated ca. April, 1946. Norbert Wiener Papers, Box 4, ff. 70.

63 Lettvin 1989b, p. 515.

64 McCulloch 1974, pp. 35-36.


struggled with: the problem of how to give a mathematical treatment of regenerative nervous activity in

closed neural loops.65

McCulloch hypothesized that closed neural loops explained neural processes that, once started,

continued on by themselves. Initially, McCulloch was thinking about pathological conditions, such as the

neural activity of epileptic patients, phantom limbs, compulsive

behavior, anxiety, and the effects of shock therapy. But Lawrence Kubie had postulated closed loops of

activity to explain memory (Kubie 1930), and Lorente de Nó had shown the significance of closed loops

in vestibular nystagmus (Lorente de Nó 1938). This convinced McCulloch that closed loops of activity

could fulfill positive neural functions. By the time McCulloch met Pitts, McCulloch thought closed loops

could account for memory and conditioning, but he still didn’t know how to think mathematically about

them.66

McCulloch and Pitts started working together; they worked so closely that Pitts (as well as

Lettvin) moved in with McCulloch and his family for about a year in Chicago. McCulloch and Pitts

became intimate friends and remained so until their deaths in 1969.67 For two years, they worked

largely on the problem of how to treat closed loops of activity mathematically. According to McCulloch,

the solution was worked out mostly by Pitts using techniques that McCulloch didn’t understand. To build

up their formal theory, they adopted what they saw as Carnap’s rigorous terminology, which Pitts knew

from having studied with Carnap. Thus, according to McCulloch, Pitts did all the difficult technical

work.68 The resulting paper was published in Rashevsky’s journal in 1943, with a brief follow-up written

by McCulloch and Pitts with Herbert Landahl, another member of Rashevsky’s group, on a statistical

application of the theory.

65 McCulloch 1974, pp. 35-36.

66 McCulloch 1974, p. 36.

67 Shortly before both of them died, Pitts wrote McCulloch from his hospital bed, commenting in detail on their

conditions and expressing the wish that they meet again and talk about philosophy. Letter by Pitts to McCulloch,

dated April 21, 1969. Warren S. McCulloch Papers, ff. Pitts.

68 McCulloch 1974, p. 36.


4 BRAINS COMPUTE THOUGHTS, 1943

One would assume, I think, that the presence of a theory, however strange, in a field in which no

theory had previously existed, would have been a spur to the imagination of neurobiologists…

But this did not occur at all! The whole field of neurology and neurobiology ignored the

structure, the message, and the form of McCulloch’s and Pitts’s theory. Instead, those who were

inspired by it were those who were destined to become the aficionados of a new venture, now

called Artificial Intelligence, which proposed to realize in a programmatic way the ideas

generated by the theory (Lettvin 1989a, p. 17).

4.1 A Mechanistic Theory of Mind

McCulloch believed that the goal of neurophysiology and psychiatry was to explain the mind, and that

scientists had not seriously tried to construct a neural theory to this effect. A serious obstacle was what

philosophers called the mind-body problem. In a commentary on a paper presented in May 1943 at the

Illinois Psychiatric Society, McCulloch explained:

We have a dichotomy in medicine, which has grown increasingly… Psychiatric approach on one

side, particularly the psychoanalytic approach, has produced one group; the organic approach to

the physiology of particular organs and disease processes has made organicists of another group.

It has grown difficult for us to talk to each other. I am afraid that there is still in the minds of

most of us, and that there probably will be for years, that difficulty which concerned and still

concerns many thinking people—I mean the dichotomy between mind and body.1

McCulloch continued his commentary by saying that there were “two types of terminology”:

“mental terms” were used to describe “psychological processes, for these exhibit ideas and intentions”;

“physical terms” were used to describe “bodily processes, for these exhibit matter and energy.” But:

… it remains our great difficulty that we have not ever managed to conceive how our patient—

our monad—can have a psychological aspect and a physiological aspect so divorced. You may

think that I am exaggerating the difficulty here, but there have appeared within the last few years

two books which tilt at the same windmill. One is Sherrington, called “Man and His Nature,” and

in it Sherrington, the marvelously honest physiologist, attempts to make head and tail of the

mind-body relation, but is frustrated because in that world “Mind goes more ghostly than a

ghost.” The other book, by Wolfgang Koehler (the founder of Gestalt psychology), is entitled

“The Place of Value in a World of Fact,” but in spite of his endless searching, you will be

convinced that he has not found the place of value in the world of fact. Such was the

unsatisfactory state of our theory until very recently.2

1 Discussion by Dr. McCulloch of a paper by Dr. Alexander on Fundamental Concepts of Psychosomatic Research,

Illinois Psychiatric Society, dated May 22, 1943. Warren S. McCulloch Papers.

2 Ibid. The works referenced by McCulloch are Sherrington 1940 and Köhler 1938.


After thus stating the mind-body problem, McCulloch pointed to two recent developments that gave hope

for its solution.

As an answer to the question of “the place of values in a world of fact,” McCulloch cited the

newly published work of Rosenblueth, Wiener, and Bigelow (1943), which used the notion of feedback to

account for teleological behavior. As to what McCulloch called the “formal” aspect of mind, he promised

he was going to have something to contribute soon:

At the present time the other mental aspect of behavior—I mean its ideational or rational, formal

or logical aspect—is coming to the fore. This work … should be coming to fruition in the next

year or two… We do resent the existing hiatus between our mental terminology and our physical

terminology. It is being attacked in a very realistic fashion today. So while we do at the moment

think of it as a “leap from psyche to soma,” we are busy bridging the gap between mental

processes and physical processes. To this audience it is interesting that that bridge is being made

by demonstrating that the properties of systems which are like our nervous system necessarily

show those aspects of behavior that make us call it “mental”—namely, ideas and purposes.3

The explanation for the “formal” aspect of the mind, and hence the solution to that component of the

mind-body problem, was about to be offered by McCulloch in the paper he was writing with Walter Pitts.

Their way of solving the problem was to demonstrate how a system of neuron-like elements embodied

ideas.

In a letter to Frank Fremont-Smith, written a few months before the commentary

cited above, McCulloch was more detailed and explicit about what he hoped to accomplish with his theory

and the role that logic played in it:

As to the “formal” properties [of the mind], it is perfectly possible today (basing the work on the

all-or-none law and the requirement of summation at a synapse and of inhibition either at a

synapse or by preoccupation of a requisite pool of internuncials) to show that neuronal reactions

are related to antecedent neuronal reactions—I mean reactions in parts of the nervous system

afferent to the reaction in question—in a manner best schematized by symbolic logic; in brief,

that the efferent impulses are related to the afferent impulses as logical consequences are related

to logical antecedents, and hence that classes of the latter are so related to classes of the former.

Little consideration is necessary to show that neuronal and all other reactions which

derive their energy metabolically and are triggered off by something else, being reactions of the

zero order with respect to what initiates them, bear to their precipitating causes the same relation

that propositions do to that which they propose. If then, from the sense organ forward, the

reaction of subsequent neurones is dependent upon any selection from the totality of energy

delivered to the system, the response corresponds to an abstraction from that totality, so that

3 Ibid.


neural behavior is not only essentially propositional but abstract with respect to its precipitating

cause.4

Once again, McCulloch was describing the work he was pursuing with Pitts. The all-or-none law of

neural activity—namely, that neurons fire full-sized impulses or none at all, so that the effects of neural

activity depend only on the number of nerve impulses traveling through the nervous system—allowed

McCulloch and Pitts to use symbolic logic to describe

neural activity, so that inferential relations among propositions described causal streams of neural events.

This, for McCulloch, was enough to show that “neural behavior is essentially propositional” in a way that

explained mechanistically the “formal” aspect of the mind.

The sense in which neural behavior was essentially propositional was further clarified by

McCulloch in a letter to a neurophysiologist at the University of Chicago, Ralph Lillie. In February 1943,

he explained how “we might be able to see mechanistically the problem of ideas”:

[W]hat was in my mind was this: that neuronal activity bore to the world external to the

organism the relationship that a proposition bears to that to which it proposes. In this sense,

neuronal activity so reflects the external world as to account for that all-or-none characteristic of

our logic (and of our knowledge) which has been one of the greatest stumbling blocks to

epistemology. I think that for the first time we are in a position to regard scientific theory as the

natural consequence of the neuronal activity of an organism (here the scientist)… And this has

come about because the observed regularity—all-or-none of neurones, bears a one-to-one

correspondence to those peculiar hypothetical psychic atoms called psychons which preserve in

the unity of their occurrence both the all-or-none law and the property of reference characteristic

of propositions.5

Thanks to the all-or-none law, neural events stood in “one-to-one correspondence” to psychons, and just

like psychons and propositions, neuronal activity had “the property of reference.”

To solve the mind-body problem, McCulloch and Pitts formulated what they called a “logical

calculus of the ideas immanent in nervous activity” (McCulloch and Pitts 1943). As Frederic Fitch

pointed out in reviewing the paper for the Journal of Symbolic Logic, this was not quite a logical calculus

in the sense employed by logicians.6

A common misconception is that McCulloch and Pitts demonstrated that neural nets can compute

anything that Turing Machines can:

4 Letter by McCulloch to Fremont-Smith, dated June 24, 1942. Warren S. McCulloch Papers, ff. Fremont-Smith.

5 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

6 Fitch 1944.


McCulloch and Pitts proved that a sufficiently large number of these simple logical devices,

wired together in an appropriate manner, are capable of universal computation. That is, a

network of such ‘linear threshold’ units with the appropriate synaptic weights can perform any

computation that a digital computer can, though not as rapidly or as conveniently.7

As we shall see, this is incorrect in two respects. First, McCulloch and Pitts did not prove any results

about what their nets can compute, although they claimed that there were results to prove; second,

McCulloch-Pitts nets—as McCulloch and Pitts themselves recognized—are computationally less powerful than

Turing Machines.

The computational power of McCulloch-Pitts nets is only one among many issues raised by their

theory. Although this paper is cited often, it has received little careful attention. The great historical

importance of this paper, and its common misrepresentation, warrant that we study it closely. The rest of

this chapter is devoted to it.

4.2 Motivation

The paper started with the rehearsal of some established neurophysiological facts: the nervous system

was a network of neurons connected through synapses; neurons sent to each other excitatory and

inhibitory pulses8; and each neuron had a threshold determining how many excitatory and inhibitory

inputs were necessary and sufficient to excite it at a given time.9

Then, the authors introduced the main premise of their theory: the identification of neuronal

signals with propositions. This was presumably what justified their title, which mentioned a calculus of

ideas immanent in nervous activity. They introduced this identification in a curious and rather obscure

7 Koch and Segev 2000, p. 1171.

8 According to Lettvin, an important source of the logic gate model of the neuron was the recent discovery by David

Lloyd of direct excitation and inhibition between single neurons: “it was not until David Lloyd’s work in 1939-41

that the direct monosynaptic inhibitory and excitatory actions of nervous pulses were demonstrated. This finding,

more than anything else, led Warren and Walter to conceive of single neurons as doing logical operations (a la

Leibnitz and Boole) and acting as gates” (Lettvin’s 1988, Foreward to the second edition of Embodiments of Mind,

cited by Heims 1991, pp. 233-234). In light of McCulloch’s professions of belief in his logical conception of the

nervous system since the early 1930s, it is unclear how crucial Lloyd’s work was in motivating McCulloch and Pitts,

besides providing experimental validation of some of their ideas.

9 McCulloch and Pitts 1943, pp. 19-21.


way, appealing not to any explicit motivation but to unstated “considerations” made by one of the

authors:

Many years ago one of us, by considerations impertinent to this argument, was led to conceive of

the response of any neuron as factually equivalent to a proposition which proposed its adequate

stimulus. He therefore attempted to record the behavior of complicated nets in the notation of the

symbolic logic of propositions. The “all-or-none” law of nervous activity is sufficient to insure

that the activity of any neuron may be represented as a proposition. Physiological relations

existing among nervous activities correspond, of course, to relations among the propositions; and

the utility of the representation depends upon the identity of these relations with those of the logic

of propositions. To each reaction of any neuron there is a corresponding assertion of a simple

proposition. This, in turn, implies either some other simple proposition or the disjunction or the

conjunction, with or without negation, of similar propositions, according to the configuration of

the synapses upon and the threshold of the neuron in question.10

In light of what was said in Chapters 2 and 3, the author of the “considerations” was McCulloch, and the

considerations were those that led him to formulate first his theory of psychons, and then his theory of

information flow through ranks of neurons. A proposition that “proposes a neuron’s adequate stimulus”

was a proposition that said that the neuron received a certain input at a certain time. The authors did not

explain what they meant by “factual equivalence” between neuronal pulses and propositions, but their

language suggested they meant both that neuronal pulses were represented by propositions, and that

neuronal pulses had propositional content.

The theory was divided into two parts: one dealing with nets without closed loops of neural

activity, which in this paper are referred to as “circles,” the other dealing with nets with circles (cyclic

nets). The authors pointed out that the nervous system contains many circular, “regenerative” paths.11

The term “circle” may have been borrowed from Turing (1936-7), who had called a Turing Machine “circular” if its computation eventually stops producing output, and “circle-free” if its computation goes on producing output forever.

4.3 Assumptions

In formulating the theory, McCulloch and Pitts made the following five assumptions:

10 Ibid., p. 21; emphasis added.

11 Ibid., p. 22.


1. The activity of the neuron is an “all-or-none” process.

2. A certain fixed number of synapses must be excited within the period of latent addition in

order to excite a neuron at any time, and this number is independent of previous activity and

position on the neuron.

3. The only significant delay within the nervous system is synaptic delay.

4. The activity of any inhibitory synapse absolutely prevents excitation of the neuron at that

time.

5. The structure of the net does not change with time.12

These assumptions constituted an idealization of the known properties of neural nets.

Assumption (1) was simply the all-or-none law: neurons were believed to either pulse or be at rest. As to

(2), it was not strictly true, but in many cases it was considered a good approximation. As to (3), this was

probably the least explicit and least physiologically justified assumption of the theory. Under the heading

of “synaptic delay,” McCulloch and Pitts assumed that the timing of the activity of neural nets was

uniformly discrete, such that any neural event in a neural net occurred within one time interval of fixed

duration. This assumption had the effect of discretizing the continuous temporal dynamics of the net, so

that logical functions of discrete states could be used to describe the transitions between neural events.

As to (4) and (5), McCulloch and Pitts admitted that they were false of the nervous system. However, they showed that, under the other assumptions, nets that do not satisfy (4) and (5) are functionally equivalent to nets that do.13
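As a modern illustration (not part of McCulloch and Pitts's own presentation), assumptions (1), (2), and (4) can be captured in a few lines of code; the function name and interface here are hypothetical:

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """One McCulloch-Pitts formal neuron at a single discrete time step.

    excitatory, inhibitory: lists of 0/1 inputs arriving at the neuron's synapses.
    Returns 1 (pulse) or 0 (rest), per the all-or-none law (assumption 1).
    """
    if any(inhibitory):
        # Assumption (4): an active inhibitory synapse absolutely prevents firing.
        return 0
    # Assumption (2): a fixed number of excited synapses (the threshold) must be
    # active, independently of previous activity and position on the neuron.
    return 1 if sum(excitatory) >= threshold else 0
```

For instance, a neuron with threshold 2 fires on two simultaneous excitatory pulses (`mp_neuron([1, 1], [], 2)` returns 1) but not when any inhibitory input is active (`mp_neuron([1, 1], [1], 2)` returns 0). Assumption (3), the uniform synaptic delay, corresponds to stepping such functions forward one fixed time interval at a time.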

McCulloch and Pitts were perfectly aware that the neuron-like elements in their theory were quite

distant from real neurons: “Formal neurons were deliberately as impoverished as possible.”14 In a letter

12 Ibid., p. 22.

13 Ibid., pp. 29-30.

14 McCulloch 1974, p. 36.


written to a colleague asking for clarification after a public presentation of the theory, McCulloch wrote

as follows:

[W]e in our description restricted ourselves to the regular behavior of the nervous system,

knowing full well that irregularities can be and are frequently brought about by physical and

chemical alterations of the nervous system. As a psychiatrist, I am perhaps more interested in

these than in its regular activity, but they lead rather to a theory of error than a theory of

knowledge, and hence were systematically excluded from the description.15

In McCulloch’s eyes, the differences between real neurons and the elements employed in his theory were

inessential. His goal was not to understand neural mechanisms per se, but rather to explain how

something close enough to a neural mechanism could exhibit “knowledge,” the kind of “ideational,”

“rational,” “formal,” or “logical” aspect that was associated with the mind. McCulloch’s goal was to

offer, for the first time, an explanation of the mind in terms of neural-like mechanisms.

4.4 Nets Without Circles

McCulloch and Pitts’s technical language was cumbersome; here their theory is given in a slightly

streamlined form that makes it easier to follow. The neurons of a net N are denoted by c1, c2, … cn. A

primitive expression of the form Ni(t) means that neuron ci fires at time t. Expressions of the form Ni(t)

can be combined by means of logical connectives to form complex expressions that describe the behavior

of different neurons at certain times. For example, N1(t)&N2(t) means that neurons c1 and c2 fire at time t,

N1(t-1)∨N2(t-2) means that either c1 fires at t-1 or c2 fires at t-2 (or both), etc. These complex expressions

can in turn be combined by the same logical connectives. As well-formed combinations, McCulloch and

Pitts allowed only the use of conjunction (A&B), disjunction (A∨B), conjunction and negation (A&~B),

and a special connective S that shifts the temporal index of an expression backwards in time, so that

S(Ni(t)) = Ni(t-1). A complex expression formed from a number of primitive expressions N1(t), … Nn(t)

15 Letter by McCulloch to Ralph Lillie, ca. February 1943. Warren S. McCulloch Papers, ff. Lillie.

Cf. also Lettvin:

The Logical Calculus, McCulloch knew, was not even a caricature of any existing nervous process. Indeed

he made that very clear at the time of writing. But is [sic] was a possible and useful assembly of

axiomatized neurons, and that seemed to him a far greater accomplishment than a true description of any

definitely known neuronal circuit (of which none then existed) (Lettvin 1989b, p. 518).


by means of the above connectives is denoted by Expj(N1(t), … Nn(t)). In any net without circles, there

are some neurons with no axons inputting on them; these are called afferent neurons.
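The notation can be mimicked in code, which may help in following it. In this sketch (the representation is an illustrative choice, not McCulloch and Pitts's), a net's activity is recorded as a set of (neuron, time) pairs, and the allowed connectives combine primitive expressions:

```python
# Firing record: the set of (neuron index, time) pairs at which a pulse occurs.
record = {(1, 5), (2, 5), (1, 4)}

def N(i, t):
    """Primitive expression N_i(t): true iff neuron c_i fires at time t."""
    return (i, t) in record

# The well-formed combinations allowed by the theory, applied to sample primitives:
conj = N(1, 5) and N(2, 5)          # N1(5) & N2(5)
disj = N(1, 4) or N(2, 3)           # N1(4) v N2(3)
conj_neg = N(1, 5) and not N(2, 4)  # N1(5) & ~N2(4)
shifted = N(1, 5 - 1)               # S(N1(5)) = N1(4): the temporal shift
```

With the record above, all four sample expressions come out true; changing the record changes their truth values, just as changing a net's activity changes which expressions describe it.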

The two main technical problems McCulloch and Pitts wanted to solve were “to calculate the

behavior of any net, and to find a net which will behave in a specified way, when such a net exists.”16 In

terms of the theory, the problems can be formulated as follows:

First problem: given a net, find a class of expressions C such that for every neuron ci, in C there

is a true expression of the form

Ni(t) if and only if Expj(Ni-g(t-1), … Ni-2(t-1), Ni-1(t-1)),

where neurons ci-g, … ci-2, and ci-1 have axons inputting ci.

The significance of this expression is that it describes the behavior of any (non-afferent) neuron in terms

of the behavior of the neurons that are afferent to it. If a class C of such expressions is found,

propositional logic can describe the behavior of any non-afferent neuron in the net in terms of the

behavior of the neurons afferent to it.

Second problem: given an expression of the form

Ni(t) if and only if Expj(Ni-g(t-1), … Ni-2(t-1), Ni-1(t-1)),

find a net for which it is true.

McCulloch and Pitts showed that these problems were easily solved. To solve the first problem, they

showed how to write an expression describing the relation between the firing of any neuron in a net and

the inputs it receives from its afferent neurons. To solve the second problem, they showed how to

construct nets that satisfy their four combinatorial schemes (conjunction, disjunction, conjunction-cum-

negation, and temporal predecessor), giving diagrams that show the connections between neurons that

satisfy each scheme (figure 4-1). Then, by induction on the size of the nets, all expressions formed by

those combinatorial schemes are realizable by McCulloch-Pitts nets.17

16 Ibid., p. 24.

17 Their actual proof was not quite a mathematical induction because they didn’t show how to combine nets of

arbitrary size, but the technical details are unimportant here.
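In modern terms, the four schemes are threshold-gate constructions. The following sketch (names and interface hypothetical) realizes each scheme with the idealized neuron of section 4.3:

```python
def fire(excitatory, inhibitory, threshold):
    """Idealized neuron: absolute inhibition, fixed threshold, all-or-none output."""
    return 0 if any(inhibitory) else int(sum(excitatory) >= threshold)

# One-neuron nets realizing each combinatorial scheme:
def conjunction(a, b):    return fire([a, b], [], 2)  # A & B: threshold 2
def disjunction(a, b):    return fire([a, b], [], 1)  # A v B: threshold 1
def conj_negation(a, b):  return fire([a], [b], 1)    # A & ~B: b is inhibitory
def predecessor(a):       return fire([a], [], 1)     # S: one synaptic delay
```

Composing such units yields a net for any expression built from the four schemes, mirroring the inductive construction McCulloch and Pitts sketched.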


Figure 4-1. Diagrams of McCulloch and Pitts nets.

By giving diagrams of nets that satisfy simple logical relations between propositions and by showing how to combine them to satisfy more complex relations, McCulloch and Pitts

developed a powerful technique for designing circuits that satisfy given logical functions by using a few

primitive building blocks.18

McCulloch and Pitts’s goal was to explain mental phenomena. As an example, they offered an

explanation of a well-known heat illusion by constructing an appropriate net. A cold object touching the

skin normally causes a sensation of cold, but if it is held for a very brief time and then removed, it can

cause a sensation of heat. In designing their net, McCulloch and Pitts reasoned as follows. They started

from the known physiological fact that there are different kinds of receptors affected by heat and cold,

and they assumed that there are neurons whose activity “implies a sensation” of heat.19 Then, they

assigned one neuron to each function: heat reception, cold reception, heat sensation, and cold sensation.

Finally, they observed that the heat illusion corresponded to the following relations between three

18 This is the main aspect of their theory used by von Neumann in describing the design of digital computers (see the

next chapter). Today, McCulloch and Pitts’s technique is part of logic design, an important area of computer design

devoted to designing digital circuits for digital computers. The building blocks of contemporary logic design are

called logic gates. In modern terminology, McCulloch and Pitts’s nets are logic gates and combinations of logic

gates. For more on logic and computer design, see Chapter 10.

19 Ibid., p. 27.


neurons: the heat-sensation neuron fires either in response to the heat receptor or to a brief activity of the

cold receptor (figure 4-2).

Figure 4-2. Net explaining heat illusion. Neuron 3 (heat sensation) fires if and only if it receives two inputs, represented by the lines terminating on its body. This happens when either neuron 1 (heat reception) fires or neuron 2 (cold reception) fires once and then immediately stops firing. When neuron 2 fires twice in a row, the intermediate (unnumbered) neurons excite neuron 4 (cold sensation) rather than neuron 3, generating a sensation of cold.
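The behavior the caption describes can be written directly as time-shifted expressions. In the following sketch the particular delays are an illustrative reconstruction, with the intermediate neurons absorbed into the time shifts:

```python
def sensations(heat, cold, t):
    """heat, cold: sequences of 0/1 receptor activity indexed by time step.

    Returns (neuron 3, neuron 4): heat sensation and cold sensation at time t.
    Requires t >= 3 so that all referenced past steps exist.
    """
    # Neuron 3 (heat sensation) fires if the heat receptor fired one step ago,
    # or the cold receptor fired briefly: active at t-3 but silent at t-2.
    n3 = heat[t - 1] or (cold[t - 3] and not cold[t - 2])
    # Neuron 4 (cold sensation) fires if the cold receptor fired persistently.
    n4 = cold[t - 2] and cold[t - 1]
    return int(n3), int(n4)
```

A brief cold touch (`cold = [1, 0, 0, 0]`) thus yields a heat sensation at t = 3, while a sustained one (`cold = [1, 1, 1, 1]`) yields a cold sensation, reproducing the illusion.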

McCulloch and Pitts used this example for a general observation about the relation between

perception and the world:

This illusion makes very clear the dependence of the correspondence between perception and the

“external world” upon the specific structural properties of the intervening nervous net.20

Then, they pointed out that, under other assumptions about the behavior of the heat and cold receptors, the

same illusion could be explained by different nets (ibid., p. 28).

4.5 Nets With Circles, Computation, and the Church-Turing Thesis

The problems for nets with circles are analogous to those for nets without circles: given the behavior of a

neuron’s afferents, find a description of the behavior of the neuron; and find the class of expressions and a

method of construction such that for any expression in the class, a net can be constructed that satisfies the

expression. The authors pointed out that the theory of nets with circles is more difficult than the theory of

nets without circles. This is because activity around a circle of neurons can continue for an indefinite

amount of time, hence expressions of the form Ni(t) may have to refer to times that are indefinitely remote

20 Ibid., p. 28.


in the past. For this reason, the expressions describing nets with circles are more complicated, involving

quantification over times. McCulloch and Pitts offered solutions to the problems of nets with circles, but

their treatment of this part of the theory was very obscure, admittedly sketchy,21 and contained some

errors that make it hard to follow.22
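A toy simulation (illustrative, not from the paper) shows why circles raise this difficulty: in a two-neuron circle, a single external pulse can sustain activity indefinitely, so the truth of Ni(t) may depend on arbitrarily remote past times:

```python
def run_loop(external_input, steps):
    """Two neurons in a circle: each fires iff the other fired one step earlier;
    neuron a also fires on an external pulse. Returns the firing history."""
    a, b = 0, 0
    history = []
    for t in range(steps):
        a, b = int(external_input(t) or b), a
        history.append((a, b))
    return history
```

With a single pulse at t = 0 (`run_loop(lambda t: t == 0, 10)`), activity reverberates around the circle at every subsequent step, so no fixed lookback window suffices to describe the net's state; hence the quantification over times.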

At the end of this section, McCulloch and Pitts drew the connection between their nets and

computation:

It is easily shown: first, that every net, if furnished with a tape, scanners connected to afferents,

and suitable efferents to perform the necessary motor-operations, can compute only such numbers

as can a Turing machine; second, that each of the latter numbers can be computed by such a net; and that nets with circles can compute, without scanners and a tape, some of the numbers the machine can, but no others, and not all of

them. This is of interest as affording a psychological justification of the Turing definition of

computability and its equivalents, Church’s λ-definability and Kleene’s primitive recursiveness:

If any number can be computed by an organism, it is computable by these definitions, and

conversely.23

This brief passage is the only one mentioning computation. By stating that McCulloch-Pitts nets

compute, this passage provided the first known published link between computation and brain theory. It

was a pivotal statement in the history of computationalism.

It is often said that McCulloch and Pitts proved that their nets could compute anything that Turing

Machines can compute (e.g., Koch and Segev 2000). This misconception was initiated and propagated by

McCulloch himself. For instance, in summarizing the significance of their paper, McCulloch wrote to a

colleague:

[T]he original paper with Pitts entitled “A Logical Calculus of Ideas Immanent in Nervous

Activity” … sets up a calculus of propositions subscripted for the time of their appearance for any

net handling all-or-none signals, and shows that such nets can compute any computable number

or, for that matter, do anything any other net can do by the way of pulling consequences out of

premises.24

21 Ibid., p. 34.

22 Every commentator points this out, starting with Fitch 1944, p. 51. See also Arbib 1989. McCulloch and Pitts’s