All content in this area was uploaded by Leo Gugerty on May 21, 2017
NEWELL AND SIMON’S LOGIC THEORIST:
HISTORICAL BACKGROUND AND IMPACT ON COGNITIVE MODELING
Leo Gugerty
Psychology Department, Clemson University, Clemson, SC USA
Fifty years ago, Newell and Simon (1956) invented a “thinking machine” called the
Logic Theorist. The Logic Theorist was a computer program that could prove theorems in
symbolic logic from Whitehead and Russell’s Principia Mathematica. This was perhaps
the first working program that simulated some aspects of people’s ability to solve
complex problems. The Logic Theorist and other cognitive simulations developed by
Newell and Simon in the late 1950s had a large impact on the newly developing field of
information-processing (or cognitive) psychology. Many of the novel ideas about mental
representation and problem solving instantiated in the Logic Theorist are still a central
part of the theory of cognitive psychology, and are still used in modeling the complex
tasks studied in human factors psychology. This paper presents some of the theoretical
precursors of the Logic Theorist, describes the principles and implementation of the
program, and discusses its immediate and long-term impacts.
INTRODUCTION
Fifty years ago, in early 1956, Herbert Simon told
a group of students that “over the Christmas holiday, Al
Newell and I invented a thinking machine” (Simon,
1996, p. 206). This thinking machine was a program
called the Logic Theorist (Newell & Simon, 1956). Its
thinking consisted of creating proofs for theorems in
propositional logic. In fact, it could prove 38 of the 52
theorems in Chapter 2 of Whitehead and Russell’s
Principia Mathematica (1910). The Logic Theorist was
one of the first, and perhaps the first, working program
that simulated some aspects of people’s ability to solve
complex problems. The Logic Theorist and other
cognitive simulations developed by Newell, Simon and
Cliff Shaw in the late 1950s had a large impact on the
newly developing field of information-processing (or
cognitive) psychology. Many of the novel ideas about
mental representation and problem solving instantiated
in the Logic Theorist are still a central part of the theory
of cognitive psychology, and many of these ideas are
still used in modeling the complex tasks studied in
human factors psychology. This paper presents some of
the theoretical ideas that influenced the development of
the Logic Theorist, describes the principles and
implementation of the program, and discusses its
immediate and long-term impacts.
FORERUNNERS
In his autobiography, Herbert Simon (1996) notes
that he and Allen Newell were influenced by
theoreticians in mathematical logic and information
theory – such as Gödel (1931), Turing (1936) and
Shannon (1948) – who showed that complex
mathematical ideas and processes could be represented
by formal systems of symbols that were manipulated
according to well-defined rules. Some of these logicians,
especially Turing, claimed that these formal symbol
systems could be instantiated in physical machines. In
the late 1940s, these ideas led directly to the creation of
digital computers. Although digital computers initially
were used to implement primarily numerical
calculations, Turing (1950) and others felt strongly that
computer implementations of formal symbol systems
such as the Turing Machine could eventually exhibit a
variety of complex thinking processes. In addition to this
influence from formal logic, Simon also notes the
influence of researchers in cybernetics – such as
McCulloch and Pitts (1943) – who developed formal
models of mental processes with a close connection to
neurophysiology.
A number of researchers have noted that the
emergence of information-processing and cognitive
psychology was also influenced by a shift in the type of
tasks researchers attempted to study and model. In
particular, researchers shifted from the very simple tasks
of behaviorist psychology to the complex tasks studied
in human factors research, i.e., tasks involving problem
solving, communication and use of technology. For
example, after studying communication and decision-
making in submarine crews in 1947, Jerome Bruner
(1983, p. 62) commented, “develop a sufficiently
complex technology and there is no alternative but to
develop cognitive principles in order to explain how
people can manage it.” Another key driving force in
cognitive psychology, George Miller, studied how noise
affected radio communication during WW II (Waldrop,
2001). Simon (1996) cites human factors work during
WW II as influencing his thinking. And in work prior to
the Logic Theorist, Simon and Newell studied
organizational decision-making and military command
& control systems.
In the early 1950s, Simon and Newell began
collaborating and set themselves the goal of developing
a formal symbolic system that could execute a complex
thought process. They first focused on the task of chess
and then moved on to geometry proofs, but later dropped
both of these visual tasks because of difficulties in
formalizing the perceptual processes involved. In the fall
of 1955, Simon and Newell switched to modeling a less
visual task – proving theorems in propositional logic.
DEVELOPING THE LOGIC THEORIST
The major insight that helped Newell and Simon
understand how people generated logic proofs was to
focus on people’s heuristics. While an undergraduate at
Stanford, Newell had learned about the importance of
heuristics in problem solving from the mathematician
George Polya. Simon and Newell discovered potential
heuristics by noticing and recording their own mental
processes while working on proofs. By December of
1955, they had implemented some promising heuristics
in a fairly complete version of the Logic Theorist and
hand-simulated the operation of this program. In January
1956, they performed a detailed hand-simulation with
their family members and students acting out the various
“methods” of the program. In conjunction with this work
on the Logic Theorist program, Newell and Shaw
worked on developing a list-processing language (IPL)
that could implement the program on a computer. In
August 1956, the first Logic Theorist proof was run on a
JOHNNIAC computer (named after John von Neumann)
using IPL. In September 1956, the first published
description of the Logic Theorist was presented at the
Second Symposium on Information Theory at MIT
(Newell & Simon, 1956). Since the Logic Theorist was
then capable of proving a number of the theorems in
Whitehead and Russell’s Principia Mathematica, Simon
informed Russell of this fact, no doubt savoring the
irony that a program built on work in symbolic logic
could now itself generate proofs of important
theorems in symbolic logic.
DESCRIPTION OF THE LOGIC THEORIST
The basic principles underlying the Logic Theorist
are:
• Thinking is seen as processing (i.e., transforming)
symbols in short-term and long-term memories. These
symbols are abstract and amodal, that is, not
connected to sensory information.
• Symbols are seen as carrying information, partly by
representing things and events in the world, but mainly
by affecting other information processes so as to guide
the organism’s behavior. In Newell and Simon’s
words, “symbols function as information entirely by
virtue of their making the information processes act
differentially” (1956, p. 62).
• Symbols represent knowledge hierarchically. For
example, the representation of a logic expression in the
Logic Theorist was hierarchical, with elements and
sub-elements. Also the processes used by the Logic
Theorist were hierarchical, in that processes would set
sub-goals that initiated new processes.
• Complex problems are solved by the use of heuristics
that are fairly efficient but do not guarantee solution.
• The Logic Theorist works backwards from the
theorem to be proved by using the heuristics to make
valid inferences until it has reached an axiom.
The claim here is not that Newell and Simon
initiated any of these principles, but that they integrated
and applied them to develop a working system that could
solve complex problems.
Knowledge Representation
A logic expression (e.g., ~ P → (Q v ~ P) ; read as
“not P implies Q or not P”) is represented in the Logic
Theorist as a hierarchy of elements and sub-elements.
The main connective (here →) is the main element.
Other elements include the left (~ P) and right sub-
elements. In this expression the right sub-element is a
sub-expression, (Q v ~ P), which has its own main and
sub-elements. Each element (E) in an expression
contains up to 11 attributes, including:
• the number of negation signs before E,
• the connective in E (if any),
• the name of the variable or sub-expression in E (if
any),
• E’s position in the expression, and
• the location of the expression containing E in storage
memory.
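The hierarchy of elements and sub-elements described above can be illustrated with a minimal Python sketch. It is not the original IPL representation: the class name `Element`, the field names, and the path-style encoding of position ("L", "R", "RL", ...) are assumptions made for illustration, and only a few of the attributes are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Element:
    """One element of a logic expression (hypothetical subset of the attributes)."""
    negations: int = 0                  # number of negation signs before the element
    connective: Optional[str] = None    # "->" or "v" when the element is a sub-expression
    name: Optional[str] = None          # variable name when the element is a variable
    position: str = "main"              # assumed encoding: path from the main element
    left: Optional["Element"] = None    # left sub-element, if any
    right: Optional["Element"] = None   # right sub-element, if any

    def show(self) -> str:
        """Render the element and its sub-elements as text."""
        if self.connective is None:
            body = self.name
        else:
            body = f"({self.left.show()} {self.connective} {self.right.show()})"
        return "~" * self.negations + body


# The example expression from the text: ~P -> (Q v ~P)
expr = Element(
    connective="->",
    left=Element(negations=1, name="P", position="L"),
    right=Element(
        connective="v",
        position="R",
        left=Element(name="Q", position="RL"),
        right=Element(negations=1, name="P", position="RR"),
    ),
)

print(expr.show())  # (~P -> (Q v ~P))
```

Here the main element holds the connective →, and its right sub-element is itself an expression with its own main connective and sub-elements.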
The Logic Theorist contains two kinds of
memories, “working memories” for temporary storage
and “storage memory” for longer-term storage. A single
working memory holds a single element and its
attributes while solving a single problem. Usually one to
three working memories are used. Storage memory is
used for storage of expressions (e.g., axioms and proved
theorems) across problems, and for temporary storage of
expressions and sets of elements while solving a single
problem. Storage memory consists of lists, with each list
containing a whole expression or a set of elements. Each
list has a location label that is used to index the list from
working memories.
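The relationship between the two kinds of memory can be sketched roughly as follows. This is an illustrative sketch only: the dictionary-of-lists layout, the labels "AX1" and "AX2", and the helper `load` are assumptions, not details taken from the original program.

```python
# Storage memory: location label -> list holding a whole expression.
# Expressions are written as nested lists [connective, left, right].
storage = {
    "AX1": ["->", ["v", "p", "p"], "p"],   # (p v p) -> p
    "AX2": ["->", "q", ["v", "p", "q"]],   # q -> (p v q)
}

# Working memories: a few slots, each holding one element
# plus the location label of the list it was taken from.
working = [None, None, None]


def load(slot, label, path=()):
    """Copy the element found at `path` inside the list at `label` into a working memory."""
    element = storage[label]
    for index in path:
        element = element[index]
    working[slot] = {"element": element, "location": label}


load(0, "AX2")               # the whole expression
load(1, "AX2", path=(2,))    # its right sub-element, (p v q)

print(working[1])
```

The location label stored alongside each element is what lets a working memory refer back to the list in storage memory.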
Information Processes
The lowest-level unit in the Logic Theorist’s
information processes is an “instruction.” An example
instruction is: “find the right sub-element of the
expression in working-memory 1 and put this sub-
element in working-memory 2”. Instructions can also
shift control of processing to other instructions, using a
branching technique similar to “goto” statements in a
computer program.
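As a rough illustration of instructions with goto-style branching, the sketch below interprets a short list of instructions over two working memories. The instruction names ("right", "move", "branch_if_var", "goto", "halt") and the tuple encoding of expressions are hypothetical and are not the Logic Theorist's actual instruction set.

```python
def run(program, wm):
    """Execute instructions in sequence, following branches, until 'halt'."""
    pc = 0  # index of the instruction currently being executed
    while pc < len(program):
        op, *args = program[pc]
        if op == "right":            # put the right sub-element of wm[src] into wm[dst]
            src, dst = args
            wm[dst] = wm[src][2]
        elif op == "move":           # copy wm[src] into wm[dst]
            src, dst = args
            wm[dst] = wm[src]
        elif op == "branch_if_var":  # jump to `target` if wm[slot] holds a variable
            slot, target = args
            if isinstance(wm[slot], str):
                pc = target
                continue
        elif op == "goto":           # unconditional jump
            pc = args[0]
            continue
        elif op == "halt":
            break
        pc += 1
    return wm


# Follow right sub-elements of P -> (Q v R) until a variable is reached.
program = [
    ("right", 1, 2),          # 0
    ("branch_if_var", 2, 4),  # 1
    ("move", 2, 1),           # 2
    ("goto", 0),              # 3
    ("halt",),                # 4
]
memories = run(program, {1: ("->", "P", ("v", "Q", "R")), 2: None})
print(memories[2])  # R
```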
The next-highest level in the Logic Theorist’s
information processes is “elementary processes.” Each
elementary process is a sequentially-executed list of
instructions and their associated control flow that
achieves a specific goal. Elementary processes are
similar to “routines” in a computer program or methods
in a GOMS model (Card, Moran & Newell, 1983).
The next-highest level in the Logic Theorist’s
information processes is “methods.” Each method is a
sequentially-executed list of elementary processes, along
with associated control flow. There are four main
methods in the Logic Theorist, each instantiating a
heuristic for proving logic theorems. The methods are:
• Substitution – this method seeks to transform one logic
expression (e.g., the theorem to be proved) into
another (e.g., an axiom) via a series of logically-valid
substitutions of variables and replacements of
connectives
• Detachment – this method implements the logical
inference rule of modus ponens, that is, if the goal is to
prove theorem B and the method can prove the
theorems A → B and A, then B is a proven theorem. If
the goal is to prove theorem B, the detachment method
first attempts to find a proved theorem in storage
memory of the form A → B where the right side either
matches B or can be made to match B by substitution.
If successful, a sub-goal is set to prove theorem A. If
A is not in the list of proved theorems, the detachment
method attempts to prove A by substitution.
• Chaining forward – this method implements the
transitive rule: if A → B and B → C, then A → C. If
the goal is to prove A → C, this method first searches
for a theorem of the form A → B (or which can be
transformed into A → B by substitution). If successful,
the method then attempts to prove B → C by
substitution.
• Chaining backward – in a similar manner, this method
attempts to prove A → C by first proving B → C, and
then A → B.
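The substitution and detachment methods described above can be sketched in Python under simplifying assumptions. The sketch covers only substitution of variables (not replacement of connectives) and omits the two chaining methods. The function names, the tuple encoding of expressions, and the two sample axioms are illustrative choices rather than the original implementation.

```python
# A variable is a str; compound expressions are tuples:
# ("->", A, B), ("v", A, B) or ("~", A).

def match(pattern, target, bindings=None):
    """Return variable bindings that turn `pattern` into `target`, or None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str):                 # a variable: bind it consistently
        if pattern in bindings:
            return bindings if bindings[pattern] == target else None
        bindings[pattern] = target
        return bindings
    if (isinstance(target, str) or pattern[0] != target[0]
            or len(pattern) != len(target)):
        return None
    for sub_pattern, sub_target in zip(pattern[1:], target[1:]):
        bindings = match(sub_pattern, sub_target, bindings)
        if bindings is None:
            return None
    return bindings


def substitute(expr, bindings):
    """Replace bound variables in `expr` by the expressions bound to them."""
    if isinstance(expr, str):
        return bindings.get(expr, expr)
    return (expr[0],) + tuple(substitute(e, bindings) for e in expr[1:])


def by_substitution(goal, theorems):
    """The goal is proved if it is a substitution instance of a known theorem."""
    return any(match(t, goal) is not None for t in theorems)


def by_detachment(goal, theorems):
    """Find A -> B whose right side matches the goal, then prove A by substitution."""
    for t in theorems:
        if isinstance(t, tuple) and t[0] == "->":
            bindings = match(t[2], goal)
            if bindings is not None:
                subgoal = substitute(t[1], bindings)
                if by_substitution(subgoal, theorems):
                    return True
    return False


AXIOMS = [
    ("->", ("v", "p", "p"), "p"),    # (p v p) -> p
    ("->", "q", ("v", "p", "q")),    # q -> (p v q)
]
goal_1 = ("->", ("v", ("~", "a"), ("~", "a")), ("~", "a"))   # (~a v ~a) -> ~a
goal_2 = ("v", "b", ("->", ("v", "a", "a"), "a"))            # b v ((a v a) -> a)

print(by_substitution(goal_1, AXIOMS))  # True
print(by_substitution(goal_2, AXIOMS))  # False
print(by_detachment(goal_2, AXIOMS))    # True
```

In the second goal, substitution alone fails, but detachment succeeds: the right side of q → (p v q) matches the goal, and the resulting sub-goal (a v a) → a is an instance of the first axiom.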
The highest-level information process in the Logic
Theorist is the executive control method. This method
applies the substitution, detachment, forward chaining,
and backward chaining methods, in turn, to each
proposed theorem.
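The control structure of the executive can be sketched as a loop that tries each method in a fixed order and stores whatever it proves. The two methods below are deliberately simplified stand-ins operating on strings (they are not the real substitution and detachment methods), and all names are hypothetical.

```python
def executive(proposed, storage, methods):
    """Apply each method in turn to each proposed theorem; store what is proved."""
    results = {}
    for theorem in proposed:
        results[theorem] = None                 # None means no method succeeded
        for name, method in methods:
            if method(theorem, storage):
                results[theorem] = name
                storage.append(theorem)         # available for later proofs
                break
    return results


# Stand-in for substitution: the goal is already in storage.
def already_known(goal, storage):
    return goal in storage


# Stand-in for detachment: storage holds some X and the string "X->goal".
def one_step(goal, storage):
    return any(f"{known}->{goal}" in storage for known in storage)


storage = ["A", "A->B", "B->C"]
methods = [("substitution", already_known), ("detachment", one_step)]
results = executive(["B", "C", "D"], storage, methods)
print(results)  # {'B': 'detachment', 'C': 'detachment', 'D': None}
```

In this toy run, C can be proved only because B was added to storage after its own proof, which mirrors how the program reuses earlier theorems in later proofs.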
Newell, Shaw and Simon (1958) saw the Logic
Theorist as an example of a program composed from
primitive information processes that could generate (or
perform) a complex behavior. They also pointed out that
information processing programs such as the Logic
Theorist offer explanations of the cognitive control
structures and processes underlying complex human
behavior.
Performance
In one test, the Logic Theorist was started with the
axioms of propositional logic in its storage memory and
then presented with 52 theorems to prove from Chapter
2 of Principia Mathematica. The theorems were
presented in the same order as in the book. Upon
proving a theorem, the Logic Theorist added it to its
storage memory for use in later proofs. Given these
constraints, the Logic Theorist was able to prove 38 (73%) of the 52
theorems. Using a computer that took 30 ms per
elementary information process, half of the theorems
were proved in less than 1 minute, and most in less than
5 minutes.
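The figures above can be checked with a little arithmetic. The short Python sketch below uses the count of 38 proved theorems reported in the Introduction and converts the time limits into numbers of elementary information processes; the variable names are illustrative.

```python
proved, total = 38, 52
share = f"{proved / total:.0%}"           # proportion of theorems proved
print(share)                              # 73%

MS_PER_PROCESS = 30                       # milliseconds per elementary process
per_minute = 60_000 // MS_PER_PROCESS     # processes executed in 1 minute
per_five_minutes = 300_000 // MS_PER_PROCESS
print(per_minute, per_five_minutes)       # 2000 10000
```

At 30 ms per elementary process, a 1-minute proof therefore corresponds to at most about 2,000 elementary processes, and a 5-minute proof to at most about 10,000.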
EVALUATION OF THE LOGIC THEORIST
In the rest of this paper, I will evaluate the Logic
Theorist by considering whether it is an artificial
intelligence (AI) program or a cognitive simulation, and
by assessing its immediate and long-term impacts on
theory and models in cognitive psychology.
AI program or cognitive simulation?
In an article in the Psychological Review in 1958,
Newell, Shaw and Simon pointed out that the elementary
information processes in the Logic Theorist were not
modeled after human thinking, and that the model was
not shaped by fitting to quantitative human data. Also,
the branching control structure and the list-based
knowledge representation of the Logic Theorist were
later determined to be psychologically implausible.
These considerations support the conclusion that the
Logic Theorist does not simulate human cognitive
processes, and therefore, given its intelligent behavior, is
an AI program.
On the other hand, the higher-level information
processes in the Logic Theorist – the methods
instantiating the four heuristics – were explicitly
modeled after the introspective protocols of Simon and
Newell themselves. Newell and Simon explicitly claim
that heuristics are a good way to model the quick but
error-prone nature of human problem solving, and they
used heuristics to model other kinds of problem solving
(e.g., chess) around this time. In their 1958
Psychological Review article, Newell et al. point out a
number of other similarities in how people and the Logic
Theorist solve logic problems – e.g., both generate sub-
goals, and both learn from previously solved problems.
These considerations suggest that in terms of higher-
level information processes such as heuristics,
subgoaling, and learning, the Logic Theorist was a
simulation of human cognition.
Immediate Impact of the Logic Theorist
The completed Logic Theorist was initially
presented to the research world on September 11, 1956
at a star-studded conference, the Second Symposium on
Information Theory at MIT. In addition to Newell’s
presentation, Noam Chomsky presented his ideas on
transformational grammar, George Miller discussed
limitations in short-term memory, and John Swets
applied signal detection theory to perceptual recognition.
Miller has called this day the “moment of conception” of
cognitive science (2003, p. 142).
Other evidence for the impact of the Logic
Theorist on other researchers is contained in Miller,
Galanter and Pribram’s book, Plans and the Structure of
Behavior (1960), which was itself a seminal early work
in cognitive psychology. This book outlines a theory of
how people use plans – structured knowledge – to guide
behavior, and it sketches out a formal, computer-
program-like mechanism for plans based on test-operate-
test-exit units. Thus, in a sense Plans was a
generalization of the ideas that Newell, Simon and Shaw
had actually implemented in the Logic Theorist. The
following excerpts from Plans demonstrate the strong
influence of the Logic Theorist on Miller et al.’s ideas.
• “The first intensive effort to meet this need … to
simulate the human chess player or logician … was the
work of Newell, Shaw and Simon (1956), who have
advanced the business of psychological simulation
further than anyone else” (p. 55)
• Referring to the use of a formalized program to solve
complex problems, Miller et al. praise Newell and
Simon’s “demonstration that what so many have long
described has finally come to pass.”
Miller et al. agreed with the emphasis of Newell
and Simon on heuristics as a general method for
modeling human problem solving, and they describe two
other heuristics used by Newell and Simon – means-ends
analysis and simplification (constraint relaxation) – in
work that led up to their General Problem Solver.
Finally, Miller et al. anticipated and gave answers
to some of the common criticisms of cognitive
simulations – criticisms that are still relevant today. The
first criticism is that cognitive simulations are too
complex and have too many parameters to qualify as a
valid, general model of behavior. Miller et al. reply that
“if the description is valid … the fact that it is
complicated can’t be helped. No benign and
parsimonious deity has issued us an insurance policy
against complexity.” While later cognitive modelers
have agreed that parsimony is not the most important
criterion for evaluating models of complex thinking
(Anderson, 1983), they have also tried to reduce the
number of free parameters in their models by using
consistent parameter estimates across models based on
empirical research in cognitive psychology (Card et al.,
1983; Kieras, Wood & Meyer, 1997).
The second criticism of cognitive simulations
anticipated by Miller et al. is the homunculus problem,
i.e., that cognitive simulations may need to posit a smart
but unexplained mental process to interpret mental
representations and make decisions. Miller et al.’s
response to this is that cognitive simulations solve
complex problems without a homunculus, using only
decision-making processes that are explicit and evident
in their rules and heuristics. The third criticism focuses
on how cognitive simulations are to be validated. Miller
et al. suggest Newell and Simon’s main validation
technique, verbal and behavioral protocols, as one way
of validating simulations. Later, proponents of cognitive
simulation developed other kinds of data to validate
models against, including human response times, error
rates, and eye movements.
Long-Term Impact of the Logic Theorist
In a review of the construct of mental
representation in cognitive psychology, Markman and
Dietrich (2000) describe the classical view of
representation in the same way that Newell and Simon
did for the Logic Theorist – i.e., that information
processing consists of transforming amodal mental
symbols so as to guide the organism’s action. Newell
and Simon were key figures in developing the classical
view of representation, which is still followed in a
number of cognitive modeling systems, including
GOMS (Card et al., 1983), ACT-R (Anderson &
Lebiere, 1998), SOAR (Newell, 1990), and EPIC
(Meyer & Kieras, 1997).
BEYOND THE LOGIC THEORIST
In the 1960s, Newell and Simon continued their
work on information processing programs for complex
problem solving. This work was published in their book
Human Problem Solving in 1972. In the early 1970s,
Newell initiated the use of production systems as an
alternative to the branching control structure of the
Logic Theorist (Newell, 1973). The modular nature of
productions is now felt by many to be a better
description of human procedural knowledge (e.g.,
Anderson & Lebiere, 1998) and production systems are
widely used in cognitive modeling architectures.
In addition to the switch from branching control
structures to production systems, cognitive modelers
have also updated a number of the other techniques used
in the Logic Theorist. In the 1990s, some cognitive
modelers integrated sub-symbolic, connectionist
processes into classical symbol-processing models. For
example, ACT-R now conditions retrieval of
information from declarative memory on the flow of
activation in a memory network with varying strengths
of associations among nodes. Also in the 1990s, when
creating the EPIC modeling architecture, Meyer and
Kieras (1997) integrated perceptual and motor processes
into the heretofore purely cognitive architectures.
Other changes in cognitive modeling architectures
are still in progress. These include shifting from amodal
symbols to symbols that include sensori-motor
representations (e.g., Barsalou, 1999), and integrating
emotional and stress responses into cognitive models.
However, Newell and Simon’s demonstration in
the Logic Theorist that an information processing
program could manipulate symbols so as to perform
complex problem-solving tasks is still reflected in
current cognitive modeling. Also, their use of heuristics
as the core of these information processing programs is
still very influential.
REFERENCES
Anderson, J. (1983). The Architecture of Cognition.
Cambridge, MA: Harvard University Press.
Anderson, J. R., Bothell, D., Byrne, M. D., Douglass, S.,
Lebiere, C., & Qin, Y. (2004). An integrated theory of
the mind. Psychological Review, 111, 1036-1060.
Anderson, J. & Lebiere, C. (1998). The Atomic Components of
Thought. Mahwah, NJ: Erlbaum.
Barsalou, L. (1999). Perceptual symbol systems. Behavioral
and Brain Sciences, 22, 577-660.
Bruner, J. (1983). In Search of Mind: Essays in Autobiography.
New York, NY: Harper & Row.
Card, S., Moran, T. & Newell, A. (1983). The Psychology of
Human Computer Interaction. Mahwah, NJ: Erlbaum.
Gödel, K. (1931). On formally undecidable propositions of
Principia Mathematica and related systems I. Monatshefte
für Mathematik und Physik, 38, 173-198.
Kieras, D., Wood, S. & Meyer, D. (1997). Predictive
engineering models based on the EPIC architecture for a
multimodal high-performance human-computer
interaction task. ACM Transactions on Computer-Human
Interaction, 4(3), 230-275.
Markman, A. & Dietrich, E. (2000). Extending the classical
view of representation. TRENDS in Cognitive Sciences,
4(12), 470-475.
McCulloch, W. & Pitts, W. (1943). A logical calculus of the
ideas immanent in nervous activity. Bulletin of
Mathematical Biophysics, 5, 115-130.
Meyer, D. & Kieras, D. (1997). A computational theory of
executive cognitive processes and multiple-task
performance: I. Basic mechanisms. Psychological Review,
104(1), 3-65.
Miller, G. A. (2003). The cognitive revolution: A historical
perspective. TRENDS in Cognitive Sciences, 7(3), 141-144.
Miller, G., Galanter, E. & Pribram, K. (1960). Plans and the
Structure of Behavior. New York, NY: Henry Holt and
Company.
Newell, A. (1973). Production systems: Models of control
structures. In W. Chase (Ed.), Visual Information
Processing, Oxford, England: Academic.
Newell, A. (1990). Unified Theories of Cognition. Cambridge,
MA: Harvard University Press.
Newell, A. & Simon, H. (1956). The logic theory machine: A
complex information processing system. IRE
Transactions on Information Theory, 2, 61-79.
Newell, A., Shaw, J. C. & Simon, H. A. (1958). Elements of a
theory of human problem solving. Psychological Review,
65(3), 151-166.
Shannon, C. E. (1948). A mathematical theory of
communication. Bell System Technical Journal, 27 (July
and October), 379-423 and 623-656.
Simon, H. (1996). Models of My Life. Cambridge, MA: MIT
Press.
Turing, A. (1936). On computable numbers, with an
application to the Entscheidungsproblem. Proceedings of
the London Mathematical Society, 2, 230-265.
Turing, A. (1950). Computing machinery and intelligence.
Mind, 59(236), 433-460.
Waldrop, M. M. (2001). The Dream Machine. New York, NY:
Penguin Group.
Whitehead, A. & Russell, B. (1910). Principia Mathematica.
Cambridge, UK: Cambridge University Press.