# Paul Werbos, National Science Foundation | NSF

Paul Werbos

BA, M.Sc., S.M, PhD

## About

167 Publications

62,944 Reads

16,970 Citations

## Publications

Publications (167)

Emergent phenomena
self-organization
life as entropy
mind as entropy
approximation theory

This paper presents a new position on the fundamental questions about quantum measurement, consciousness, and soul, which contradicts at least one sacred assumption of every major player in today’s debates. It starts from a first person viewpoint, which may be characterized as Rational German existentialism, in the spirit of Goethe, Von Neumann, He...

This paper presents a new position on the fundamental questions about quantum measurement, consciousness and soul, which contradicts at least one sacred assumption of every major player in today's debates. It starts from a first person viewpoint, which may be characterized as Rational German existentialism, in the spirit of Goethe, Von Neumann, Hei...

New algorithms and hardware technology offer possibilities for the pre-detection of terrorism far beyond even the imagination and salesmanship of people hoping to apply forms of deep learning studied in the IEEE Computational Intelligence Society (CIS) decades ago. For example, new developments in Analog Quantum Computing (AQC) give us a concrete p...

This paper addresses two fundamental questions: (1) Is it possible to develop mathematical neural network models which can explain and replicate the way in which higher-order capabilities like intelligence, consciousness, optimization, and prediction emerge from the process of learning (Werbos, 1994, 2016a; National Science Foundation, 2008)? and (...

The preface to this book raises several questions which are important to our understanding of the brain, in specific terms, and to the larger question of the strategy we use to try to develop a more complete understanding in the future: 1. How can we explain Freeman’s empirical observation that the cerebral cortex regularly undergoes abrupt shifts,...

This paper discusses what will be necessary to achieve the full potential capabilities of analog quantum computing (AQC), which is defined here as the enrichment of continuous-variable computing to include stochastic, nonunitary circuit elements such as dissipative spin gates and address the wider range of tasks emerging from new trends in engineer...

This paper provides new stability results for Action-Dependent Heuristic Dynamic Programming (ADHDP), using a control algorithm that iteratively improves an internal model of the external world in the autonomous system based on its continuous interaction with the environment. We extend previous results for ADHDP control to the case of general multi...

It has been proven that universal quantum computers based on qubits and classical analog networks both have superTuring capabilities. It is a grand challenge to computer science to prove that the combination of the two, in analog (continuous variable) quantum computing, offers supersuperTuring capability, the best we can achieve. Computing with con...

This paper gives highlights of the history of the neural network field, stressing the fundamental ideas which have been in play. Early neural network research was motivated mainly by the goals of artificial intelligence (AI) and of functional neuroscience (biological intelligence, BI), but the field almost died due to frustrations articulated in th...

Depending on the outcome of the triphoton experiment now underway, it is possible that the new local realistic Markov Random Field (MRF) models will be the only models now available to correctly predict both that experiment and Bell's theorem experiments. The MRF models represent the experiments as graphs of discrete events over space-time. This pa...

The most powerful form of quantum learning system possible would somehow
learn the parameters W of a quantum system f(X, W), for f representing the
largest, most powerful set of possible input-output relations. This paper
addresses the issue of how to enlarge the set represented by f, by using a new
formulation of time-symmetric physics to model an...

This paper defines and discusses Mouse Level Computational Intelligence
(MLCI) as a grand challenge for the coming century. It provides a specific
roadmap to reach that target, citing relevant work and review papers and
discussing the relation to funding priorities in two NSF funding activities:
the ongoing Energy, Power and Adaptive Systems progra...

A previous paper demonstrated that two local, realistic models based on
Markov Random Fields (MRF) across space time do replicate the correct, tested
predictions of quantum mechanics for simple Bell Theorem experiments. This
paper demonstrates a third such model, MRF3, which is more plausible
physically, making contact with the physics of polarizer...

This paper begins by reviewing the general form of the Glauber-Sudarshan P
mapping, a cornerstone of coherence theory in quantum optics, which defines a
two-way mapping between ensembles of states S of any classical Hamiltonian
field theory and a subset of the allowed density matrices {\rho} in the
corresponding canonical bosonic quantum field theo...

This paper shows that it is sometimes possible for a simple lumped parameter
model of a circuit to yield correct quantum mechanical predictions of its
behavior, even when there is quantum entanglement between components of that
circuit. It addresses a simple but important example, the circuit of the
original Bell's Theorem experiments for ideal pol...

This paper provides new stability results for Action-Dependent Heuristic
Dynamic Programming (ADHDP), using a control algorithm that iteratively
improves an internal model of the external world in the autonomous system based
on its continuous interaction with the environment. We extend previous results
by ADHDP control to the case of general multi-...

This chapter begins with a review and assessment of four key frontiers for the fields of memristors, neural networks and chaos: (1) use of learning architectures to expand the possible markets for dense memristor chips, crucial to applications such as power grid intelligent enough to improve the economics of renewable energy; (2) advanced modeling...

Many new formulations of reinforcement learning and approximate dynamic programming (RLADP) have appeared in recent years, as it has grown in control applications, control theory, operations research, computer science, robotics, and efforts to understand brain intelligence. This chapter reviews the foundations and challenges common to all these are...

This paper will describe how fuzzy logic, neural networks and other fundamental approaches to the nature of knowledge and epistemology fit together, both at a philosophical level and at the level of practical technology. The views herein are my own, but the bulk of the credit really belongs to Lotfi Zadeh and to the unusual, rich dialogue he has cr...

There has been a huge explosion of interest in the memristor since the first experimental confirmation by HP in 2008 (Strukov et al., Nature 453:80–83, 2008). Because the memristor and its variants provide a huge increase in memory density, compared with existing technologies like flash memory, many of us expect that they will move very quickly to...

Large-scale networks with hundreds of thousands of variables and constraints are becoming more and more common in logistics, communications, and distribution domains. Traditionally, the utility functions defined on such networks are optimized using some variation of Linear Programming, such as Mixed Integer Programming (MIP). Despite enormous progr...

Hard core neural network research includes development of mathematical models of cognitive prediction and optimization aimed at dual use, both as models of what we see in brain circuits and behavior, and as useful general-purpose engineering technology. The pathway and principles now exist to let us someday replicate learning abilities as elevated...

Optimization in large-scale networks - such as large logistical networks and electric power grids involving many thousands of variables - is a very challenging task. In this paper, we present the theoretical basis and the related experiments involving the development and use of visualization tools and improvements in existing best practices in mana...

There has been important new cross-disciplinary work using neural network mathematics to unify key issues in engineering, technology, psychology and neuroscience - and many opportunities to create a discrete revolution in science by pushing this work further. This strain of research has a natural link to clinical and subjective human experience - t...

This paper reviews the evolution of four generations of concepts of the “smart grid,” the role of computational intelligence in meeting their needs, and key examples of relevant research and tools. The first generation focused on traditional concepts like building more wires, automated meters, workforce development, and reducing blackouts, but it a...

The underlying dynamics ($\partial_t \psi = iH\psi$) of quantum electrodynamics are symmetric with respect to time (T and CPT), but traditional calculations and designs in electronics and electromagnetics impose an observer formalism or causality constraints which assume a gross asymmetry between forwards time and backwards time. In 2008, I published...

Brain-Like Stochastic Search (BLiSS) refers to this task: given a family of utility functions U(u,A), where u is a vector of parameters or task descriptors, maximize or minimize U with respect to u, using networks (Option Nets) which input A and learn to generate good options u stochastically. This paper discusses why this is crucial to brain-like...

Cellular Neural Network (CNN) chips containing a thousand times as many processors as conventional programmable chips can offer a huge improvement in computational throughput, for those applications they are able to address. The artificial neural network (ANN) community has developed new learning designs and topologies, consistent with CNN, which c...

This paper revisits the core issues of space policy from the viewpoint of optimal decision theory. First it argues for a metric: maximizing the probability that humans and their technology in space someday reach what Rostow called the “economic takeoff” point where autonomous growth becomes possible, not bound by the rate of growth on earth. Next i...

Certain key features of brain-like intelligence are essential to fulfill the main goals of policy-makers and environmentalists for the “smart grid” - a key item in the new economic stimulus law, and a key item in a rational strategy for energy sustainability. This paper will explain why and how, and how the neural network community could pl...

This paper presents a theory of how general-purpose learning-based intelligence is achieved in the mammal brain, and how we can replicate it. It reviews four generations of ever more powerful general-purpose learning designs in Adaptive, Approximate Dynamic Programming (ADP), which includes reinforcement learning as a special case. It reviews empir...

The classic “Bell’s Theorem” of Clauser, Holt, Shimony and Horne tells us that we must give up at least one of: (1) objective
reality (aka “hidden variables”); (2) locality; or (3) time-forwards macroscopic statistics (aka “causality”). The orthodox
Copenhagen version of physics gives up the first. The many-worlds theory of Everett and Wheeler give...

This foreword to the special issue on adaptive dynamic programming (ADP) and reinforcement learning in feedback control is written by Paul Werbos, the founder of ADP.

At present, there exists no physically plausible example of a quantum field theory for which the existence of solutions has been proven mathematically. The Clay Mathematics Institute has offered a prize for proving existence for a class of Yang-Mills theories defined by Jaffe and Witten. This paper proposes a multi-stage strategy for proving existe...

This paper addresses the question: how can we minimize the expected time between now and the time when we achieve three measures of sustainability and security together -- independence from oil in cars and trucks, very deep reductions in greenhouse gas emissions and deep reductions in natural gas for electricity? Specific new technologies and metri...

Cellular simultaneous recurrent neural network (SRN) has been shown to be a function approximator more powerful than the multilayer perceptron (MLP). This means that the complexity of MLP would be prohibitively large for some problems while SRN could realize the desired mapping with acceptable computational constraints. The speed of training of com...

The National Science Foundation has identified a new thrust area in Quantum,
Molecular and High Performance Modeling and Simulation for Devices and Systems
(QMHP) in its core program. The main purpose of this thrust area is to capture
scientific opportunities that result from new fundamental cross-cutting
research involving three core research comm...

Many have argued that research on grand unification or local realistic physics will not be truly relevant until it makes predictions verified by experiment, different from the prediction of prior theory (the standard model). This paper proposes a new strategy (and candidate Lagrangians) for such models; that strategy in turn calls for reconsiderati...

Cellular simultaneous recurrent neural networks (SRN) show great promise in solving complex function approximation problems. In particular, approximate dynamic programming is an important application area where SRNs have significant potential advantages compared to other approximation methods. Learning in SRNs, however, proved to be a notoriously d...

Since the 1960s the author has proposed that we could understand and replicate the highest level of intelligence seen in the brain, by building ever more capable and general systems for adaptive dynamic programming (ADP) - like "reinforcement learning" but based on approximating the Bellman equation and allowing the controller to know its utility func...

Since the 1960s I have proposed that we could understand and replicate the highest level of intelligence seen in the brain, by building ever more capable and general systems for adaptive dynamic programming (ADP), which is like reinforcement learning but based on approximating the Bellman equation and allowing the controller to know its utility function...

Mathematical tools related to coherence theory and classical-quantum equivalence, due to Wigner and Glauber, are essential to modern, practical and empirical understanding of electromagnetics in areas like quantum optics and nanoelectronics. This paper specifies how an extension of these same tools (especially Glauber's "Q" mapping) can be applied...

Backwards calculation of derivatives – sometimes called the reverse mode, the full adjoint method, or backpropagation – has
been developed and applied in many fields. This paper reviews several strands of history, advanced capabilities and types
of application – particularly those which are crucial to the development of brain-like capabilities in i...

Cellular simultaneous recurrent neural network has been suggested to be a function approximator more powerful than the MLP's, in particular for solving approximate dynamic programming problems. The 2D maze navigation has been considered as a proof-of-concept task. Present work improves the previous results by training the network with extended Kalm...

[Science refused to send for peer-review]. The term 'Genetics' was not coined until 1905 (G1), but the pre-classical era of Genetics began with Mendel 40 years earlier (1865, G2). The age of classical genetics in the first half of the 20th Century yielded to the Modern era of molecular genetics by discovery of the structure of DNA (1953, G3). Then...

This paper suggests that traditional fermi-bose quantum field theories (QFT) in 3+1-D, like the standard model of physics, may often be exactly equivalent to the limiting case of a family of bosonic QFT (BQFT) which generate soliton solutions and are "finite." They are "finite" in the sense of being well-defined mathematically even without regulari...

Traditional discussions of the Second Law of Thermodynamics studied the limits of very specific types of devices, such as heat engines, chemical reactions, and molecules channeled by valves. Allahverdyan and Nieuwenhuizen (cond-mat/0110422) have come the closest to proving a more general form of the Second Law, applicable to field effect devices --...

This paper proposes an approach to framing and answering fundamental questions about consciousness. It argues that many of the more theoretical debates about consciousness, such as debates about "when does it begin?", are misplaced and meaningless, in part because "consciousness" as a word has many valid and interesting definitions, and in part bec...

This paper summarizes and re-evaluates Prigogine's evolution of thought, from a more classical view of thermodynamics with negative implications for the evolution and persistence of life, through to a far more general and open formulation of thermodynamics, which does not require assumption of a Big Bang. The paper also proposes an encoding scheme...

Quantum Field Theory (QFT) makes predictions by combining two sets of assumptions: (1) quantum dynamics, such as a Schrodinger or Liouville equation; (2) quantum measurement, such as stochastic collapse to an eigenfunction of a measurement operator. A previous paper defined a classical density matrix R encoding the statistical moments of an ensembl...

Quantum Field Theory (QFT) makes predictions by combining assumptions about (1) quantum dynamics, typically a Schrodinger or Liouville equation; (2) quantum measurement, usually via a collapse formalism. Here I define a "classical density matrix" rho to describe ensembles of states of ordinary second-order classical systems (ODE or PDE). I prove th...

It is well known that classical systems governed by ODE or PDE can have extremely complex emergent properties. Many researchers have asked: is it possible that the statistical correlations which emerge over time in classical systems would allow effects as complex as those generated by quantum field theory (QFT)? For example, could parallel computat...

Einstein conjectured long ago that much of quantum mechanics might be derived as a statistical formalism describing the dynamics of classical systems. Bell's Theorem experiments have ruled out complete equivalence between quantum field theory (QFT) and classical field theory (CFT), but an equivalence between dynamics is not only possible but provab...

Existing methods of complexity research are capable of describing certain specifics of biosystems over a given narrow range of parameters but often they cannot account for the initial emergence of complex biological systems, their evolution, state changes and sometimes-abrupt state transitions. Chaos tools have the potential of reaching to the esse...

Existing methods of complexity research are capable of describing certain specifics of bio systems over a given narrow range of parameters but often they cannot account for the initial emergence of complex biological systems, their evolution, state changes and sometimes-abrupt state transitions. Chaos tools have the potential of reaching to the ess...

The first breakthrough success with reconfigurable flight control (RFC) was based on a form of neurodynamic programming. Some RFC simulations depend on highly unrealistic assumptions, like the implicit assumption that an airplane will not change its angle of attack by even one degree after being hit by a missile, or that there is only a small set o...

The classic paper of Clauser et al proved that Bell's Theorem experiments rule out all theories of physics which assume locality, time-forwards causality and the existence of an objective real world. The Backwards-Time Interpretation (BTI) tries to recover realism and locality by permitting backwards time causality. BTI should permit dramatic simpl...

Years ago, many researchers proposed that chaos theory could
provide a kind of universal theory of the qualitative behavior of all
dynamical systems. Thus it could provide a solid mathematical foundation
for unifying efforts to address the three really fundamental questions
of basic science: (1) what is life?; (2) what is mind or intelligence?;
(3)...

Adaptive critic designs (ACD) are sometimes called reinforcement
learning systems, approximate dynamic programming or neurodynamic
programming. Applications to cars and missiles confirm their advantages
over older methods and their ability to overcome the “curse of
dimensionality” in dynamic programming. Some ACDs can be
formulated as new designs f...

High-energy physicists already know that stable attractors (solitons) can exist in 3+1-dimensional conservative Lagrangian systems, so long as the definition of an attractor is based on weak notions of stability and the fields admit topological charge. This paper explores the possibility of attractors in Lagrangian field theories without topologica...

Classical adaptive control proves total-system stability for control of linear plants, but only for plants meeting very restrictive assumptions. Approximate Dynamic Programming (ADP) has the potential, in principle, to ensure stability without such tight restrictions. It also offers nonlinear and neural extensions for optimal control, with empirica...

Many researchers think that neurocontrollers should never be used
in real-world applications until firm, unconditional stability theorems
for them have been established. This paper explains key ideas from the
author's previous paper (1998) which discusses the problem of
“universal stability” (in the linear case) and proposes a
new solution. New for...

Classical adaptive control proves total-system stability for control of
linear plants, but only for plants meeting very restrictive assumptions.
Approximate Dynamic Programming (ADP) has the potential, in principle, to
ensure stability without such tight restrictions. It also offers nonlinear and
neural extensions for optimal control, with empirica...

This paper shows that a new type of artificial neural network (ANN) -- the Simultaneous Recurrent Network (SRN) -- can, if properly trained, solve a difficult function approximation problem which conventional ANNs -- either feedforward or Hebbian -- cannot. This problem, the problem of generalized maze navigation, is typical of problems which arise...

This paper provides mathematical details related to another new paper which suggests: (1) new approaches to the analysis of soliton stability; (2) families of Lagrangian field theories where solitons might possibly exist even without topological charge; (3) alternative approaches to quantizing solitons, with testable nuclear implications. This pape...

This paper briefly summarizes and cites new work which tries to
bridge the gap between advanced neural network designs and some of the
known capabilities of mammalian intelligence. The new design draws
heavily on concepts of hierarchy and temporal chunking from artificial
intelligence, and on relational representation of objects and
space

This paper briefly summarizes and cites work which tries to bridge
the gap between advanced neural network designs and some of the known
capabilities of mammalian intelligence. The new design draws heavily on
concepts of hierarchy and temporal chunking from AI, and on relational
representation of objects and space

Summary form only given. This paper summarizes progress in neurocontrol (neural networks for control), and provides a strategy to fill in the remaining gap between neurocontrol and large scale challenges such as the factory management challenge discussed by Albus and Meystel, with links to neuroscience. Neurocontrol has progressed in three main are...

This paper will show that a new neural network design can solve an example of difficult function approximation problems which are crucial to the field of approximate dynamic programming (ADP). Although conventional neural networks have been proven to approximate smooth functions very well, the use of ADP for problems of intelligent control or planni...

Previous papers have explained why model-based adaptive critic
designs, unlike other designs used in neurocontrol, have the potential to
replicate some of the key, basic aspects of intelligence as seen in the
brain. However, these designs are modular designs, containing
“simple” supervised learning systems as modules. The
intelligence of the overall...

Summary form only given, substantially as follows. Problems of
portfolio management have included several fundamental time-series
problems. Parts of these problems are involved with the inevitable
noisiness of financial data, parts with interactions and mode-locking
among measures, and parts with the basic probabilistic nature of
predictive systems...

This paper defines a more restricted class of designs, to be
called “brain-like intelligent control”. The paper explains
the definition and concepts behind it, describes benefits in control
engineering, emphasizing stability, mentions 4 groups who have
implemented such designs, for the first time, since late 1993, and
discusses the brain as a membe...

In addition to demonstrated, intelligible engineering
functionality, a “brain-like” system should contain at least
three major general-purpose adaptive components: (1) an action or motor
system, capable of outputting effective control signals to the plant or
environment; (2) an “emotional” or “evaluation”
system or “critic”, used to assess the long...

greedy policies based on imperfect value functions. Technical Report NU-CCS-93-14, Northeastern University College of Computer Science, 1993. Tight lower bound on cumulative reward obtained from policies based on value functions where Bellman equation is not satisfied exactly. Ian H. Witten. An adaptive optimal controller for discrete-time markov e...

Tremendous progress and new research horizons have opened up in developing artificial neural network (ANN) systems which use supervised learning as an elementary building block in performing more brain-like, generic tasks such as control or planning or forecasting across time [1, 2]. However, many people believe that research into supervised learni...

This paper will present a skeleton outline of new mathematical tools, based on concepts from quantum field theory (QFT), which could open up a new approach to understanding chaotic solitons and other chaotic modes for classical PDE in four dimensions. It will argue that a further development of these tools could lead to a radical reformulation of p...

Chaotic solitons (“chaoitons”) are essentially just the localized attractors generated by PDE. If truly chaotic chaoitons can exist in simple conservative systems, then the potential implications for physics and biology may be very important. Using a simple test for attractors, I find that they can exist there only in systems which appear capable o...

The author shows how elastic fuzzy logic (EFL) nets make it
possible to combine the capabilities of expert systems with the learning
capabilities of neural networks at a high level. ANN (artificial neural
network) implementations have advantages in terms of hardware
implementation, ease of use, generality, and links to the brain, which
is still the...

## Projects

Projects (10)

To show how the forgotten historical underpinning of physics grants a winding path through every experimental goal-post to give a radically different view of the entire picture of physics that can be expressed as merely a shift in perspective on the entire integrated structure.
To show a single pivot point for revolution within the immensely delicate and interlocked constraints of modern theory.