S.E. Dreyfus

University of California, Berkeley | UCB · Department of Industrial Engineering and Operations Research

PhD

About

113
Publications
45,449
Reads
14,431
Citations
Citations since 2016
2 Research Items
4728 Citations

Publications (113)
Article
Dynamic programming is a mathematical optimization algorithm relied upon by time warping procedures. In general, its use is straightforward and early time warping publications are indeed correct and efficient. However, in sophisticated use of this technique considerable art is required. We discuss here some published sophisticated time warping proc...
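To illustrate the role dynamic programming plays in time warping, here is a minimal sketch of the textbook dynamic-time-warping recursion. It is not taken from the article: the function name `dtw_distance`, the absolute-difference local cost, and the unconstrained step pattern are all assumptions chosen for brevity.

```python
def dtw_distance(a, b):
    """Return the dynamic-time-warping distance between two numeric sequences.

    Assumptions (illustrative, not from the article): local cost is the
    absolute difference, and the allowed steps are match, insert, delete.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Principle of optimality: extend the cheapest predecessor alignment.
            cost[i][j] = local + min(cost[i - 1][j],      # a[i-1] repeats b[j-1]
                                     cost[i][j - 1],      # b[j-1] repeats a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated 2 can be absorbed by warping.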
Article
Full-text available
For solving a sequential decision-making problem in a non-Markovian domain, standard dynamic programming (DP) requires a complete mathematical model; hence, a totally model-based approach. By contrast, this paper describes a totally model-free approach by actor-critic reinforcement learning with recurrent neural networks. The recurrent connections...
Article
We investigate three stochastic problems (linear dynamics quadratic criterion, minimum-cost path, equipment replacement) with time-delayed control dynamics. We show how the concept of "stage lookahead" helps to reduce the number of arguments in the optimal value function of dynamic programming in order to alleviate the so-called curse of dimensiona...
Article
Full-text available
Influenced by recent neuroscientific research, the author proposes that the cognition underlying creativity should be seen as a sequential process requiring the appropriate interspersing of both intuitive and analytical modes of thought. Each of these modes may concern itself with either identifying the information that is the focus of potentially...
Article
This paper presents an experiment, which builds a bridge over the gap between neuroscience and the analysis of economic behaviour. We apply the mathematical theory of Pavlovian conditioning, known as Recurrent Associative Gated Dipole (READ), to analyse ...
Article
Multi-stage feed-forward neural network (NN) learning with sigmoidal-shaped hidden-node functions is implicitly constrained optimization featuring negative curvature. Our analyses on the Hessian matrix H of the sum-squared-error measure highlight the following intriguing findings: At an early stage of learning, H tends to be indefinite and much bet...
Conference Paper
Full-text available
We analyze the Hessian matrix H of the sum-squared-error measure for multilayer-perceptron (MLP) learning, showing the following intriguing results: At an early stage of learning, H is indefinite. The indefiniteness is related to the MLP structure, which also determines the rank of H (per datum). Exploiting negative curvature leads to efficient learnin...
Conference Paper
Full-text available
We present a simple, intuitive argument based on "invariant imbedding" in the spirit of dynamic programming to derive a stagewise second-order backpropagation (BP) algorithm. The method evaluates the Hessian matrix of a general objective function efficiently by exploiting the multistage structure embedded in a given neural-network model such as a m...
Conference Paper
The theory of optimal control is applied to multi-stage (i.e., multiple-layered) neural-network (NN) learning for developing efficient second-order algorithms, expressed in NN notation. In particular, we compare differential dynamic programming, neighboring optimum control, and stagewise Newton methods. Understanding their strengths and weaknesses...
Conference Paper
Full-text available
Recent advances in computer technology allow the implementation of some important methods that were assigned lower priority in the past due to their computational burdens. Second-order backpropagation (BP) is such a method that computes the exact Hessian matrix of a given objective function. We describe two algorithms for feed-forward neural-netwo...
Conference Paper
Full-text available
We describe two stochastic non-Markovian dynamic programming (DP) problems, showing how the posed problems can be attacked by using actor-critic reinforcement learning with recurrent neural networks (RNN). We assume that the current state of a dynamical system is "completely observable", but that the rules, unknown to our decision-making agent, for...
Article
The author proposes a neural-network-based explanation of how a brain might acquire intuitive expertise. The explanation is intended merely to be suggestive and lacks many complexities found in even lower animal brains. Yet significantly, even this simplified brain model is capable of explaining the acquisition of simple skills without developing a...
Article
The following is a summary of the author’s five-stage model of adult skill acquisition, developed in collaboration with Hubert L. Dreyfus. An earlier version of this article appeared in chapter 1 of Mind Over Machine: The Power of Human Intuition and Expertise in the Era of the Computer (1986, Free Press, New York).
Article
We assume that acting ethically is a skill. We then use a phenomenological description of five stages of skill acquisition to argue that an ethics based on principles corresponds to a beginner’s reliance on rules and so is developmentally inferior to an ethics based on expert response that claims that, after long experience, the ethical expert lear...
Conference Paper
Full-text available
We describe how multi-stage non-Markovian decision problems can be solved using actor-critic reinforcement learning by assuming that a discrete version of Cohen-Grossberg node dynamics describes the node-activation computations of a neural network (NN). Our NN is capable of rendering the process Markovian implicitly and automatically in a totally mod...
Article
Full-text available
Many have recommended changing the professional development of physicians. Concluding that further educational process specification was inadequate, the Accreditation Council for Graduate Medical Education (ACGME) decided to specify six general competencies of graduate medical education (GME): patient care; medical knowledge; practice-based learnin...
Article
Full-text available
This paper is an invited contribution to the 50th anniversary issue of the journal Operations Research, published by the Institute for Operations Research and the Management Sciences (INFORMS). It describes one person's perspective on the development of computational ...
Conference Paper
Full-text available
Two two-class classification benchmarks, the parity problem and the two-spiral problem, are very difficult to solve using a standard single-hidden-layer MLP when trained with an incremental gradient method (i.e., pattern-by-pattern-mode steepest-descent-type algorithm), often called backpropagation (BP) algorithm. We show that the learning capacity...
Conference Paper
Full-text available
This paper presents the complexity analysis of a standard supervised MLP-learning algorithm in conjunction with the well-known backpropagation, an efficient method for evaluation of derivatives, in either batch or incremental learning mode. In particular, we detail the cost per epoch (i.e., operations required for processing one sweep of all the tr...
Article
Full-text available
This paper addresses saturation phenomena at hidden nodes during the learning phase of neural networks. The hidden-node saturation tends to cause a "plateau," a region of very little or no change in a graphic representation of the error learning curve. We investigate the saturation phenomena in multilayer perceptrons (MLP) with the well-known neura...
Conference Paper
Full-text available
The well-known backpropagation (BP) derivative computation process for multilayer perceptrons (MLP) learning can be viewed as a simplified version of the Kelley-Bryson gradient formula in the classical discrete-time optimal control theory. We detail the derivation in the spirit of dynamic programming, showing how they can serve to implement more el...
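The connection between backpropagation and the optimal-control gradient can be made concrete with a small sketch. The scalar system below (dynamics `x[t+1] = a*x[t] + b*u[t]`, cost `0.5*sum(u^2) + 0.5*x_T^2`) and all function names are assumptions made for illustration, not the paper's formulation; the backward costate sweep is the same pattern as backpropagating errors through layers.

```python
def rollout(us, x0, a, b):
    """Forward pass: simulate x[t+1] = a*x[t] + b*u[t] (assumed toy dynamics)."""
    xs = [x0]
    for u in us:
        xs.append(a * xs[-1] + b * u)
    return xs


def cost(us, x0, a, b):
    """Assumed toy criterion: control effort plus terminal-state penalty."""
    xs = rollout(us, x0, a, b)
    return 0.5 * sum(u * u for u in us) + 0.5 * xs[-1] ** 2


def gradient(us, x0, a, b):
    """Backward pass: propagate the costate lam = dJ/dx from the final stage."""
    xs = rollout(us, x0, a, b)
    lam = xs[-1]                      # dJ/dx_T for the terminal cost 0.5*x_T^2
    grad = [0.0] * len(us)
    for t in reversed(range(len(us))):
        grad[t] = us[t] + b * lam     # direct effect + effect through x[t+1]
        lam = a * lam                 # costate one stage earlier: dJ/dx_t
    return grad
```

A finite-difference check on `cost` is a quick way to confirm that the backward sweep returns the same derivatives.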
Conference Paper
Full-text available
We describe how an actor-critic reinforcement learning agent in a non-Markovian domain finds an optimal sequence of actions in a totally model-free fashion; that is, the agent neither learns transitional probabilities and associated rewards, nor by how much the state space should be augmented so that the Markov property holds. In particular, we emp...
Article
A customized neural network for sensor fusion of acoustic emission and force in on-line detection of tool wear is developed. Based on two critical concerns regarding practical and reliable tool-wear monitoring systems, the maximal utilization of 'unsupervised' sensor data and the avoidance of off-line feature analysis, the neural network is trained...
Article
Since off-line, hand-crafted conventional statistical feature selection/extraction methods are inefficient for signal-pattern classification of noisy sensor signals in manufacturing floor applications, we provide an automatic neural network approach to feature extraction, called Input Feature Scaling. Given only meaningful examples, the Input Featur...
Chapter
Observers of the political and economic scene note that almost all decisions involve incremental changes from the status quo. Slightly mitigating the ills we have has always seemed preferable to flying to others that we know not of. It now appears, however, that advanced information technology and improved theoretical understanding of social and e...
Article
One's model of skill determines what one expects from neural network modelling and how one proposes to go about enhancing expertise. We view skill acquisition as a progression from acting on the basis of a rough theory of a domain in terms of facts and rules to being able to respond appropriately to the current situation on the basis of neuron conn...
Article
We develop a neural network for on-line tool wear monitoring in metal cutting environments. After levels of tool wear are topologically ordered by the unsupervised Kohonen's Feature Map, input features from Acoustic Emission and force sensor signals are scaled by an additional supervised learning stage using Input Feature Scaling (IFS) algorithm dev...
Article
Phenomenology, in both its transcendental and existential versions, has made immense contributions to metaphysics, epistemology and the philosophy of action and mind. The same cannot be said of its contribution to ethics. With the exception of Sartre, phenomenologists have had little to say about ethics, and what Sartre has said has had little effect on the...
Article
Once all of the cases to be learned in a neural-net mapping problem are concatenated into one large network with a vector output, one component for each case, a standard discrete-time optimal-control problem results. The Kelley-Bryson gradient formulas for such problems have been rediscovered by neural-network researchers and termed back propagatio...
Chapter
This outstanding collection is designed to address the fundamental issues and principles underlying the task of Artificial Intelligence. The editors have selected not only papers now recognized as classics but also many specially commissioned papers which examine the methodological and theoretical foundations of the discipline from a wide variety o...
Conference Paper
Designers of knowledge-based systems assume that skilled human beings, coping with an environment, use heuristic rules to map facts about a situation into responsive actions. Knowledge engineers try to facilitate the articulation of these assumed rules in order to implement them on a digital computer.
Article
Department of Industrial Engineering and Operations Research, University of California, Berkeley, Calif. 94720
Article
In the early 1950s, as calculating machines were coming into their own, a few pioneer thinkers began to realise that digital computers could be more than number-crunchers. At that point two opposed visions of what computers could be, each with its correlated research programme, emerged and struggled for recognition. One faction saw computers as a s...
Article
This paper examines the general epistemological assumptions of artificial intelligence technology, and recent work in the development of expert systems. These systems are limited because of a failure to recognize the real character of expert understanding, which is acquired as the fifth step of a five-step process. A review of the successes and fai...
Article
It is argued that we are in danger of becoming a society that confuses computer-type rationality with true expertise. The authors think that trying to capture more sophisticated skills within the realm of logic (skills involving not only calculation but also judgment) is a dangerously misguided effort and ultimately doomed to failure. They discuss...
Article
Scientists who stand at the forefront of artificial intelligence (AI) have long dreamed of autonomous 'thinking' machines that are free of human control. And now they believe we are not far from realizing that dream. Encouraged by such optimistic pronouncements, the U. S. Department of Defense (DOD) is investing millions of dollars into developing...
Conference Paper
In this paper we present a type inference method for Prolog programs. The new idea is to describe a superset of the success set by associating a type substitution (an assignment of sets of ground terms to variables) with each head of definite clause. ...
Chapter
Most mathematical models in management science and symbol-manipulating programs in artificial intelligence attempt to describe the relevant problematic world in terms of facts, decisions or actions taken in the present and often also in the future, and relationships specifying how facts and decisions combine to generate new facts. Alternative decis...
Article
Careful observation of the skill acquisition process in management, policy analysis and elsewhere shows a progression from analytical understanding in terms of decomposed parts towards holistic understanding based on intuitively perceived similarity with previously experienced situations. Since experts rarely think in terms of decomposed elements o...
Article
Full-text available
Possible limitations on the successful formal modeling of human expertise can only be identified if the evolving thought processes involved in acquiring expertise are understood. This paper presents a 5-stage description of the human skill-acquisition process, applies it to the skill of business management, and draws conclusions about potential use...
Article
Full-text available
In acquiring a skill by means of instruction and experience, the student normally passes through five developmental stages which we designate novice, competence, proficiency, expertise and mastery. We argue, based on analysis of careful descriptions of skill acquisition, that as the student becomes skilled, he depends less on abstract principles an...
Article
Full-text available
Three models of skill acquisition are proposed: (1) Nonsituational, (2) Intermediate, and (3) Situational. It is argued that only the third can account for highly skilled performance. The type of emergency training program each suggests and the level of pilot performance that each can be expected to produce is then investigated. We conclude that on...
Chapter
This paper argues that for various behavioral and ethical reasons decision analysis does not, and in some cases, cannot, treat all the considerations that enter a human decision. Those considerations that cannot be treated are holistic in nature, and do not allow the decomposition (‘divide and conquer’)([5] p. 271) which makes decision analysis, an...
Article
A gradient computational procedure is developed for discrete-time optimal control problems in which the current cost and rule governing state transitions depend on the entire past history of states and decisions. A condition necessary for optimality is also deduced in two ways and given physical interpretation.
Article
An optimal control problem with linear dynamics and quadratic criterion is imbedded in a family of problems characterized by both initial and terminal points. The optimal value function is jointly quadratic in initial and terminal points, and the optimal control is jointly linear. Recursive formulas for the coefficients of these functions are devel...
Chapter
This chapter presents the main results of optimal control theory. A very special case of the optimal control problem requires the determination of the continuous and piecewise continuously differentiable function x(t), which yields the minimum value of J. A solution x(t) satisfies the Euler–Lagrange second-order ordinary nonlinear differential equ...
Article
To computationally solve an adaptive optimal control problem by means of conventional dynamic programming, a backward recursion must be used with an optimal value and optimal control determined for all conceivable prior information patterns and prior control histories. Consequently, almost all problems are beyond the capability of even large comput...
Article
An algorithm for solving the Steiner problem on a finite undirected graph is presented. This algorithm computes the set of graph arcs of minimum total length needed to connect a specified set of k graph nodes. If the entire graph contains n nodes, the algorithm requires time proportional to $n^3/2 + n^2 (2^{k-1} - k - 1) + n (3^{k-1} - 2^k + 3)/2$. The time req...
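To get a feel for how this running-time expression grows, the sketch below evaluates it for given n and k. It assumes the flattened digits in the listing are exponents, i.e. n^3/2 + n^2(2^(k-1) - k - 1) + n(3^(k-1) - 2^k + 3)/2; the function name `steiner_time_bound` is hypothetical.

```python
def steiner_time_bound(n, k):
    """Evaluate the stated running-time expression for n nodes and k terminals.

    Assumption: superscripts were lost in extraction, so the expression is
    read as n^3/2 + n^2*(2^(k-1) - k - 1) + n*(3^(k-1) - 2^k + 3)/2.
    """
    return (n ** 3 / 2
            + n ** 2 * (2 ** (k - 1) - k - 1)
            + n * (3 ** (k - 1) - 2 ** k + 3) / 2)


# Growth is polynomial in n but exponential in the number of terminals k.
for k in (3, 6, 9):
    print(k, steiner_time_bound(100, k))
```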
Article
Full-text available
This paper treats five discrete shortest-path problems: (1) determining the shortest path between two specified nodes of a network; (2) determining the shortest paths between all pairs of nodes of a network; (3) determining the second, third, etc., shortest path; (4) determining the fastest path through a network with travel times depending on the...
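As a concrete instance of problem (1), the sketch below finds the shortest-path length between two specified nodes with Dijkstra's algorithm and a binary heap. This is one standard approach for nonnegative arc lengths, not necessarily the procedure the paper recommends; the adjacency-dictionary format and the function name are assumptions.

```python
import heapq


def shortest_path_length(graph, source, target):
    """Return the shortest distance from source to target, or None if unreachable.

    Assumed input: graph maps each node to a list of (neighbor, length) pairs
    with nonnegative lengths.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter route to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return None


example = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(shortest_path_length(example, "a", "c"))
```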
Article
The conventional dynamic programming method for analytically solving a variational problem requires the determination of a particular solution, the optimal value function or return function, of the fundamental partial differential equation. Associated with it is another function, the optimal policy function. At each point, this function yields the...
Article
This short paper surveys three aspects of dynamic programming. The first part concerns the salient features of the dynamic programming approach to a problem and relates the formalism to more classical techniques. The second section discusses one area of current dynamic programming research, the mathematical rigorization of the results, and the exte...
Article
The conventional dynamic programming method for analytically solving a variational problem requires the determination of a particular solution function of the fundamental partial differential equation. This solution function is called the optimal value (or return) function. Associated with it is another function, called the optimal policy function, w...
Article
The optimal control of stochastic systems is considered. Under various assumptions concerning the information available to the controller, different optimal control rules result. For certain specific problems, the different control schemes are analyzed and compared, and the vast superiority of feedback over open-loop control is demonstrated.
Article
The necessary conditions are presented for an extremal solution to a programming problem with an inequality constraint on a function of the control and/or the state variables. It is shown that, in general, certain terms must be added to the Euler-Lagrange equations during intervals in which the solution curve lies on the boundary. Furthermore, for...
Article
The problem of optimal control, in which a constraint is placed on the state variables, is considered as earlier studied separately by Gamkrelidze, Berkovitz, and Dreyfus. The three have studied the problem from different viewpoints: Gamkrelidze accounts for the constraint by modifying Pontryagin's maximum principle arguments; Berkovitz applies dire...
Article
A theory developing a linear feedback guidance scheme to correct for inflight disturbances of a vehicle during the course of a space mission is presented. The theory is predicated on the use of a nominal optimal trajectory. The scheme consists of a linear combination of: (1) perturbations of the vehicle state from its nominal state; and (2) time-va...
