David E. Rumelhart's research while affiliated with Stanford University and other places

Publications (104)

Article
Fourteen rhesus monkeys and two human Os were trained to discriminate between identical blocks of wood placed 13 in apart, using cues that were provided by a pointer that was placed at random in positions spaced 1.0 in apart between the manipulanda. Monkeys made increasingly more errors as a function of increasing distance between the manipulandum...
Article
Multiple simultaneous constraints; parallel distributed processing (PDP); examples of PDP models; representation and learning in PDP models; origins of parallel distributed processing. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Article
We discuss the development of a neural network for facial expression recognition. It aims at recognizing and interpreting facial expressions in terms of signaled emotions and level of expressiveness. We use the backpropagation algorithm to train the system to differentiate between facial expressions. We show how the network generalizes to new faces...
Article
In this paper we present a hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) speaker-independent continuous-speech recognition system, in which the advantages of both approaches are combined by using MLPs to estimate the state-dependent observation probabilities of an HMM. New MLP architectures and training procedures are presented w...
Article
We present a speaker-independent, continuous-speech recognition system based on a hybrid multilayer perceptron (MLP)/hidden Markov model (HMM). The system combines the advantages of both approaches by using MLPs to estimate the state-dependent observation probabilities of an HMM. New MLP architectures and training procedures are presented that...
Article
Earlier hybrid multilayer perceptron (MLP)/hidden Markov model (HMM) continuous speech recognition systems have not modeled context-dependent phonetic effects, sequences of distributions for phonetic models, or gender-based speech consistencies. In this paper we present a new MLP architecture and training procedure for modeling context-depende...
Article
In this paper we present a training method and a network architecture for estimating context-dependent observation probabilities in the framework of a hybrid hidden Markov model (HMM) / multi layer perceptron (MLP) speaker-independent continuous speech recognition system. The context-dependent modeling approach we present here computes the HMM conte...
Article
We describe a technique for mapping out human somatosensory cortex using functional magnetic resonance imaging (fMRI). To produce cortical activation, a pneumatic apparatus presented subjects with a periodic series of air puffs in which a sliding window of five locations moved along the ventral surface of the left arm in a proximal-to-distal or dis...
Article
Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the "teacher" in supervised learning can be solved by judicious use of learned internal mod...
Article
morphic Extensions to the Relational Model. PhD dissertation, The University of Iowa, Dept. of Computer Science, August 1989. D. Eichmann. A hybrid approach to software repository retrieval: Blending faceted classification and type signatures. In Third International Conference of Software Engineering and Knowledge Engineering, pages 236-240, Skokie...
Article
In this paper we present a training method and a network architecture for estimating context-dependent observation probabilities in the framework of a hybrid hidden Markov model (HMM)/multi layer perceptron (MLP) speaker-independent continuous speech recognition system. The context-dependent modeling approach we present here computes the HMM contex...
Article
Interest in the study of neural networks has grown remarkably in the last several years. This effort has been characterized in a variety of ways: as the study of brain-style computation, connectionist architectures, parallel distributed-processing systems, neuromorphic computation, artificial neural systems. The common theme to these efforts has be...
Article
Just four years ago, the only widely reported commercial application of neural network technology outside the financial industry was the airport baggage explosive detection system developed at Science Applications International Corporation (SAIC). Since that time scores of industrial and commercial applications have come into use, but the details o...
Article
An optimal control theory of story comprehension and recall is proposed within the framework of a “situation”‐state space. A point in situation‐state space is specified by a collection of propositions, each of which can have the values of either “present” or “absent.” A trajectory in situation‐state space is a temporally ordered sequence of situati...
Article
We present a neural network algorithm that simultaneously performs segmentation and recognition of input patterns that self-organizes to detect input pattern locations and pattern boundaries. We outline the algorithm and demonstrate this neural network architecture and algorithm on character recognition using the NIST database and report results he...
Article
Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the "teacher" in supervised learning can be solved by judicious use of learned internal mode...
Conference Paper
Internal models of the environment have an important role to play in adaptive systems in general and are of particular importance for the supervised learning paradigm. In this paper we demonstrate that certain classical problems associated with the notion of the “teacher” in supervised learning can be solved by judicious use of learned internal mod...
Conference Paper
The authors show how the effective number of parameters changes during backpropagation training by analyzing the eigenvalue spectra of the covariance matrix of hidden unit activations and of the matrix of weights between inputs and hidden units. They use the standard example of time series prediction of the sunspot series. The effective ranks of th...
Conference Paper
Inspired by the information theoretic idea of minimum description length, the authors add a term to the backpropagation cost function that penalizes network complexity. The authors give the details of the procedure, called weight-elimination, describe its dynamics, and clarify the meaning of the parameters involved. From a Bayes perspective, the co...
Article
Inspired by the information theoretic idea of minimum description length, we add a term to the usual back-propagation cost function that penalizes network complexity. From a Bayesian perspective, the complexity term can be usefully interpreted as an assumption about prior distribution of the weights. This method, called weight-elimination, is contr...
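The abstract above does not spell out the complexity term. As a rough illustration only, the sketch below uses the form commonly associated with weight-elimination, a sum over weights of (w/w0)^2 / (1 + (w/w0)^2) scaled by a factor; the names `w0`, `lam`, `weight_elimination_penalty`, and `penalized_cost` are assumptions made for this example and are not taken from the listing.

```python
def weight_elimination_penalty(weights, w0=1.0, lam=1.0):
    """Complexity term: lam * sum((w/w0)^2 / (1 + (w/w0)^2)).

    Assumed form; w0 (weight scale) and lam (penalty strength) are
    illustrative parameter names.
    """
    if w0 <= 0:
        raise ValueError("w0 must be positive")
    total = 0.0
    for w in weights:
        ratio_sq = (w / w0) ** 2
        # Each weight contributes a value between 0 (w = 0) and 1 (|w| >> w0).
        total += ratio_sq / (1.0 + ratio_sq)
    return lam * total


def penalized_cost(targets, outputs, weights, w0=1.0, lam=1.0):
    """Sum squared error plus the complexity penalty."""
    sse = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    return sse + weight_elimination_penalty(weights, w0, lam)
```

Under this assumed form, weights much smaller than `w0` are penalized roughly quadratically, while weights much larger than `w0` each cost close to `lam`, so the term behaves approximately like a count of the large weights.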
Conference Paper
We present a neural network algorithm that simultaneously performs segmentation and recognition of input patterns that self-organizes to detect input pattern locations and pattern boundaries. We outline the algorithm and demonstrate this neural network architecture and algorithm on character recognition using the NIST database and report results he...
Article
We have designed a feed-forward neural network to classify low-resolution mass spectra of unknown compounds according to the presence or absence of 100 organic substructures. The neural network, MSnet, was trained to compute a maximum-likelihood estimate of the probability that each substructure is present. We discuss some design considerations and...
Article
Keywords: mass spectral classification; structure elucidation; neural networks; back propagation. We have designed a feed-forward neural network to classify low-resolution mass spectra of unknown compounds according to the presence or absence of 100 organic substructures. The neural network, MSnet, was trained to compute a maximum-likelihood estimate of the p...
Article
We investigate the effectiveness of connectionist architectures for predicting the future behavior of nonlinear dynamical systems. We focus on real-world time series of limited record length. Two examples are analyzed: the benchmark sunspot series and chaotic data from a computational ecosystem. The problem of overfitting, particularly serious for...
Chapter
This chapter reviews and examines a variant type of computational unit which we have recently proposed for use in multi-layer neural networks [3]. Instead of the output of this unit depending on a weighted sum of the inputs, it depends on a weighted product. In justifying the introduction of a new type of unit we explore at some length the rational...
Article
We introduce a new form of computational unit for feedforward learning networks of the backpropagation type. Instead of calculating a weighted sum this unit calculates a weighted product, where each input is raised to a power determined by a variable weight. Such a unit can learn an arbitrary polynomial term, which would then feed into higher level...
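The weighted product described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `product_unit` is hypothetical, and the restriction to strictly positive inputs is an assumption made here so that non-integer powers stay real-valued.

```python
def product_unit(inputs, weights):
    """Weighted product: each input is raised to the power given by its weight."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    result = 1.0
    for x, w in zip(inputs, weights):
        if x <= 0:
            # Assumption for this sketch: only positive inputs are handled.
            raise ValueError("this sketch handles strictly positive inputs only")
        result *= x ** w
    return result


# With weights (2, 1) the unit computes the polynomial term x1^2 * x2.
term = product_unit([2.0, 3.0], [2.0, 1.0])
```

Because the exponents are ordinary weights, a learning procedure that adjusts them can, in principle, move the unit toward whichever polynomial term fits the data.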
Article
This article presents a simulation-based tutorial system for exploring parallel distributed processing (PDP) models of information processing. The system consists of software and an accompanying handbook. The intent of the package is to make the ideas underlying PDP accessible and to disseminate some of the main simulation programs that we have dev...
Article
We describe a new learning procedure, back-propagation, for networks of neurone-like units. The procedure repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector. As a result of the weight adjustments, internal 'hidden' u...
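The weight-adjustment loop in this abstract can be illustrated with a very small network. The sketch below assumes details the abstract does not give: one hidden layer, logistic (sigmoid) units, a single output, the error measure 0.5 * (target - output)^2, and a fixed learning rate `lr`. All function and variable names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Return hidden activations and the scalar output for input vector x.

    Each weight row holds one weight per incoming unit plus a trailing bias.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1])
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One gradient-descent update; returns the error measured before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit (sigmoid derivative is y * (1 - y)).
    delta_out = (target - output) * output * (1.0 - output)
    # Propagate the error signal back to each hidden unit through w_out.
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
    # Adjust hidden-to-output weights, then the output bias.
    for j, h in enumerate(hidden):
        w_out[j] += lr * delta_out * h
    w_out[-1] += lr * delta_out
    # Adjust input-to-hidden weights, then each hidden bias.
    for j, row in enumerate(w_hidden):
        for i, xi in enumerate(x):
            row[i] += lr * delta_hidden[j] * xi
        row[-1] += lr * delta_hidden[j]
    return 0.5 * (target - output) ** 2


if __name__ == "__main__":
    # Arbitrary starting weights for a 2-2-1 network (illustrative values).
    w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2]]
    w_out = [0.7, -0.5, 0.05]
    for _ in range(200):
        error = train_step([1.0, 0.0], 1.0, w_hidden, w_out)
    print(error)
```

Repeating `train_step` over a set of input/target pairs is the "repeatedly adjusts the weights" part of the description; the hidden-unit deltas are what let the internal units take part in reducing the output error.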
Chapter
This paper presents a generalization of the perceptron learning procedure for learning the correct sets of connections for arbitrary networks. The rule, called the generalized delta rule, is a simple scheme for implementing a gradient descent method for finding weights that minimize the sum squared error of the system's performance. The major theore...
Article
We describe a distributed model of information processing and memory and apply it to the representation of general and specific information. The model consists of a large number of simple processing elements which send excitatory and inhibitory signals to each other via modifiable connections. Information processing is thought of as the process whe...
Article
Responds to D. Broadbent's (see record 1986-08237-001) comments on the present 2nd and 1st authors' (see record 1986-08244-001) article on distributed memory. Broadbent concedes that the present authors are probably correct in supposing that memory representations are distributed but argues that psychological evidence is irrelevant to the present...
Article
Responds to D. Broadbent's (see record 1986-08237-001) comments on the present 2nd and 1st authors' (see record 1986-08244-001) article on distributed memory. Broadbent concedes that the present authors are probably correct in supposing that memory representations are distributed but argues that psychological evidence is irrelevant to the present...
Article
This paper reports the results of our studies with an unsupervised learning paradigm which we have called “Competitive Learning.” We have examined competitive learning using both computer simulation and formal analysis and have found that when it is applied to parallel networks of neuron-like elements, many potentially useful learning tasks can be...
Chapter
A common terminology is essential when working in any area, and the study of typing is no exception. To aid ourselves and others, we have compiled a glossary of basic definitions useful in the description of the phenomena of typing. The glossary, which also contains a categorization of errors, has proved useful in several ways. Not only does it kee...
Article
The study of typing comprises a fascinating mixture of elements from motor skills, typewriter mechanics, anatomy, and cognitive control structures. Our research group initially started to study typing because it seemed an ideal example of highly skilled performance, with readily available experimental subjects and, with the advent of computer-contr...
Article
We review the major phenomena of skilled typing and propose a model for the control of the hands and fingers during typing. The model is based upon an Activation-Trigger-Schema system in which a hierarchical structure of schemata directs the selection of the letters to be typed and, then, controls the hand and finger movements by a cooperative, rel...
Article
The interactive activation model of context effects in letter perception is reviewed, elaborated, and tested. According to the model, context aids the perception of target letters as they are processed in the perceptual system. The implication that the duration and timing of the context in which a letter occurs should greatly influence the percepti...
Article
Describes a model in which perception results from excitatory and inhibitory interactions of detectors for visual features, letters, and words. A visual input excites detectors for visual features in the display and for letters consistent with the active features. Letter detectors in turn excite detectors for consistent words. It is suggested that...
Article
This report is the first part of a two-part series introducing an interactive activation model of context effects in perception. In this part, the model is developed for the perception of letters in words and other contexts and is applied to a number of experiments in the recent literature. The model is used to account for the perceptual advantage for...
Article
Learning is not a simple unitary process. This paper identifies three qualitatively different phases of the learning process. In one phase, the learner acquires facts and information, accumulating more structures onto the already existing knowledge structures. This phase of learning is adequate only when the material being learned is part of a prev...
Article
Describes development of a model for the recognition of tachistoscopically presented words. It is a "sophisticated guessing" model which takes explicit account of the geometry of the characters which make up the words or letter strings. Explicit attempts are made to account for word frequency effects, effects due to letter transition probabilities,...
Article
A theory of analogical reasoning is proposed in which the elements of a set of concepts, e.g., animals, are represented as points in a multidimensional Euclidean space. Four elements A,B,C,D, are in an analogical relationship A:B::C:D if the vector distance from A to B is the same as that from C to D. Given three elements A,B,C, an ideal solution p...
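The geometric relationship in this abstract lends itself to a short worked example. If A:B::C:D holds when the vector from A to B equals the vector from C to D, the ideal solution point for given A, B, C is C + (B - A). The sketch below computes that point; picking the nearest available candidate is an assumption added here for illustration (the abstract is cut off before it describes how a response is chosen), and the function names are hypothetical.

```python
import math


def ideal_point(a, b, c):
    """Point I such that the vector from C to I equals the vector from A to B."""
    return [ci + (bi - ai) for ai, bi, ci in zip(a, b, c)]


def solve_analogy(a, b, c, candidates):
    """Return the name of the candidate lying closest to the ideal point.

    candidates maps a name to its coordinates. Nearest-point selection is an
    illustrative assumption, not a rule stated in the abstract.
    """
    target = ideal_point(a, b, c)
    return min(candidates, key=lambda name: math.dist(candidates[name], target))


# Toy 2-D coordinates, invented for the example.
best = solve_analogy([0.0, 0.0], [1.0, 2.0], [3.0, 3.0],
                     {"near": [4.1, 5.0], "far": [0.0, 0.0]})
```

Here the ideal point is (4, 5), so the candidate at (4.1, 5.0) is selected.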
Conference Paper
Additive AND/OR graphs are defined as AND/OR graphs without circuits, which can be considered as folded AND/OR trees; i.e., the cost of a common subproblem is added to the cost as many times as the subproblem occurs, but it is computed only once. Additive ...
Article
Describes methods for determining sidedness and eye dominance in infants under 12 wk. of age, in 2-5 yr. olds, and in Ss over 5 yr. of age. The effects of imitation on developing left or right handedness are discussed. Research is noted which indicates the deleterious effects of crossed dominance. It is suggested that those children and adults who a...

Citations

... The self-organizing map (SOM), introduced by Kohonen [85], is an unsupervised machine learning method that performs an ordered mapping of the input data into a lower-dimensional space. Essentially, the SOM is an artificial neural network (ANN) that is trained through a competitive learning framework, i.e., the ANN nodes compete with each other for the right to "respond" to the input data [86]. ...
... After a fixed number of time steps, an activation-weighted sum of all memories is added back to the cell state of the LSTM. regularities can be viewed as an implementation of semantic memory (McClelland and Rumelhart, 1987;McClelland and Rogers, 2003;Rogers and McClelland, 2004;Saxe et al., 2019). ...
... Automated and accurate classification of objects into stars and galaxies from optical (and near infrared) imaging data is an issue of considerable interest. Artificial neural network based approaches to the star galaxy classification problem include SOM (Miller & Coe 1996), decision tree induction (Weir, Fayyad & Djorgovski 1995) and back propagation, which is the basis for SExtractor, a widely used tool for star-galaxy separation (Bertin & Arnouts 1996). One of the drawbacks of classification tools such as SExtractor that employ back propagation is that it is difficult to modify them for specific needs. ...
... One of the most widely used models in the literature is the Bilingual Interactive Activation Model, BIA (van Heuven, Dijkstra, & Grainger 1998), which was developed on the basis of the interactive model of word processing (Rumelhart & McClelland 1981). This model assumes a lexicon shared by the two languages, an assumption that carries over to the reformulated model, the Bilingual Interactive Activation Plus Model, BIA+ (Dijkstra & van Heuven 2002). ...
... As mentioned previously, in deep learning, the training process aims to minimize the cost function of the neural network by changing the values of its parameters, i.e. the weights and biases. For this purpose, Gradient Descent is known as one of the most popular optimization algorithms (Rumelhart et al., 1986). This technique consists of two steps that are performed iteratively through the training dataset. ...
... As one kind of intelligent optimization algorithm, the BP multi-layer feed-forward ANN algorithm was first proposed by D.E. Rumelhart [31]. In complicated systems with several effective input parameters, ANN can be used to predict output data. ...
... Some methods propose adding terms to the cost function that take into account the sum of the squared connection weights (Σ w_ij^2), the sum of the absolute values of the weights (Σ |w_ij|), or a logarithmic function log(1 + w^2). Weigend [4] proposed a penalty term of the form: ...