Connection Science

Published by Taylor & Francis
Online ISSN: 1360-0494
Print ISSN: 0954-0091
We measured turn-taking in terms of hand and head movements and asked whether the global rhythm of the participants' body activity relates to word learning. Six dyads composed of parents and toddlers (M = 18 months) interacted in a tabletop task wearing motion-tracking sensors on their hands and heads. Parents were instructed to teach the labels of 10 novel objects, and the child was later tested on a name-comprehension task. Using dynamic time warping, we compared the motion data of all body-part pairs, within and between partners. For every dyad, we also computed an overall measure of the quality of the interaction that takes into account the state of the interaction when the parent uttered an object label and the overall smoothness of the turn-taking. This overall interaction-quality measure was correlated with the total number of words learned. In particular, head movements were inversely related to the other partner's hand movements, and the degree of bodily coupling between parent and toddler predicted the words that children learned during the interaction. The implications of joint body dynamics for understanding the joint coordination of activity in a social interaction, its scaffolding effect on the child's learning, and its use in the development of artificial systems are discussed.
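The dynamic time warping comparison described above can be sketched as follows. This is a generic textbook DTW on toy one-dimensional signals, not the authors' actual motion-capture pipeline; the signal names and lengths are purely illustrative.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D movement signals."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Two toy "movement" traces with the same shape, one lagged (hypothetical data).
parent_hand = np.sin(np.linspace(0, 2 * np.pi, 50))
child_head = np.sin(np.linspace(0, 2 * np.pi, 50) - 0.5)
print(dtw_distance(parent_hand, child_head))  # small despite the lag
```

Because warping can absorb the lag, the DTW distance stays below the plain point-by-point difference between the two traces.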
Figure captions: inputs encoded into an N-layer stack as incremental values of the refractive indices; test results on 30 novel examples, λ = 5.0 µm.
Current work on connectionist models has focused largely on artificial neural networks that are inspired by the networks of biological neurons in the human brain. However, there are also other connectionist architectures that differ significantly from this biological exemplar. Li and Purvis (1999) proposed a connectionist learning architecture inspired by the physics associated with optical coatings of multiple layers of thin films. The proposed model differs significantly from the widely used neuron-inspired models. With thin-film layer thicknesses serving as adjustable parameters (as compared with connection weights in a neural network) for the learning system, the optical thin-film multilayer model (OTFM) is capable of approximating virtually any kind of highly nonlinear mapping. We focus on a detailed comparison of a typical neural network model and the OTFM. We describe the architecture of the OTFM and show how it can be viewed as a connectionist learning model. We then present the experimental results of using the OTFM to solve a classification problem typical of conventional connectionist architectures.
Classical metric and non-metric multidimensional scaling (MDS) variants are widely known manifold learning (ML) methods which enable the construction of low-dimensional representations (projections) of high-dimensional data inputs. However, their use is crucially limited to cases in which the data are inherently reducible to low dimensionality. In general, the drawbacks and limitations of these and related MDS variants become more apparent when the exploration (learning) is exposed to structured data of high intrinsic dimension. As we demonstrate on artificial and real-world datasets, the over-determination problem can be solved by means of hybrid, multi-component discrete-continuous multi-modal optimization heuristics. A remarkable feature of the approach is that projections onto 2D are constructed simultaneously with the data categorization (classification), compensating in part for the loss of original input information. We observed that the optimization module integrated with ML modeling, metric learning and categorization leads to a nontrivial mechanism resulting in the generation of patterns of categorical variables which can be interpreted as a heuristic charting. The method provides visual information in the form of non-convex clusters or separated regions. Furthermore, the ability to categorize the surfaces into back and front parts of the analyzed 3D data objects has been attained through self-organized structuring without supervision.
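For reference, the classical (Torgerson) MDS projection that these variants build on can be computed in a few lines. This is a generic sketch of the baseline method only, not the paper's hybrid discrete-continuous optimizer; the data are synthetic.

```python
import numpy as np

def classical_mds(X, k=2):
    """Project the rows of X to k dimensions via classical (Torgerson) MDS."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n                  # centering matrix
    B = -0.5 * J @ sq @ J                                # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                             # eigenvalues in ascending order
    idx = np.argsort(w)[::-1][:k]                        # keep the k largest
    return V[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
Y = classical_mds(X, k=2)
print(Y.shape)  # (20, 2)
```

When the data truly lie in a k-dimensional subspace, this projection preserves all pairwise distances exactly; the over-determination problem the abstract describes arises precisely when they do not.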
Two general information encoding techniques called relative position encoding and pattern similarity association are presented. They are claimed to be a convenient basis for the connectionist implementation of complex, short term information processing of the sort needed in common sense reasoning, semantic/pragmatic interpretation of natural language utterances, and other types of high level cognitive processing. The relationships of the techniques to other connectionist information-structuring methods, and also to methods used in computers, are discussed in detail. The rich inter-relationships of these other connectionist and computer methods are also clarified. The particular, simple forms are discussed that the relative position encoding and pattern similarity association techniques take in the author's own connectionist system, called Conposit, in order to clarify some issues and to provide evidence that the techniques are indeed useful in practice.
A neural-network ensemble is a very successful technique in which the outputs of a set of separately trained neural networks are combined to form one unified prediction. An effective ensemble should consist of a set of networks that are not only highly accurate, but that make their errors on different parts of the input space as well; however, most existing techniques only indirectly address the problem of creating such a set. We present an algorithm called Addemup that uses genetic algorithms to search explicitly for a highly diverse set of accurate trained networks. Addemup works by first creating an initial population, then using genetic operators to continually create new networks, keeping the set of networks that are highly accurate while disagreeing with each other as much as possible. Experiments on four real-world domains show that Addemup is able to generate a set of trained networks that is more accurate than several existing ensemble approaches. Experiments also show that Ad...
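Addemup's guiding idea, scoring each member by accuracy plus a weighted diversity term, can be illustrated with a toy fitness function. The binary prediction encoding and the exact diversity measure (squared deviation from the ensemble output) are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def fitness(member_preds, targets, lam=0.5):
    """Toy Addemup-style score: per-member accuracy plus weighted disagreement.

    member_preds: (n_members, n_examples) array of 0/1 predictions (hypothetical).
    """
    acc = (member_preds == targets).mean(axis=1)   # per-member accuracy
    ens = member_preds.mean(axis=0)                # ensemble (vote-rate) output
    # Diversity of member i: mean squared deviation from the ensemble output.
    div = ((member_preds - ens) ** 2).mean(axis=1)
    return acc + lam * div

preds = np.array([[1, 1, 0, 0],
                  [1, 0, 1, 0],
                  [1, 1, 1, 1]])
targets = np.array([1, 1, 0, 0])
print(fitness(preds, targets))
```

A genetic search can then keep the members whose combined score is highest, trading accuracy against disagreement via the `lam` weight.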
Set-up of section 5: square arena of 78 by 78 cm with 4 robots (Webots simulation, left; physical set-up, right). There are 4 objects, represented as square patches. Notes: (2) all sensors used in the simulations exist in miniature and could be used for the real Khepera robots; (3) for a wide discussion of odometry error for a similar system, see [14, 15].
This paper presents an experiment in collective robotics which investigates the influence of communication, of learning and of the number of robots in a specific task, namely learning the topography of an environment whose features change frequently. We propose a theoretical framework based on probabilistic modeling to describe the system's dynamics. The adaptive multi-robot system and its dynamic environment are modeled through a set of probabilistic equations which give an explicit description of the influence of the different variables of the system on the data-collecting performance of the group. Further, we implement the multi-robot system in experiments with a group of Khepera robots and in simulation using Webots, a 3-D simulator of Khepera robots. The robots are controlled by a distributed architecture with an associative-memory type of learning algorithm. Results show that the algorithm allows a group of robots to keep an up-to-date account of the environmental...
In the US Navy, at the end of each sailor's tour of duty, he or she is assigned to a new job. The Navy employs some 280 people, called detailers, full time to effect these new assignments. The IDA (Intelligent Distribution Agent) prototype was designed and built to automate, in a cognitively plausible manner, the job of the human detailers. That model is being redesigned to function as a multi-agent system. This is not a trivial matter due to the fact that there would need to be approximately 350,000 individual agents. There are also many issues relating to how the agents interact and how all entities involved, including humans, exercise their autonomy. This paper describes both the IDA prototype and the MultiAgent IDA system being created from it. We will also discuss several of the major issues regarding the design, interaction, and autonomy of the various agents involved.
Any nonassociative reinforcement learning algorithm can be viewed as a method for performing function optimization through (possibly noise-corrupted) sampling of function values. We describe the results of simulations in which the optima of several deterministic functions studied by Ackley (1987) were sought using variants of REINFORCE algorithms (Williams, 1987; 1988). Some of the algorithms used here incorporated additional heuristic features resembling certain aspects of some of the algorithms used in Ackley's studies. Differing levels of performance were achieved by the various algorithms investigated, but a number of them performed at a level comparable to the best found in Ackley's studies on a number of the tasks, in spite of their simplicity. One of these variants, called REINFORCE/MENT, represents a novel but principled approach to reinforcement learning in nontrivial networks which incorporates an entropy maximization strategy. This was found to perform especially well on more hierarchically organized tasks.
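A minimal REINFORCE loop of the kind the paper studies can be sketched for function optimization. The task here ("one-max", maximizing the number of ones) and the running reward baseline are illustrative choices standing in for Ackley's test functions and the paper's specific algorithm variants.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
theta = np.zeros(n)          # logits of n Bernoulli output units
alpha, baseline = 0.1, 0.0

def f(y):
    """Function to maximize: the number of ones ("one-max")."""
    return y.sum()

for step in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))
    y = (rng.random(n) < p).astype(float)      # sample stochastic unit outputs
    r = f(y)
    # REINFORCE update: alpha * (r - baseline) * d log Pr(y|theta) / d theta,
    # which for Bernoulli-logistic units is (y - p).
    theta += alpha * (r - baseline) * (y - p)
    baseline += 0.05 * (r - baseline)          # reinforcement-comparison baseline

print((1.0 / (1.0 + np.exp(-theta))).round(2))
```

With the baseline subtracted, units that fired when reward was above average are pushed toward firing again, so the output probabilities drift toward the optimum of `f`.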
Presents an alternative concept and understanding of knowledge representation in neural networks based on the assumption that (natural or artificial) neural structures are responsible for the generation of an organism's behavior, which is in interaction with its environment. The concepts of constructivism, 2nd-order cybernetics, embodiment of knowledge, and functional fitness play an important role in this context. The idea of a structural isomorphism between the environment and representing structures will be given up in favor of a more sophisticated epistemological concept and constructive relation. As an implication, knowledge becomes system relative and private. An alternative understanding of language, symbols, and communication, which is based on these epistemological and neuroscientific ideas, is discussed. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Discusses a connectionist implementation of knowledge engineering concepts and concepts related to production systems in particular. An architecture of a neural production system (NPS) and its 3rd realization (NPS3), designed to facilitate approximate reasoning, are presented. NPS3 facilitates partial match between facts and rules, variable binding, different conflict resolution strategies, and chain inference. The partial match implemented in NPS3 is demonstrated on the same test production system as used by other authors (e.g., M. Lim and Y. Takefuji [1990]). The ability of NPS3 to approximate reasoning is illustrated by reasoning over a set of simple diagnostic productions and a set of decision support fuzzy rules. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
Introduces a novel neural network architecture, termed Katamic memory (KM), that is inspired by the neurocircuitry of the cerebellum. KM is capable of (1) rapid learning of multiple sequences of patterns, and (2) robust sequence association, recognition, and completion when given a short segment as a cue, even in the face of noise and faults. KM also is capable of being scaled up in a straightforward manner, and of exhibiting integrated processing (i.e., the memory can switch between learning and performance modes on a pattern-by-pattern basis). It is argued that such robust sequence learning and association is fundamental to perceptually grounded language learning because multiple streams of data must be associated and generalized in memory for later recall. The KM's neural plausibility and its relation to other connectionist models that perform sequence learning are considered. (PsycINFO Database Record (c) 2012 APA, all rights reserved)
A Hebbian-inspired, competitive network is presented which learns to predict the typical semantic features of denoting terms in simple and moderately complex sentences. In addition, the network learns to predict the appearance of syntactically key words, such as prepositions and relative pronouns. Importantly, as a by-product of the network's semantic training, a strong form of syntactic systematicity emerges. This systematicity is exhibited even at a novel, deeper level of clausal embedding. All network training is unsupervised with respect to error feedback. A novel variant of competitive learning and an unusual hierarchical architecture are presented. The relationship of this work to issues raised by Marcus (1998) and Phillips (2000) is explored. Keywords: Systematicity, Semantic Features, Language Acquisition, Competitive Learning, Connectionism. 1. Introduction Within the last decade, simple recurrent networks (SRNs) h...
This paper describes attempts to automate the process of image evolution through the use of artificial neural networks. The central objective of this study is to learn the user's preferences, and to apply this knowledge to evolve aesthetically pleasing images which are similar to those evolved through interactive sessions with the user. This paper presents a detailed performance analysis of both the successes and shortcomings encountered in the use of artificial neural network architectures. Further possibilities for improving the performance of a fully automated system are also discussed.
Hebb's introduction of the cell assembly concept marks the beginning of modern connectionism, yet its implications remain largely unexplored and its potential unexploited. Lately, however, promising efforts have been made to utilize recurrent connections, suggesting the timeliness of a re-examination of the cell assembly as a key element in a cognitive connectionism. Our approach emphasizes the psychological functions of activity in a cell assembly. This provides an opportunity to explore the dynamic behavior of the cell assembly considered as a continuous system, an important topic that we feel has not been given sufficient attention. A step-by-step analysis leads to an identification of characteristic temporal patterns and of necessary control systems. Each step of this analysis leads to a corresponding building block in a set of emerging equations. A series of experiments is then described that explore the implications of the theoretically derived equations in term of the time course of activity generated by a simulation under different conditions. Finally, the model is evaluated in terms of whether the various constraints deemed appropriate can be met, whether the resulting solution is robust, and whether the solution promises sufficient utility and generality.
This paper presents an Attractor Neural Network (ANN) model of Recall and Recognition. It is shown that an ANN Hopfield-based network can qualitatively account for a wide range of experimental psychological data pertaining to these two main aspects of memory retrieval. After providing simple, straightforward definitions of Recall and Recognition in the model, a wide variety of `high-level' psychological phenomena are shown to emerge from the `low-level' neural-like properties of the network. It is shown that modeling the effect of memory load on the network's retrieval properties requires the incorporation of noise into the network's dynamics. External projections may account for phenomena related to the stored items' associative links, but are not sufficient for representing context. With low memory load, the network generates retrieval response times which have the same distribution form as that observed experimentally. Finally, estimations of the probabilities of successful Recall and Recognition are obtained, possibly enabling further quantitative examination of the model.
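The Hopfield-style storage and cued retrieval that such a model builds on can be sketched as follows. This is a textbook Hebbian network with sign updates on random ±1 patterns, not the paper's full Recall/Recognition model; the pattern sizes are arbitrary.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian weight matrix for +/-1 patterns (zero self-connections)."""
    n = patterns.shape[1]
    W = (patterns.T @ patterns).astype(float) / n
    np.fill_diagonal(W, 0.0)
    return W

def recall(W, probe, steps=10):
    """Iterate the sign update from a (possibly corrupted) cue."""
    s = probe.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)   # synchronous update, for brevity
    return s

rng = np.random.default_rng(0)
mem = rng.choice([-1, 1], size=(3, 64))   # three stored items, 64 units
noisy = mem[0].copy()
noisy[:8] *= -1                            # corrupt 8 of 64 bits as a recall cue
out = recall(train_hopfield(mem), noisy)
print((out == mem[0]).mean())              # fraction of bits recovered
```

At low memory load the corrupted cue falls into the attractor of the stored item, which is the "low-level" retrieval property the abstract's "high-level" phenomena are built on.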
This paper describes Rapture --- a system for revising probabilistic knowledge bases that combines connectionist and symbolic learning methods. Rapture uses a modified version of backpropagation to refine the certainty factors of a probabilistic rule base and it uses ID3's information-gain heuristic to add new rules. Results on refining three actual expert knowledge bases demonstrate that this combined approach generally performs better than previous methods. 1 Introduction In complex domains, learning needs to be biased with prior knowledge in order to produce satisfactory results from limited training data. Recently, both connectionist and symbolic methods have been developed for biasing learning with prior knowledge (Shavlik and Towell, 1989; Fu, 1989; Ourston and Mooney, 1990; Pazzani and Kibler, 1992; Cohen, 1992). Most of these methods revise an imperfect knowledge base (usually obtained from a domain expert) to fit a set of empirical data. Some of these methods have been succ...
This paper describes a medical application of modular neural networks for temporal pattern recognition. In order to increase the reliability of prognostic indices for patients living with the Acquired Immunodeficiency Syndrome (AIDS), survival prediction was performed in a system composed of modular neural networks that classified cases according to death in a certain year of follow-up. The output of each neural network module corresponded to the probability of survival in a given year. Inputs were the values of demographic, clinical, and laboratory variables. The results of the modules were combined to produce survival curves for individuals. The neural networks were trained by backpropagation and the results were evaluated in test sets of previously unseen cases. We showed that, for certain combinations of neural network modules, the performance of the prognostic index, measured by the area under the receiver operating characteristic (ROC) curve, was significantly improved (p<0.05). We...
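The area under the ROC curve used as the performance measure above can be computed directly from raw scores with the rank (Mann-Whitney) statistic; the score values below are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via the rank (Mann-Whitney) statistic:
    the probability that a positive case scores above a negative one,
    counting ties as one half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model outputs for positive and negative follow-up cases.
print(roc_auc([0.9, 0.8, 0.7], [0.4, 0.6, 0.8]))  # 7.5 / 9 ≈ 0.833
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect ranking, which is why the paper reports improvements in this quantity.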
This paper deals with the problem of variable binding in connectionist networks. Specifically, a more thorough solution to the variable binding problem based on the Discrete Neuron formalism is proposed and a number of issues arising in the solution are examined in relation to logic: consistency checking, binding generation, unification, and functions. We analyze what is needed in order to resolve these issues, and based on this analysis, a procedure is developed for systematically setting up connectionist networks for variable binding based on logic rules. This solution compares favorably to similar solutions in simplicity and completeness. ACKNOWLEDGEMENTS. I wish to thank Dave Waltz, James Pustejovsky, and Tim Hickey for many discussions that helped me to elucidate ideas contained in this paper. I am also grateful to the three anonymous reviewers for their insightful criticisms and useful suggestions. 2 1 Introduction When discussing connectionist models in relation to reasoning...
Bootstrap samples with noise are shown to be an effective smoothness and capacity control technique for training feed-forward networks and for other statistical methods such as generalized additive models. It is shown that the noisy bootstrap performs best in conjunction with weight-decay regularization and ensemble averaging. The two-spiral problem, a highly non-linear, noise-free dataset, is used to demonstrate these findings. The combination of noisy bootstrap and ensemble averaging is also shown to be useful for generalized additive modeling, and is demonstrated on the well-known Cleveland Heart Data [7].
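The noisy-bootstrap-plus-ensemble recipe can be sketched on a toy regression problem. Ridge regression stands in here for the networks and generalized additive models of the paper, with its penalty playing the weight-decay role; the data and noise level are illustrative.

```python
import numpy as np

def noisy_bootstrap_ensemble(X, y, n_models=10, noise=0.1, seed=0):
    """Fit linear ridge models on noise-perturbed bootstrap samples and
    average their predictions (a sketch of bootstrap-with-noise smoothing)."""
    rng = np.random.default_rng(seed)
    models = []
    n, d = X.shape
    for _ in range(n_models):
        idx = rng.integers(0, n, n)                       # bootstrap resample
        Xb = X[idx] + rng.normal(0, noise, (n, d))        # jitter the inputs
        yb = y[idx]
        lam = 1e-2                                        # weight-decay strength
        # Ridge closed form: (Xb'Xb + lam I)^-1 Xb'yb
        w = np.linalg.solve(Xb.T @ Xb + lam * np.eye(d), Xb.T @ yb)
        models.append(w)
    return lambda Xq: np.mean([Xq @ w for w in models], axis=0)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.05 * rng.normal(size=40)
predict = noisy_bootstrap_ensemble(X, y)
print(np.abs(predict(X) - y).mean())   # small residual error
```

The input jitter acts as a smoother and the averaging reduces variance, which is the combination the abstract reports working best together with weight decay.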
Sample images used in the simulation of learning (Pattern Set).
This paper specifies the main features of Brain-like, Neuronal, and Connectionist models; argues for the need for, and usefulness of, appropriate successively larger brain-like structures; and examines parallel-hierarchical Recognition Cone models of perception from this perspective, as examples of such structures. The anatomy, physiology, behavior, and development of the visual system are briefly summarized to motivate the architecture of brain-structured networks for perceptual recognition. Results are presented from simulations of carefully pre-designed Recognition Cone structures that perceive objects (e.g., houses) in digitized photographs. A framework for perceptual learning is introduced, including mechanisms for generation-discovery (feedback-guided growth of new links and nodes, subject to brain-like constraints, e.g., local receptive fields and global convergence-divergence). The information processing transforms discovered through generation are fine-tuned by feedback-guided ...
Stimulus set and localist coding of naming units 
Average within-category distances 
Between-category distances for the pairs Circles-Squares and Ellipses-Rectangles 
Neural network models of categorical perception (compression of within-category similarity and dilation of between-category differences) are applied to the symbol-grounding problem (of how to connect symbols with meanings) by connecting analog sensorimotor projections to arbitrary symbolic representations via learned category-invariance detectors in a hybrid symbolic/nonsymbolic system. Our nets are trained to categorize and name 50x50 pixel images (e.g., circles, ellipses, squares and rectangles) projected onto the receptive field of a 7x7 retina. They first learn to do prototype matching and then entry-level naming for the four kinds of stimuli, grounding their names directly in the input patterns via hidden-unit representations ("sensorimotor toil"). We show that a higher-level categorization (e.g., "symmetric" vs. "asymmetric") can be learned in two very different ways: either (1) directly from the input, just as with the entry-level categories (i.e., by toil), or (2) indirectly, from...
Decision boundaries and error regions associated with approximating the a posteriori probabilities (Tumer and Ghosh, 1996). 
Error reduction (E_add^ave / E_add) for different classifier error correlations.
Using an ensemble of classifiers, instead of a single classifier, can lead to improved generalization. The gains obtained by combining, however, are often affected more by the selection of what is presented to the combiner than by the actual combining method that is chosen. In this paper we focus on data selection and classifier training methods, in order to "prepare" classifiers for combining. We review a combining framework for classification problems that quantifies the need for reducing the correlation among individual classifiers. Then, we discuss several methods that make the classifiers in an ensemble more complementary. Experimental results are provided to illustrate the benefits and pitfalls of reducing the correlation among classifiers, especially when the training data is in limited supply. 1 Introduction A classifier's ability to meaningfully respond to novel patterns, or generalize, is perhaps its most important property (Levin et al., 1990; Wolpert, 1990). In...
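The correlation-reduction framework of Tumer and Ghosh (1996) predicts that averaging N classifiers shrinks the model-induced ("added") error by a factor of (1 + δ(N-1))/N, where δ is the average error correlation. A small simulation, with Gaussian errors as a stand-in assumption for classifier errors, reproduces this factor:

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 7, 200000
ratios = {}
for delta in (0.0, 0.5, 1.0):        # average pairwise error correlation
    # Draw N unit-variance "classifier errors" with the requested correlation.
    cov = np.full((N, N), delta)
    np.fill_diagonal(cov, 1.0)
    e = rng.multivariate_normal(np.zeros(N), cov, size=trials)
    # Variance of the averaged error relative to a single classifier's error.
    ratios[delta] = e.mean(axis=1).var() / e[:, 0].var()
    print(delta, ratios[delta], (1 + delta * (N - 1)) / N)
```

At δ = 0 the averaged error drops to 1/N of a single classifier's; at δ = 1 averaging buys nothing, which is why the paper concentrates on making the classifiers less correlated before combining.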
The concepts of knowledge-based systems and machine learning are combined by integrating an expert system and a constructive neural-network learning algorithm. Two approaches are explored: embedding the expert system directly, and converting the expert system rule base into a neural network. This initial system is then extended by constructively learning additional hidden units in a problem-specific manner. Experiments performed indicate that the generalization of a combined system surpasses that of each system individually. Combining Prior Symbolic Knowledge and Constructive Neural Network Learning, by Justin Fletcher and Zoran Obradović. Contact: Dr. Zoran Obradović, School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164-2752; (509) 335-6601; FAX: (509) 335-3818.
Human language is learned, symbolic and exhibits syntactic structure, a set of properties which make it unique among naturally-occurring communication systems. How did human language come to be as it is? Language is culturally transmitted and cultural processes may have played a role in shaping language. However, it has been suggested that the cultural transmission of language is constrained by some language-specific innate endowment. The primary objective of the research outlined in this paper is to investigate how such an endowment would influence the acquisition of language and the dynamics of the repeated cultural transmission of language.
The main aim of this paper is to explore three connectionist stances on innateness. The argument is that connectionists can be nativists without necessarily being, for want of a better term, `symbolic nativists'. Also, connectionist versions of the innateness hypothesis can be just as strong in their commitment to nativism as symbolic versions. In addition, it will be claimed that these three connectionist stances on nativism concern synthetic a priori knowledge. Since the synthetic a priori is essentially a category of knowledge based on rationalist philosophy, the conclusion is that connectionism is compatible with rationalism. (This paper was published in Connection Science, 4 (3-4), 1992, pp. 271--292.) 1 Introduction Ramsey and Stich (1991) assess connectionist language models with respect to nativist claims (Chomsky, 1965; 1966) that human beings have a rich store of innate knowledge. These claims are supported by the `fact' that it would be impossible for children to learn lang...
A number of connectionist models capable of representing data with compositional structure have recently appeared. These new models suggest the intriguing possibility of performing holistic structure-sensitive computations with distributed representations. Two possible forms of holistic inference, transformational inference and confluent inference, are identified and compared. Transformational inference was successfully demonstrated in [Chalmers, 1990]; however, since the pure transformational approach does not consider the eventual inference tasks during the process of learning its representations, there is a drawback that the holistic transformation corresponding to a given inference task could become arbitrarily complex, and thus very difficult to learn. Confluent inference addresses this drawback by achieving a tight coupling between the distributed representations of a problem and the solution for the given inference task while the net is still learning its representations. A dual...
It has been claimed that connectionist methods of encoding compositional structures, such as Pollack's RAAM, support a non-classical form of structure-sensitive operation known as holistic computation, where symbol structures can be acted upon holistically without the need to decompose them, or to perform a search to locate or access their constituents. In this paper, it is argued that the concept as described in the literature is vague and confused, and a revised definition of holistic computation is proposed which aims to clarify the issues involved. It is also argued that holistic computation neither requires a highly distributed or holistic representation, nor is it unique to connectionist methods of representing compositional structure. 1 Introduction Developing the ability to represent and process compositional structures, such as lists and trees, within neural networks has been the focus of a considerable amount of recent connectionist research. The lack of t...
We describe an alternate approach to visual recognition of handwritten words, wherein an image is converted into a spatio-temporal signal by scanning it in one or more directions, and processed by a suitable connectionist network. The scheme offers several attractive features including shift-invariance, explication of local spatial geometry along the scan direction, a significant reduction in the number of free parameters, the ability to process arbitrarily long images along the scan direction, and a natural framework for dealing with the segmentation/recognition dilemma. Other salient features of the work include the use of a modular and structured approach for network construction and the integration of connectionist components with a procedural component to exploit the complementary strengths of both techniques. The system consists of two connectionist components and a procedural controller. One network concurrently makes recognition and segmentation hypotheses, and another...
Essentially all work in connectionist learning up to now has been induction from examples (e.g. Hinton 1987), but instruction is as important in symbolic artificial intelligence (e.g. Mostow 1986, Rychener 1986) as it is in nature. This paper describes an implemented connectionist learning system that transforms an instruction expressed in a description language into an input for a connectionist knowledge representation system, which in turn changes the network in order to integrate new knowledge. Integration is always important when a single new fact causes changes in several parts of the knowledge-base; it is an adjustment which cannot easily be done with learning-by-example techniques only. The new, integrated knowledge can be used in conjunction with prior knowledge. The learning method used is recruitment learning, a technique which converts network units from a pool of free units into units which carry meaningful information, i.e. represent generic concepts. 1. Introduction. Th...
Current work on connectionist models has focused largely on artificial neural networks that are inspired by the networks of biological neurons in the human brain. However, there are also other connectionist architectures that differ significantly from this biological exemplar. We proposed a novel connectionist learning architecture inspired by the physics associated with optical coatings of multiple layers of thin films in a previous paper [1]. The proposed model differs significantly from the widely used neuron-inspired models. With thin-film layer thicknesses serving as adjustable parameters (as compared with connection weights in a neural network) for the learning system, the optical thin-film multilayer model (OTFM) is capable of approximating virtually any kind of highly nonlinear mapping. In this paper we focus on a detailed comparison of a typical neural network model and the OTFM. We describe the architecture of the OTFM and show how it can be viewed as a connectionist learning model. We then present the experimental results of using the OTFM to solve a classification problem typical of conventional connectionist architectures.
Ratio of the number of epochs required for relearning the original 20 patterns to the number of epochs initially required to learn these patterns (following learning of the 20 disruptive patterns), as a function of the number of pseudopatterns.  
The increase in the percentage of items of the originally learned set exactly recognized by the network (after learning the set of disruptive items), as a function of the number of pseudopatterns.  
The improvement in the recognition of the previously learned cats after interference by either 2 or 4 dogs. The amount of forgetting is measured by the average error of the autoassociation.
Decreased forgetting of the original data with respect to standard backpropagation (measured in number of epochs required for complete relearning).  
In order to solve the "sensitivity-stability" problem --- and its immediate correlate, the problem of sequential learning --- it is crucial to develop connectionist architectures that are simultaneously sensitive to, but not excessively disrupted by, new input. French (1992) suggested that to alleviate a particularly severe form of this disruption, catastrophic forgetting, it was necessary for networks to dynamically separate their internal representations during learning. McClelland, McNaughton, & O'Reilly (1995) went even further. They suggested that nature's way of implementing this obligatory separation was the evolution of two separate areas of the brain, the hippocampus and the neocortex. In keeping with this idea of radical separation, a "pseudo-recurrent" memory model is presented here that partitions a connectionist network into two functionally distinct, but continually interacting areas. One area serves as a final-storage area for representations; the other is an e...
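The pseudopattern mechanism at the heart of such pseudo-recurrent models, probing a network with random inputs and treating its own responses as rehearsal targets, can be sketched as follows. The "network" below is a hypothetical frozen stand-in for a trained model; the sizes are arbitrary.

```python
import numpy as np

def make_pseudopatterns(forward, n_items, in_dim, rng):
    """Generate pseudopatterns: random inputs paired with the network's own
    current outputs, later interleaved with new items to protect old knowledge."""
    X = rng.choice([0.0, 1.0], size=(n_items, in_dim))  # random binary probes
    return X, forward(X)                                # network's answers become targets

# Hypothetical frozen "network" standing in for the final-storage area.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
forward = lambda X: 1.0 / (1.0 + np.exp(-X @ W))

Xp, Yp = make_pseudopatterns(forward, n_items=32, in_dim=8, rng=rng)
print(Xp.shape, Yp.shape)   # (32, 8) (32, 4)
```

Training the other (early-processing) area on new items mixed with these pseudopatterns approximates rehearsal of the old function without needing to store the original training data.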
This paper presents a new connectionist approach to grammatical inference. Using only positive examples, the algorithm learns regular graph grammars, representing two-dimensional iterative structures drawn on a discrete Cartesian grid. This work is intended as a case study in connectionist symbol processing and geometric concept formation. A grammar is represented by a self-configuring connectionist network that is analogous to a transition diagram except that it can deal with graph grammars as easily as string grammars. Learning starts with a trivial grammar, expressing no grammatical knowledge, which is then refined, by a process of successive node splitting and merging, into a grammar adequate to describe the population of input patterns. In conclusion, I argue that the connectionist style of computation is, in some ways, better suited than sequential computation to the task of representing and manipulating recursive structures. 1. Introduction Connectionism is conventionally seen ...
Numeracy is regarded as an emergent property of the human brain, suggesting that neural-network-based simulations may provide some insight into the cerebral substrate used in operations related to numeracy. Two operations, subitization (the so-called phenomenon of the discrimination of visual number) and counting (a recurrent operation), have been studied within a multi-net framework. A multi-net architecture comprising unsupervised networks has been developed which successfully simulates aspects of subitization, especially when compared to similar work using supervised learning algorithms. Another multi-net architecture, comprising unsupervised networks and a recurrent backpropagation network, appears to learn numerosity and successfully simulates the errors children make when they are learning to count. The systems for subitizing and counting were incorporated into a gated multi-net system simulating the dual existence of both subitization and counting. Multi-net architectures provide a good basis for studying the emergent properties of an intelligent system, whereas a single monolithic network may be used to fit almost any available data.
We describe a decorrelation network training method for improving the quality of regression learning in "ensemble" neural networks that are composed of linear combinations of individual neural networks. In this method, individual networks are trained by backpropagation not only to reproduce a desired output, but also to have their errors linearly decorrelated with the other networks. Outputs from the individual networks are then linearly combined to produce the output of the ensemble network. We demonstrate the performance of decorrelated network training on learning the "3 Parity" logic function, a noisy sine function, and a one-dimensional nonlinear function, and compare the results with ensemble networks composed of independently trained individual networks (without decorrelation training). Empirical results show that when individual networks are forced to be decorrelated with one another the resulting ensemble neural networks have lower mean squared errors than the ensembl...
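The decorrelation idea can be sketched compactly: alongside its own squared error, each network after the first receives an extra gradient term proportional to the other network's error, penalising the product of the two errors and so pushing them toward decorrelation. A rough numpy illustration with two small regressors on a noisy sine (network sizes, learning rate, and the penalty weight are assumptions, not the paper's values):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-np.pi, np.pi, (200, 1))
y = np.sin(X) + rng.normal(0, 0.1, X.shape)

def init_net():
    return [rng.normal(0, 0.5, (1, 10)), np.zeros(10),
            rng.normal(0, 0.5, (10, 1)), np.zeros(1)]

def fwd(p, X):
    H = np.tanh(X @ p[0] + p[1])
    return H, H @ p[2] + p[3]

def step(p, X, grad_out, lr=0.05):
    # One gradient step given d(loss)/d(output) = grad_out.
    H, _ = fwd(p, X)
    d_hid = (grad_out @ p[2].T) * (1 - H**2)
    p[2] -= lr * H.T @ grad_out / len(X)
    p[3] -= lr * grad_out.mean(0)
    p[0] -= lr * X.T @ d_hid / len(X)
    p[1] -= lr * d_hid.mean(0)

p1, p2 = init_net(), init_net()
lam = 0.5                                  # decorrelation strength (assumed)
for _ in range(3000):
    e1 = fwd(p1, X)[1] - y
    e2 = fwd(p2, X)[1] - y
    step(p1, X, e1)                        # plain squared-error gradient
    step(p2, X, e2 + lam * e1)             # + gradient of lam * e1 * e2
y_ens = 0.5 * (fwd(p1, X)[1] + fwd(p2, X)[1])
mse_ens = np.mean((y_ens - y) ** 2)
```

Averaging helps most when the members' errors do not coincide, which is exactly what the extra term encourages.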
We introduce a new approach to the training of classifiers for performance on multiple tasks.
An approach to develop new game playing strategies based on artificial evolution of neural networks is presented. Evolution was directed to discover strategies in Othello against a random-moving opponent and later against an α-β search program. The networks discovered first a standard positional strategy, and subsequently a mobility strategy, an advanced strategy rarely seen outside of tournaments. The latter discovery demonstrates how evolutionary neural networks can develop novel solutions by turning an initial disadvantage into an advantage in a changed environment. 1 Introduction Game playing is one of the oldest and most extensively studied areas of artificial intelligence. Games require sophisticated intelligence in a well-defined problem where success is easily measured. Games have therefore proven to be important domains for studying problem solving techniques. Most research in game playing has centered on creating deeper searches through the possible game scenarios. Deepe...
There has been much interest in the possibility of connectionist models whose representations can be endowed with compositional structure, and a variety of such models have been proposed. These models typically use distributed representations that arise from the functional composition of constituent parts. Functional composition and decomposition alone, however, yield only an implementation of classical symbolic theories. This paper explores the possibility of moving beyond implementation by exploiting holistic structure-sensitive operations on distributed representations. An experiment is performed using Pollack’s Recursive Auto-Associative Memory (RAAM). RAAM is used to construct distributed representations of syntactically structured sentences. A feed-forward network is then trained to operate directly on these representations, modeling syntactic transformations of the represented sentences. Successful training and generalization is obtained, demonstrating that the implicit structure present in these representations can be used for a kind of structure-sensitive processing unique to the connectionist domain.
Representation poses important challenges to connectionism. The ability to structurally compose representations is critical in achieving the capability considered necessary for cognition. We are investigating distributed patterns that represent structure as part of a larger effort to develop a natural language processor. Recursive Auto-Associative Memory (RAAM) representations show unusual promise as a general vehicle for representing classical symbolic structures in a way that supports compositionality. However, RAAMs are limited to representations for fixed-valence structures and can often be difficult to train. We provide a technique for mapping any ordered collection (forest) of hierarchical structures (trees) into a set of training patterns which can be used effectively in training a simple recurrent network (SRN) to develop RAAM-style distributed representations. The advantages in our technique are three-fold: first, the fixed-valence restriction on structures represented by patterns trained with RAAMs is removed; second, the representations resulting from training correspond to ordered forests of labeled trees, thereby extending what can be represented in this fashion; and third, training can be accomplished with an auto-associative SRN, making training a much more straightforward process and one which optimally utilizes the n-dimensional space of patterns. This material is based upon work supported by the National Science Foundation under Grant No. IRI-9201987.
Most known learning algorithms for dynamic neural networks in non-stationary environments need global computations to perform credit assignment. These algorithms either are not local in time or not local in space. Those algorithms which are local in both time and space usually cannot deal sensibly with `hidden units'. In contrast, as far as we can judge by now, learning rules in biological systems with many `hidden units' are local in both space and time. In this paper we propose a parallel on-line learning algorithm which performs local computations only, yet still is designed to deal with hidden units and with units whose past activations are `hidden in time'. The approach is inspired by Holland's idea of the bucket brigade for classifier systems, which is transformed to run on a neural network with fixed topology. The result is a feedforward or recurrent `neural' dissipative system which is consuming `weight-substance' and permanently trying to distribute this substance onto its co...
A Multitask Learning (MTL) network. There is an output node for each task being learned in parallel. The representation formed in the lower portion of the network is common to all tasks.
With a distinction made between two forms of task knowledge transfer, representational and functional, ηMTL, a modified version of the MTL method of functional (parallel) transfer, is introduced. The ηMTL method employs a separate learning rate, η_k, for each task output node k; η_k varies as a function of a measure of relatedness, R_k, between the kth task and the primary task of interest. Results of experiments demonstrate the ability of ηMTL to dynamically select the most related source task(s) for the functional transfer of prior domain knowledge. The ηMTL method of learning is nearly equivalent to standard MTL when all parallel tasks are sufficiently related to the primary task, and is similar to single-task learning when none of the parallel tasks are related to the primary task.
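The core of the scheme is a per-output-node learning rate scaled by task relatedness. A minimal sketch of that scaling step (how R_k is actually measured is the method's key design choice and is not reproduced here):

```python
import numpy as np

def eta_mtl_rates(base_eta, relatedness):
    """Per-task learning rates for the output nodes of an MTL network:
    task k trains with eta_k = base_eta * R_k, where R_k in [0, 1] is a
    measure of its relatedness to the primary task. This helper only
    applies the scaling; estimating R_k is done elsewhere."""
    R = np.clip(np.asarray(relatedness, dtype=float), 0.0, 1.0)
    return base_eta * R

# R_k = 1 for every task recovers standard MTL; R_k = 0 for all secondary
# tasks leaves only the primary task learning, i.e. single-task learning.
rates = eta_mtl_rates(0.1, [1.0, 0.8, 0.0])
```

During backpropagation each output node k would then use rates[k] in place of the single global learning rate when updating its task-specific weights.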
In this paper I claim that one of the main characteristics that makes the Evolutionary Robotics approach suitable for the study of adaptive behavior in natural and artificial agents is the possibility to rely largely on a self-organization process. Indeed by using Artificial Evolution the role of the designer may be limited to the specification of a fitness function which measures the ability of a given robot to perform a desired task. From an engineering point of view the main advantage of relying on self-organization is the fact that the designer does not need to divide the desired behavior into simple basic behaviors to be implemented into separate layers (or modules) of the robot control system. By selecting individuals for their ability to perform the desired behavior as a whole, simple basic behaviors can emerge from the interaction between several processes in the control system and from the interaction between the robot and the environment. From the point of view of the study o...
The hierarchical feature map system recognizes an input story as an instance of a particular script by classifying it at three levels: scripts, tracks and role bindings. The recognition taxonomy, i.e. the breakdown of each script into the tracks and roles, is extracted automatically and independently for each script from examples of script instantiations in an unsupervised self-organizing process. The process resembles human learning in that the differentiation of the most frequently encountered scripts gradually becomes the most detailed. The resulting structure is a hierarchical pyramid of feature maps. The hierarchy visualizes the taxonomy and the maps lay out the topology of each level. The number of input lines and the self-organization time are considerably reduced compared to the ordinary single-level feature mapping. The system can recognize incomplete stories and recover the missing events. The taxonomy also serves as memory organization for script-based episodic memory. The map...
A Metrical Structure Hierarchy (Lerdahl & Jackendoff, 1983). Each horizontal row of dots represents a level of beats, and the relative spacing between dots describes the relationship between the beat periods of adjacent levels. Points where beats of many levels align describe points of metrical accent.  
A Periodic Signal and the Response of an Integrate-and-Fire Oscillator. A) The oscillator in the absence of stimulation. When activation reaches the threshold, the oscillator "fires". The period of the resulting oscillation is determined by the slope of the activation function and the height of the firing threshold. B) Phase-tracking. A discrete periodic stimulus affects the oscillator by lowering its firing threshold. The oscillator comes into phase and frequency lock with the periodic stimulus. The effect is temporary, however. When the stimulus is removed, the oscillator reverts to its intrinsic period. C) Frequency-tracking. By adjusting its firing threshold in response to stimulus, the unit may achieve permanent or semi-permanent frequency lock. When the stimulus is removed, the oscillator continues to fire at the stimulus period.
An oscillator responding to periodic stimulation at 660ms. Initially, the oscillator's period is 700ms. After a few stimulus cycles, the oscillator adjusts its period to 660ms. A) Periodic stimulus. B) Oscillator response. C) Oscillator period.  
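The frequency-tracking behaviour in this figure can be reproduced with a very small simulation. In this sketch (the threshold rule and all parameter values are our assumptions, not the article's equations), activation ramps at 1/period per millisecond, a stimulus transiently lowers the firing threshold, and each stimulus-coincident firing nudges the period toward the observed inter-fire interval:

```python
def entrain(stim_period=660, init_period=700, n_ms=20000, alpha=0.3):
    """Integrate-and-fire unit entraining to a periodic stimulus (ms)."""
    period, act, last_fire = float(init_period), 0.0, 0
    for t in range(1, n_ms):
        stim = (t % stim_period == 0)      # discrete periodic stimulus
        threshold = 0.8 if stim else 1.0   # stimulus lowers the threshold
        act += 1.0 / period                # linear activation ramp
        if act >= threshold:               # the unit "fires"
            if stim:                       # frequency-tracking adjustment
                period += alpha * ((t - last_fire) - period)
            act, last_fire = 0.0, t
    return period

final_period = entrain()  # starts at 700 ms, stimulus every 660 ms
```

As in the figure, the unit's period converges from 700 ms to the 660 ms stimulus period over a few cycles, and (being a period change rather than a threshold effect) it would persist after the stimulus is removed.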
Transcription of the improvised melody from Figure 8. Grace notes are not transcribed.
Many connectionist approaches to musical expectancy and music composition let the question of "What next?" overshadow the equally important question of "When next?". One cannot escape the latter question, one of temporal structure, when considering the perception of musical meter. We view the perception of metrical structure as a dynamic process where the temporal organization of external musical events synchronizes, or entrains, a listener's internal processing mechanisms. This article introduces a novel connectionist unit, based upon a mathematical model of entrainment, capable of phase- and frequency-locking to periodic components of incoming rhythmic patterns. Networks of these units can self-organize temporally structured responses to rhythmic patterns. The resulting network behavior embodies the perception of metrical structure. The article concludes with a discussion of the implications of our approach for theories of metrical structure and musical expectancy.
This paper presents a novel approach to the problem of action selection for an autonomous agent. An agent is viewed as a collection of competence modules. Action selection is modeled as an emergent property of an activation/inhibition dynamics among these modules. A concrete action selection algorithm is presented and a detailed account of the results is given. This algorithm combines characteristics of both traditional planners and reactive systems: it produces fast and robust activity in a tight interaction loop with the environment, while at the same time allowing for some prediction and planning to take place. It provides global parameters, which one can use to tune the action selection behavior to the characteristics of the task environment. As such one can smoothly trade off goal-orientedness for situation-orientedness, bias towards ongoing plans (inertia) for adaptivity, thoughtfulness for speed, and adjust its sensitivity to goal conflicts.
In this paper we discuss the limitations of current evolutionary robotics models and we propose a new framework that might solve some of these problems and lead to an open-ended evolutionary process in hardware. More specifically, the paper describes a novel approach, where the usual concepts of population, generations and fitness are made implicit in the system. Individuals co-evolve embedded in their environment. Exploiting the self-assembling capabilities of the (simulated) robots, the genotype of a successful individual can spread in the population. In this way, interesting behaviours spontaneously emerge, resulting in chasing and evading other individuals, collective obstacle avoidance, coordinated motion of self-assembled structures.
We present a model of absolute autonomy and power in agent systems. This absolute sense of autonomy captures an agent's liberty over its own preferences. Our model characterizes an affinity between autonomy and power. We argue that agents with similar individual autonomy and power experience an adjusted level of autonomy and power due to being in a group of like agents. We then illustrate our model on the problem of task allocation.
Distribution of times elapsed between sending a random motor command to the robot's arm and the perception of the resultant motion after the visual image has passed through the whole processing pipeline. The robot iterated through ten randomly selected motions 22 times.
The distribution of detected social responses for the robot. It appears to follow a negative exponential function, corresponding to the idea that humans generally respond socially as quickly as they can. Background Poisson events follow a negative exponential distribution with a longer tail, making the upper bound of the time window important and the lower bound mostly irrelevant in this domain.
Delays measured during conversations between the experimenter and the two social interaction subjects. Originally taken as potential data for a Bayesian estimator of the probability of social interaction, the data suggests that real social interactions have negative binomial or Poisson-like delay distributions.
The robot identifies the self-generated motion of its reflection. The shot is from one of Nico's wide-angle cameras; Nico's reflection (center) is moving its hand as Nico moves a real arm in the foreground (bottom right). Because the motion occurs within the learned time interval [t1min, t1max], the motion is tagged as potentially self-generated (highlights).
By learning a range of possible times over which the effect of an action can take place, a robot can reason more effectively about causal and contingent relationships in the world. However, learning these time windows in a noisy environment where random events interfere can pose a challenge. We present an algorithm for learning the interval (t1min, t1max) of possible times during which a response to an action can take place, and implement the model on a physical robot for the domains of visual self-recognition and auditory social-partner recognition. The environment model that we use to justify our error bounds assumes that natural environments generate Poisson distributions of random events at all scales. From this assumption, we derive a linear-time algorithm, which we call Poisson threshold learning, for finding a threshold T that provides an arbitrarily small rate of background events, λ(T), if such a threshold exists for the specified error rate. We can then use this rate to calculate an expected number of false positives in our sample data and discard them. We implement the principles of our method using a motion detection module as our input stream in the visual domain, and sampled audio energy in the auditory domain. In this way, we find time windows for self-generated motion, self-generated audio, and verbal social responses. We also present data on the distributions of these events, showing that while our self-generated action had a normal distribution, the social events were better modeled by a Poisson process. Finally, we present several applications for which such simple classifiers could potentially prove useful, such as mirror self-recognition and learning the meanings of the words "I" and "you."
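The false-positive accounting can be sketched in a few lines. In this simplified, illustrative reading of the idea (not the paper's exact algorithm, which is linear-time; we simply sort for brevity), the expected number of Poisson background intrusions is computed from an externally estimated background rate, and that many extreme delays are discarded before taking the window endpoints:

```python
import math

def expected_false_positives(rate_hz, window_s, n_trials):
    # Poisson background at rate_hz events/s: expected number of stray
    # events landing inside a window_s-second observation window,
    # summed over n_trials trials.
    return rate_hz * window_s * n_trials

def learn_response_window(delays, rate_hz, window_s):
    """Estimate [t_min, t_max] for genuine responses from one observed
    action-to-event delay per trial, discarding as many extreme delays
    as the Poisson background is expected to have contributed."""
    k = math.floor(expected_false_positives(rate_hz, window_s, len(delays)))
    d = sorted(delays)
    lo, hi = (k + 1) // 2, k // 2          # trim ~half from each end
    trimmed = d[lo:len(d) - hi] if hi else d[lo:]
    return trimmed[0], trimmed[-1]

# Example: responses cluster around 0.5 s; the two stray delays (0.05 s
# and 0.95 s) are discarded because rate * window * trials ~ 2.
window = learn_response_window(
    [0.45, 0.5, 0.55, 0.48, 0.52, 0.05, 0.95], rate_hz=0.3, window_s=1.0)
```

Events falling outside the learned window can then be attributed to background noise rather than to the robot's own action.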
The capacity for infants to form mental representations of hidden or occluded objects can be decomposed into two tasks: one process that identifies salient objects and a second complementary process that identifies salient locations. This functional decomposition is supported by the distinction between dorsal and ventral extrastriate visual processing in the primate visual system. This approach is illustrated by presenting an eye-movement model that incorporates both dorsal and ventral processing streams and by using the model to simulate infants' reactions to possible and impossible events from an infant looking-time study (R. Baillargeon, "Representing the existence and the location of hidden objects: object permanence in 6- and 8-month-old infants", Cognition, 23, pp. 21-41, 1986.). As expected, the model highlights how the dorsal system is sensitive to the location of a key feature in these events (i.e. the location of an obstacle), whereas the ventral system responds equivalently to the possible and impossible events. These results are used to help explain infants' reactions in looking-time studies.
In handwriting, the drawing or copying of an individual letter involves a process of linearizing, whereby the form of the letter is broken down into a temporal sequence of strokes for production. In experienced writers, letters are produced consistently using the same production methods that are economic in terms of movement. This regularity permits a rule-based description of such production processes, which can be used in the teaching of handwriting skills. In this paper, the outstanding question from rule-based descriptions as to how consistent and stable letter production behaviour emerges as a product of practice and experience is addressed through the implementation of a connectionist model of sequential letter production. This model: (1) examines the emergence of letter production behaviour, namely the linearizing process; (2) explores how letters may be internally represented across both spatial and temporal dimensions; and (3) investigates the impact of learning certain letter production methods when generalizing to produce novel letterforms. In conclusion, the connectionist model offers an emergent account of letter production behaviour, which addresses the co-representation of spatial and temporal dimensions of letters, and the impact of learning experiences upon behaviour.
Correct responses on the ‘learned’ items (recognition) and on the ‘new’ items (detection). The subjects are able to distinguish the ‘learned’ items from the ‘new’ items. 
Correct responses on the ‘learned’ items (recognition) and on the ‘new’ items (generalization). The subjects are able to generalize to new items. 
While retroactive interference (RI) is a well-known phenomenon in humans, the differential effect of the structure of the learning material has only seldom been addressed. Mirman and Spivey (2001, Connection Science, 13: 257–275) reported behavioural results that show more RI for subjects exposed to ‘Structured’ items than for those exposed to ‘Unstructured’ items. These authors claimed that two complementary memory systems functioning on radically different neural mechanisms are required to account for the behavioural results they reported. Using the same paradigm but controlling for proactive interference, we found the opposite pattern of results, that is, more RI for subjects exposed to ‘Unstructured’ items than for those exposed to ‘Structured’ items (experiment 1). Two additional experiments showed that this structure effect on RI is a genuine one. Experiment 2 confirmed that the design of experiment 1 forced the subjects from the ‘Structured’ condition to learn the items at the exemplar level, thus allowing for a close match between the two to-be-compared conditions (as ‘Unstructured’ condition items can be learned only at the exemplar level). Experiment 3 verified that the subjects from the ‘Structured’ condition could generalize to novel items. Simulations conducted with a three-layer neural network, that is, a single-memory system, produced a pattern of results that mirrors the structure effect reported here. By construction, Mirman and Spivey's architecture cannot simulate this behavioural structure effect. The results are discussed within the framework of catastrophic interference in distributed neural networks, with an emphasis on the relevance of these networks to the modelling of human memory.
Top-cited authors
Michael C. Mozer
  • University of Colorado Boulder
Paul Smolensky
  • Johns Hopkins University
Kim Plunkett
  • University of Oxford
Chris Sinha
  • University of East Anglia
Martin Møller
  • Alexandra Instituttet