ABSTRACT: A developing cognitive system will ideally acquire knowledge of its interaction in the world, and will be able to use that knowledge to construct a scaffolding for progressively structured levels of behavior. The current research implements and tests an autobiographical memory system by which a humanoid robot, the iCub, can accumulate its experience in interacting with humans, and extract regularities that characterize this experience. This knowledge is then used in order to form composite representations of common experiences. We first apply this to the development of knowledge of spatial locations, and relations between objects in space. We then demonstrate how this can be extended to temporal relations between events, including “before” and “after,” which structure the occurrence of events in time. In the system, after extended sessions of interaction with a human, the resulting accumulated experience is processed in an offline manner, in a form of consolidation, during which common elements of different experiences are generalized in order to generate new meanings. These learned meanings then form the basis for simple behaviors that, when encoded in the autobiographical memory, can form the basis for memories of shared experiences with the human, and which can then be reused as a form of game playing or shared plan execution.
IEEE Transactions on Autonomous Mental Development 09/2014; 6(3):200-212. DOI:10.1109/TAMD.2014.2307342 · 1.48 Impact Factor
ABSTRACT: Arbib's How the brain got language is a major achievement in defining a trajectory for the evolution of complex imitation and the language-ready brain leading to human language. In addition to these capabilities, I will suggest that it is useful to consider two additional components of human brain function that are intricately related to the emergence of language. These are, first, the profound human motivation to represent and share the psychological states of others, and second, the related complex semantic system that represents the contents of what is communicated in language. In this sense, these two components represent part of what is under the iceberg, where language is the emerging tip.
Language and Cognition 09/2014; 5(2-3):177-188. DOI:10.1515/langcog-2013-0012
ABSTRACT: A method is proposed where static patterns or snapshots of cortical activity that could be stored as hyperassociative indices in hippocampus can subsequently be retrieved and reinjected into the neocortex in order to enable neocortex to then proceed to unfold the corresponding sequence, thus implementing an index-based sequence memory storage and retrieval capability.
ABSTRACT: One of the most paradoxical aspects of human language is that it is so unlike any other form of behavior in the animal world, yet at the same time, it has developed in a species that is not far removed from ancestral species that do not possess language. While aspects of non-human primate and avian interaction clearly constitute communication, this communication appears distinct from the rich, combinatorial and abstract quality of human language. So how does the human primate brain allow for language? In an effort to answer this question, a line of research has been developed that attempts to build a language processing capability based in part on the gross neuroanatomy of the corticostriatal system of the human brain. This paper situates this research program in its historical context, which begins with the primate oculomotor system and sensorimotor sequencing, and passes, via recent advances in reservoir computing, to provide insight into the open questions, and possible approaches, for future research that attempts to model language processing. One novel and useful idea from this research is that the overlap of cortical projections onto common regions in the striatum allows for adaptive binding of cortical signals from distinct circuits, under the control of dopamine, which has a strong adaptive advantage. A second idea is that recurrent cortical networks with fixed connections can represent arbitrary sequential and temporal structure, which is the basis of the reservoir computing framework. Finally, bringing these notions together, a relatively simple mechanism can be built for learning the grammatical constructions, as the mappings from surface structure of sentences to their meaning. This research suggests that the components of language that link conceptual structure to grammatical structure may be much simpler than has been proposed in other research programs. It also suggests that part of the residual complexity is in the conceptual system itself.
Frontiers in Psychology 08/2013; 4:500. DOI:10.3389/fpsyg.2013.00500 · 2.80 Impact Factor
ABSTRACT: One of the defining characteristics of human cognition is our outstanding capacity to cooperate. A central requirement for cooperation is the ability to establish a “shared plan” (which defines the interlaced actions of the two cooperating agents) in real time, and even to negotiate this shared plan during its execution. In the current research we identify the requirements for cooperation, extending our earlier work in this area. These requirements include the ability to negotiate a shared plan using spoken language, to learn new component actions within that plan, based on visual observation and kinesthetic demonstration, and finally to coordinate all of these functions in real time. We present a cognitive system that implements these requirements, and demonstrate the system's ability to allow a Nao humanoid robot to learn a nontrivial cooperative task in real time. We further provide a concrete demonstration of how the real-time learning capability can be easily deployed on a different platform, in this case the iCub humanoid. The results are considered in the context of how the development of language in the human infant provides a powerful lever in the development of cooperative plans from lower-level sensorimotor capabilities.
IEEE Transactions on Autonomous Mental Development 03/2013; 5(1):3-17. DOI:10.1109/TAMD.2012.2209880 · 1.48 Impact Factor
ABSTRACT: Robots should be capable of interacting in a cooperative and adaptive manner with their human counterparts in open-ended tasks that can change in real-time. An important aspect of the robot behavior will be the ability to acquire new knowledge of the cooperative tasks by observing and interacting with humans. The current research addresses this challenge. We present results from a cooperative human-robot interaction system that has been specifically developed for portability between different humanoid platforms, by abstraction layers at the perceptual and motor interfaces. In the perceptual domain, the resulting system is demonstrated to learn to recognize objects and to recognize actions as sequences of perceptual primitives, and to transfer this learning, and recognition, between different robotic platforms. For execution, composite actions and plans are shown to be learnt on one robot and executed successfully on a different one. Most importantly, the system provides the ability to link actions into shared plans, that form the basis of human-robot cooperation, applying principles from human cognitive development to the domain of robot cognitive systems.
IEEE Transactions on Autonomous Mental Development 09/2012; 4(3):239-253. DOI:10.1109/TAMD.2012.2199754 · 1.48 Impact Factor
ABSTRACT: Robots are now physically capable of locomotion, object manipulation, and an essentially unlimited set of sensory-motor behaviors. This sets the scene for the corresponding technical challenge: how can non-specialist human users interact with these robots for human-robot cooperation? Crangle and Suppes stated: “the user should not have to become a programmer, or rely on a programmer, to alter the robot’s behavior, and the user should not have to learn specialized technical vocabulary to request action from a robot.” To achieve this goal, one option is to consider the robot as a human apprentice and to have it learn through its interaction with a human. This chapter reviews our approach to this problem.
ABSTRACT: This paper reports current experiments conducted on HRP-2 based research on robot autonomy. The contribution of the paper is not focused on a specific area; rather, its objective is to highlight the critical issues that had to be solved to allow the humanoid robot HRP-2 to understand and execute the order “give me the purple ball” in an autonomous way. Such an experiment requires: simple object recognition and localization, motion planning and control, natural spoken language supervision, and a simple action supervisor and control architecture.
Humanoid Robots, 2007 7th IEEE-RAS International Conference on; 01/2008
ABSTRACT: An apprentice is an able-bodied individual that should interactively assist an expert, and through this interaction, they should acquire knowledge and skill in the given task domain. In this context the robot should have a useful repertoire of sensory-motor acts that the human can command with spoken language. In order to address the additional requirements for learning new behaviors, the robot should additionally have a real-time behavioral sequence acquisition capability. The learned sequences should function as executable procedures that can operate in a flexible manner and are not rigidly sensitive to initial conditions. The current research develops these capabilities in a real-time control system for the HRP-2 humanoid. The task domain involves a human and the HRP-2 working together to assemble a piece of furniture. We previously defined a system for Spoken Language Programming (SLP) that allowed the user to guide the robot through an arbitrary, task-relevant motor sequence via spoken commands, and to store this sequence as a re-usable macro. The current research significantly extends the SLP system: it integrates vision and motion planning into the SLP framework, providing a new level of flexibility in the behaviors that can be created. Most importantly, it allows the user to create “generic” functions with arguments (e.g. Give me X), and it allows multiple functions to be created. We thus demonstrate, for the first time, a humanoid robot equipped with vision-based grasping and the ability to acquire multiple sensory-motor behavioral procedures in real time through SLP in the context of a cooperative task. The humanoid robot thus acquires new sensory-motor skills that significantly facilitate the cooperative human-robot interaction.
Humanoid Robots, 2007 7th IEEE-RAS International Conference on; 01/2008
ABSTRACT: The current research presents an original model allowing a machine to acquire new behaviors via its cooperative interaction with a human user. One of the specificities of this system is that it places the interaction at the heart of the learning. Thus, as the exchanges proceed, the robot improves its behaviors, favoring a smoother and more natural interaction. Two experiments demonstrate the robustness of this approach in learning composite perceptual-motor behavioral sequences of varying complexity.
Humanoid Robots, 2006 6th IEEE-RAS International Conference on; 01/2007
ABSTRACT: This commentary analyzes the target article to determine whether shared-intention development could be implemented and tested in robotic systems. The analysis indicates that such an implementation should be feasible and will likely rely on a construction-based approach similar to that employed in the construction grammar framework.
ABSTRACT: The development of high performance humanoid robots provides complex systems with which humans must interact, and levies serious requirements on the quality and depth of these interactions. At the same time, developments in spoken language technology, and in theories of social cognition and intentional cooperative behavior, provide the technical basis and theoretical background, respectively, for the technical specification of how these systems can work. The objective of the current research is to develop a generalized approach for human-machine interaction via spoken language that exploits recent developments in cognitive science, particularly notions of grammatical constructions as form-meaning mappings in language, and notions of shared intentions as distributed plans for interaction and collaboration. We demonstrate this approach on two distinct robot platforms with human-robot interaction at three levels. The first level is that of commanding or directing the behavior of the system. The second level is that of interrogating or requesting an explanation from the system. The third and most advanced level is that of teaching the machine a new form of behavior. Within this context, we exploit social interaction in two manners. First, the robot identifies different human collaborators, and maintains a permanent record of their interactions in order to treat novices and experts in distinct manners. Second, the interactions are structured around shared intentions that guide the interactions in an ergonomic manner. We explore these aspects of communication on two distinct robotic platforms, the "Event Perceiver" and the Sony Aibo ERS7, and provide in the current paper the state of advancement of this work, and the initial lessons learned.
Humanoid Robots, 2005 5th IEEE-RAS International Conference on; 02/2005
ABSTRACT: Phrasal semantics is concerned with how the meaning of a sentence is composed both from the meaning of the constituent words, and from extra meaning contained within the structural organization of the sentence itself. In this context, grammatical constructions correspond to form-meaning mappings that essentially capture this "extra" meaning and allow its representation. The current research examines how a computational model of language processing based on a construction grammar approach can account for aspects of descriptive, referential and information content of phrasal semantics.
ABSTRACT: In previous research, we developed an integrated platform that combined visual scene interpretation with speech processing to provide input to a language-learning model. The system was demonstrated to learn a rich set of sentence-meaning mappings that could allow it to construct the appropriate meanings for new sentences in a generalization task. While this demonstrated potential promise, it fell short in several aspects of providing a useful human-robot interaction system. The current research addresses three of these shortcomings, demonstrating the natural extensibility of the platform architecture. First, the system must be able not only to understand what it hears, but also to describe what it sees and to interact with the human user. This is a natural extension of the knowledge of sentence-to-meaning mappings that is now applied in the inverse scene-to-sentence sense. Second, we extend the system's ontology from physical events to include spatial relations. We show that spatial relations are naturally accommodated in the predicate-argument representations for events. Finally, because the robot community is international, the robot should be able to speak multiple languages; we thus demonstrate that the language model extends naturally to include both English and Japanese. Concrete results from a working interactive system are presented and future directions for adaptive human-robot interaction systems are outlined.
Humanoid Robots, 2004 4th IEEE/RAS International Conference on; 12/2004
ABSTRACT: Pickering & Garrod (P&G) describe a mechanism by which the situation models of dialog participants become progressively aligned via priming at different levels. This commentary attempts to characterize how alignment and routinization can be extended into the language acquisition domain by establishing links between alignment and joint attention, and between routinization and grammatical construction learning. Pickering & Garrod (P&G) describe a mechanism by which the situation models of dialogue participants become progressively aligned via priming at different levels, including lexical, syntactic, semantic, and situational representations. An essential interest and novelty of this approach is that, instead of requiring a complex and effortful mechanism for explicitly constructing a common ground, it offers a rather straightforward mechanism that operates largely automatically via priming. It is of potential interest that this type of alignment can be seen to be useful in other communicative contexts besides dialogue. Two such contexts can be considered, both of which extend the situation alignment mechanism into the domain of language acquisition. The first concerns the alignment of situation models in which one of the interlocutors is in a prelingual, acquisition phase. This emphasizes the suggestion that alignment can take place via nonverbal influences. Second, in the current formulation, the process of alignment and the formation of routines takes place on the time scale of single dialogues; however, these mechanisms can also be considered to span time frames that greatly exceed a single dialogue, particularly in the case of familiar repeated situations (feeding, bathing, playing), yielding "virtual dialogues" that can span a time period of several months. 
In such a situation, we can consider the formation of routines in the context of language acquisition to be analogous to the development of grammatical constructions. Language acquisition can be functionally defined as the process of establishing the relation between sentences/discourses and their meanings. A significant part of this problem concerns the issue that before these relations can be established, the speaker and listener should be aligned with respect to the target meaning. If the meaning for the target utterance is not established both for the speaker and the listener, then construction of the mapping from utterance to meaning is indeterminate. This suggests the required existence of extra- or prelinguistic alignment mechanisms. Interestingly, there is indeed a significant body of research indicating that by six months of age, human infants achieve prelinguistic situation alignment by exploiting joint attention cues (e.g., gaze direction, postural orientation) in order to identify intended referents (e.g., Morales et al. 2000; Tomasello 2003). This indicates that P&G's Figure 2 could be modified to include nonlinguistic inputs at the semantic and situation model levels. Such a modification will allow both the "alignment bootstrapping" in which initial situation model alignment will play a crucial role in language acquisition as well as the influence of extralinguistic inputs in adult alignment contexts. In a related extension of the alignment model into the acquisition domain, we can consider the relation between the development of production and comprehension routines in the time frame of a single dialogue and the development of grammatical constructions in the time frame of the first years of language acquisition. As specified by P&G, the creation of routines requires a coherent context in which the routines are applicable, and so, stretching this time frame to the scale of months and years is a non-negligible issue. 
Interestingly, Tomasello (2003) notes that repetitive events such as feeding, bathing, playing, and so on are relatively similar from episode to episode, and thus provide appropriate contexts that coherently span significant time periods. Given a temporally extended "virtual dialogue" domain, we can consider the development of routines as facilitatory not only within the context of a single dialogue but also in the more fundamental role of the development of communicative conventions that span significant time periods, thus forming the basis for language acquisition. In this context, routines take on the alternative identity of grammatical constructions (see Goldberg 1995), with all of their processing advantages. In particular, as described by P&G, the use of routines significantly eliminates the need for syntactic derivation of the appropriate grammatical structural forms, both for production and comprehension. When this approach is applied at the acquisition time scale, it is remarkably similar to the usage-based developmental approach to language acquisition advocated by Tomasello (2003). In this framework, relatively fixed grammatical forms are linked to their corresponding meanings in the context of repetitive events (e.g., feeding, playing, etc.). These constructions/routines are then progressively opened to allow generalization within a given construction (e.g., variable replacement) to form new instances, and subsequent generalization to new constructions. Again, in both P&G's dialogue context and Tomasello's development context, highly functional communicative form-meaning constructions/routines are developed without reliance on a heavy initial investment in generative syntactic capabilities. I have recently performed a series of simulation (Dominey 2000) and robotic (Dominey 2003a; 2003b) experiments to determine the feasibility of this type of approach to language acquisition in a restricted context. 
The underlying assumptions in the model are (1) that grammatical constructions correspond to the learned mapping between a given sentence type and its corresponding meaning frame (see Goldberg 1995), and (2) that grammatical constructions are uniquely identified by a limited set of cues that include word order and grammatical morphology including free and bound morphemes (Bates & MacWhinney 1987). The model is provided with 〈sentence, meaning〉 pairs as input and should learn the Word-to-Referent and Sentence-to-Meaning mappings. For the current discussion, we assume that a limited set of concrete, open-class elements have been learned and will consider how this knowledge allows the learning of simple grammatical constructions. When a 〈sentence, meaning〉 pair is presented, the configuration of closed-class (function) elements is extracted and used as an index to "look up" the corresponding construction (routine) in the construction inventory. The construction corresponds to the learned mapping of open-class element positions in the sentence onto their thematic and event roles in the meaning representation. If there is no entry in the construction inventory (i.e., the current sentence type has never been previously encountered), then the construction is built on the fly by matching the referents for the open-class words with their respective roles in the meaning representation. The construction is then stored for future use. The developmental aspects of this learning are presented in more detail in Dominey (2000). Thus, similar to P&G's routines, constructions are built by pairing the grammatical form with the aligned meaning (situation) representation. 
The interesting suggestion is that, at least to a certain degree, P&G's proposed situation alignment and routine construction capabilities provide a mechanism for language acquisition (at least the learning of fixed grammatical constructions that can generalize to new instances of the same constructions) which avoids the enlistment of generative grammar mechanisms. If a situation alignment priming mechanism could be demonstrated to perform in both the dialogue and acquisition time scales, this would be evidence for an ingenious economy of functional mechanisms for language processing in the context of dialogue.
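The lookup-or-build procedure described in the commentary above can be sketched roughly as follows. This is a minimal illustration, not the model from Dominey (2000): the closed-class word list, the function names, the tuple format for meanings, and the use of "_" as an open-class slot marker are all assumptions introduced here, and the sketch assumes each referent appears only once in a meaning.

```python
# Hypothetical closed-class vocabulary; the source does not list one.
CLOSED_CLASS = {"the", "a", "to", "by", "was", "that"}


def construction_index(sentence):
    """Keep closed-class words in place and replace open-class words with '_'.

    The resulting tuple plays the role of the index into the construction inventory.
    """
    return tuple(w if w in CLOSED_CLASS else "_" for w in sentence.lower().split())


def open_class_words(sentence):
    """Return the open-class words of the sentence in order of appearance."""
    return [w for w in sentence.lower().split() if w not in CLOSED_CLASS]


def interpret(inventory, lexicon, sentence, meaning=None):
    """Look up the construction for a sentence, building it on the fly if absent.

    inventory: dict mapping a construction index to a list that gives, for each
               open-class slot, its position in the meaning tuple.
    lexicon:   dict mapping already-learned open-class words to referents.
    meaning:   tuple such as (event, agent, object), needed only for learning.
    """
    key = construction_index(sentence)
    referents = [lexicon[w] for w in open_class_words(sentence)]
    if key not in inventory:
        if meaning is None:
            raise KeyError("unknown sentence type and no meaning to learn from")
        # Build the construction by matching referents to their roles.
        inventory[key] = [meaning.index(r) for r in referents]
    mapping = inventory[key]
    result = [None] * len(mapping)
    for slot, role in enumerate(mapping):
        result[role] = referents[slot]
    return tuple(result)
```

In this sketch, presenting "the triangle was pushed by the block" together with the meaning `("push", "block", "triangle")` stores a passive construction; a later sentence with the same closed-class configuration but different open-class words is then interpreted with that stored mapping, without any syntactic derivation.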
ABSTRACT: Language acquisition represents one of the great learning achievements in human cognitive development. Perhaps this process takes place in a relatively automatic manner in which, simply through exposure to language input, the child configures her language organ to coincide with the structure of the maternal language. In this context, the problem of the vast uncertainty between speech input and its external referent, related to the more general notion of the ‘poverty of the stimulus’ problem, takes on a significant importance, and motivates the nativist suggestion that language is already essentially preprogrammed, and acquisition consists of setting the parameters for the target language based on limited exposure. What if, however, the acquisition process was not so automatic, but rather was controlled by the operation of mechanisms that could direct the attention of the child to specific aspects of the sentence and its external referent? In this case, external and internal control of attention could significantly reduce the referential uncertainty, thus reducing the requirement for preprogrammed language. The current paper outlines evidence for this second scenario, in which child-directed speech guides the child's attention to important aspects of the speech signal, and Joint Attention focuses the child's attention on the relevant aspects of the referential world, significantly reducing the poverty of the stimulus problem. Results from recent simulation studies are briefly reviewed that indicate how these mechanisms could then allow a relatively non-specific learning mechanism to acquire initial knowledge of grammatical constructions in the first steps of language acquisition.
Journal of Neurolinguistics 03/2004; 17(2-3):121-145. DOI:10.1016/S0911-6044(03)00056-3 · 1.49 Impact Factor
ABSTRACT: is that the former can systematically lead to explicit negation. When performing explicit negation, one has to contrast the perceived object with some prototype and to exclude the former from the latter. You may say this is not a flea, because it is too big, or this is not a star, because it is too big. R-predication alone, because it is holistic, cannot offer such explicit negation. It can only refuse class membership by measuring a holistic distance and comparing it to some threshold. Yet, every human being has the ability to perform explicit negation on any domain without any specific training. This endowment, which underlies the argumentative use of language, is a consequence of our ability to contrast perceptions and form C-predicates. We suggested elsewhere why this ability can be considered as one of the main cognitive differences that distinguishes Homo sapiens (Dessalles 2000). It offered a new adaptation, namely, the possibility to detect lies. By contrasting one's own perception with the liar's report, one may not only disbelieve the report, but also offer an explicit reason why the report should be rejected. Similarly, because they can be systematically negated, the predicates used in logical accounts of human thinking are C-predicates, not R-predicates. The fact that C-predicates can be used to make membership explicit (this is a galaxy, because it is big) may explain why they are mistakenly supposed to be necessary for any categorisation, and hence imprudently granted to animals. The scope of Hurford's argument is thus more limited than anticipated, because it cannot be extended to "genuine" predication, what we called C-predication. The author's insight and the comprehensive line of argument that he draws from it could yet be extended in some way to C-predication. 
Our ability to "locate" an object on the axis provided by the contrast operator may be an evolutionary derivative of the fundamental ability to handle location and property separately. To say that this apple is big or that it is bigger than that apple, we must assign positions to objects, not in physical space but on the contrast axis, which may be, in some cases, quite abstract. Abstract: In the context of Hurford's claim that "some feature of language structure maps onto a feature of primitive mental representations," I will argue that Hurford's focus on 1-place predicates as the basis of the "mental representations of situations in the world" is problematic, particularly with respect to spatiotemporal events. A solution is proposed. Hurford's claim that "some feature of language structure maps onto a feature of primitive mental representations" (target article, sect. 1) is clearly on the right track. However, I will argue that Hurford's focus on 1-place predicates as the basis of the "mental representations of situations in the world" is problematic. Specifically, I will propose that a more appropriate representation is based on the structure of perceptual events that are functionally and behaviorally relevant to the nonlinguistic individual. Such events would include physical contact, transfer of possession of objects, and the like, that inherently consist of multiple-argument predicates. In developing this argument I will exploit Hurford's requirement that the characterization of an appropriate representation should include: (1) "a plausible bridge between such representation and the structure of language" (sect. 1), (2) a characterization of "primitive mental representation" independent of language itself, and finally (3) a plausible story for the neural basis of the representation. Mapping to language. 
With respect to the bridge between the representation and the structure of language, Hurford argues that "very little of the rich structure of modern languages directly mirrors any mental structure in pre-existing language" (sect. 1.1). He further states that in contrast to the morphosyntactic complexity of language, the syntax of logical form is very simple. These comments reveal the shortcoming of P(x) as the representation – it is too simple. Indeed, it seems that by focusing on a representation that is appropriate for logic, Hurford steers off the course of a behaviorally useful representational schema. Nonhuman primates likely have quite rich representations of events, their temporal structure, the individuals involved, and so forth. Constructing such representations in a neural first order predicate logic would be difficult. Indeed, the difficulty of the mapping is revealed by the quantity of effort expended in developing a theoretical basis for mapping logic to language and the meanings that can be expressed in language (e.g., Kamp & Reyle 1993; Montague 1970; 1973; Parsons 1990). I suggest that although 1-place predicates are certainly useful for representing object properties, they are inappropriate for (and do not extend in a straightforward manner to) event representations. Imagine instead that the prelinguistic representation was based on the perceptual structure of events, with ordered predicates yielding a structure something like "event(agent, object, recipient)." In this case the mapping from the mental representation to language becomes more interesting and more iconic. The distinct ordered predicates in the event representation take on specific thematic roles that are iconically reflected in regularities in word ordering and/or morphosyntactic and closed class structure in a cross-linguistic manner. 
In the example, "A man bites a dog," the representation: bite(e), man(x), dog(y), agent(x), patient(y) appears arbitrary, unordered, and less informative than bite(man, dog), in which the relations between the event and the constituent thematic roles (agent and patient) are encoded in the representation. I would thus propose that the capability to represent 1-place predicates does not extend in a useful manner to (n > 1)-place predicates for representing meaning. Characterization of the primitive mental representation. Having made this claim, one is obliged to demonstrate the psychological validity of (n > 1)-place predicates independent of language. I will approach this from the perspective of (1) observations from developmental psychology and (2) studies of automatic perceptual analysis. From the developmental perspective, one of the most salient perceptual primitives (after motion) is contact or collision between two objects (Kotovsky & Baillargeon 1998). Prelingual infants appear to represent collisions in terms of the properties of the "collider" and their influence on the "collidee." This supports (but does not prove) the hypothesis that contact is represented by a 2 (or greater)-place predicate in prelingual infants. But is the n-place predicate computationally tractable? That is, is it reasonable to assume that nonlinguistic beings can construct such representations? I have recently explored this question by developing an automated system that extracts meaning from on-line video sequences of events performed by a human experimenter in a simple setup involving manipulation of toy blocks. The objects are recognized and tracked in the video image, and physical contact between two objects is easily detected in terms of a minimum distance threshold. The agent of the contact is then determined as the one of the two participants that has a greater relative velocity toward the other in the contact. 
In this context, the event types of touch, push, give, and take can be defined as variants or types of contact events (Dominey 2002; 2003). This demonstrates that sensitivity to a simple class of perceptual event (contact) can provide the basis for a multiple ordered predicate representation of event structure. A more general demonstration of how the perception of support, contact, and attachment can be used to learn the lexical semantics of verbs is provided by Siskind (2001). The objective of developing this perceptual scene analysis system was to demonstrate the feasibility of generating meaning in an event(agent, object, recipient) format, based on the perception of physical contact. This was motivated by simulation studies of language acquisition based on the learning of mappings between grammatical structure and predicate-argument structures (Dominey 2000), which in turn were based on combined modeling and neurophysiological testing of the underlying functions (Dominey et al. 2003).

Commentary/Hurford: The neural basis of predicate-argument structure BEHAVIORAL AND BRAIN SCIENCES (2003) 26:3 291

These and subsequent studies revealed that the complexity of grammatical forms (e.g., relative phrases) corresponds to an analogous complexity in the predicate-argument representational structure. For example, in mapping the grammatical construct, "The block that pushed the triangle touched the circle," onto the representation push(block, triangle), touch(block, circle), we can observe an iconic relation between the relativized structure of the sentence and the meaning representation in which the two events share a common agent: block (Dominey 2003). With respect to the neural basis of multiple argument predicates for representing events, one possibility can be found in the F5 neuron populations described by Rizzolatti and Arbib (1998), which, when observed together, allow distinct representations for grasp(me, raisin) versus grasp(someone-else, raisin).
Thus, access to two distinct populations of these neurons allows an event representation with distinct agent and object coding.

In summary, I want to insist that Hurford's undertaking is quite valid and interesting with respect to the stated goal of investigating the neural basis of predicate-argument structure. Where it fails is in the thesis, "The structures of modern natural languages can be mapped onto these primitive representations." I hope to have argued that the required representations for events (and their description) are more complex than those described by Hurford – and that they cannot be represented by the primitive structure he describes.
ABSTRACT: In order to understand the evolutionary pathway to the capability for language, we must first clearly understand the functional capabilities that the child brings to the task of language acquisition. Behavioral studies provide insight into infants’ ability to extract statistical and distributional structure directly from the auditory signal, and their capabilities to construct relations between this structure and the structure extracted from perceptual systems. At the interface of these two processes lies a conceptual scene representation that can be accessed by both, and that importantly provides a means for the two systems to constructively interact.
Recent studies have begun to make progress in simulating infants’ capabilities to extract statistical structure (e.g. word segmentation and lexical categorization) directly from the speech sound sequence. The current research examines how this structure interacts with perceptual structure at the level of the conceptualized scene. In particular we demonstrate how the grounding of words and sentences in conceptualized visual scenes permits the system to construct the appropriate relations between words and their referents, and sentences and theirs (structured conceptualizations of scenes representing agents, objects and actions) in the initial phases of acquisition of syntactic structure. These studies simulate behavioral observations of the trajectory of infants’ linguistic acquisition of concrete nouns, followed by concrete verbs and then more abstract nouns and verbs, in parallel with the development of first simple and then more complex syntactic structures. The relevance of these results to infant language acquisition behavior will be discussed. While this research yields interesting new results in characterizing the grounding of language in conceptualized scenes, it also identifies serious limitations of the current methods that will be discussed, along with the associated future extensions.
Evolution of Communication 03/2002; 4(1):57-85. DOI:10.1075/eoc.4.1.05dom
ABSTRACT: thought is underlyingly propositional, or mentalese (Fodor 1975). The second is that there is both mentalese and LF, but only LF does the work of cross-modular thinking. That would seem to be Carruthers's position. The third is that mentalese of sufficient complexity to handle propositional attitudes would have to be virtually identical to LF (de Villiers & Pyers 1997; Segal 1998). If so, why duplicate the functions and structures? Why not assume that natural language is the medium for such thinking, especially as LF rather than inner speech? We raised that among a list of other logical possibilities for the relationship between natural language and the language of thought in this domain (de Villiers & de Villiers 2000). Does Varley's aphasic contradict this possibility? Not necessarily, because LF could (logically) be preserved but inaccessible to the phonological input and output systems for language. Carruthers uses Varley's case study to deny that language is needed synchronically for false-belief reasoning, but that is because of his commitment to two other connected notions: (1) ToM is a module and (2) LF is only needed for cross-modular thinking. He is also tempted to say that animals have mental state representations, arguing that their "long chains" of social reasoning imply propositional mentalese. This is where our behaviorist beginnings show. We haven't seen evidence from primates or younger children that would convince us to posit both propositional mentalese and LF, once you allow LF to be the medium for false belief reasoning. But Carruthers needs both if he only allows LF to be the medium of cross-modular thought. It's curious, because the arguments in favor of the subtlety of syntax and semantics needed to capture propositional attitudes seem to us so much more convincing than those needed to capture "left of the blue wall"!
Carruthers has to avoid the conclusion that false belief reasoning is dependent on language if he is to keep to the claim that it is a module. So he argues that the full theory of mind system, a module independent of language, comes on line at age four. But it is accelerated by interpreting linguistic input, which leaves us wondering what might happen in the absence of complex linguistic input. This language-independent module would then come on line at what? 5 years? 8 years? 25 years? In addition, Carruthers states that the language-independent ToM module "has to access the resources of other systems (including the language faculty) to go about its work." Why? In particular, "mind-reading ability routinely co-opts the resources of the language faculty." Is this because it is routinely cross-modal? Maddeningly, Carruthers does not specify sufficiently which false belief tasks count as which type: The only example provided is one in which the subject believes a proposition that is itself cross-modular, "that the object is to the left of the blue wall." So, it's all very well to "cry out for experimental investigation," but only if it's clear enough to test. Perhaps what Carruthers has in mind is that a person without sufficient language, say, a three-year-old, can imagine the false belief of another, and can token it in some system of thought but not explicitly deduce consequences or predict behaviors from it. So, logically, the theory of mind system could be language-independent. That is the kind of picture that Clements and Perner (1994) posit for their toddlers who look expectantly at the place a character will emerge premised on his false belief but then fail when asked the simple question, "Where will he look?" However, they argue that the children's expectancy might not be propositional at this point but behavioral (Dienes & Perner 1999). To answer explicitly, a propositional format must be developed.
Carruthers believes that the standard false belief tasks require only intra-modular thinking, hence not natural language, though maybe propositional mentalese. But in the development of such reasoning, he also admits that language plays a crucial role in input and output systems. So the difference comes down to this. Our own data are just what one would expect if the acquisition of complementation under verbs of communication and belief in language made possible the representation of the relationships between people's minds and false states of affairs, representations that were inaccessible to explicit reasoning or incomplete before. It sounds like a good idea to us to propose that something like the LF of natural language is the format for such thinking, because LF has the necessary representational richness. But we would still need to explain why LF of sufficient complexity takes time to develop. For all we know, severe aphasics might have access still to LF, but primates would not. That is not to say there are not many other subtle things that can be done (even in mind reading) without LF, and it is an exciting question to ask if such things really need propositional reasoning. Much experimental and philosophical ingenuity will be required (Dennett 1983)!

Abstract: In Carruthers's formulation, cross-domain thinking requires translation of domain specific data into a common format, and linguistic LF thus plays the role of the common medium of exchange. Alternatively, I propose a process-oriented characterization, in which there is no common representation and cross-domain thinking is rather the process of establishing mappings across domains, as in the process of analogical reasoning.
Carruthers proposes that cross-modular thinking consists of the integration of central-process modules' outputs by the language faculty to build logical form (LF) representations, which thus combine information across domains, and that "all cross-modular thinking consists in the formation and manipulation of these LF representations" (sect. 5.1, para. 7). I will argue that cross-domain thinking can occur without intervention of the language faculty. Rather, such thinking relies on a generalized cross-domain mapping capability. Interestingly, this type of mapping capability can operate across diverse domains, including the mapping required for performing the transformation from sentences to meanings in language processing. In Carruthers's formulation, cross-domain thinking requires translation of domain specific data into a common format, and linguistic LF thus plays the role of the common cross-domain medium of exchange. Alternatively, we can consider a process-oriented characterization, in which there is no common representation, and cross-domain thinking is rather the process of establishing mappings or transformations across domains, as in the process of analogical reasoning. We can gain insight into this issue of cross-domain processing from its long tradition in the sensorimotor neurosciences. Consider the problem of cross-domain coordination required for visually guided reaching to an object. The retinal image is combined with information about position of the eye in the orbit, and the orientation of the head with respect to the body to determine the position of the object in space with respect to the body. This sensory domain representation is then used to command the arm reach that should be specified in the native motor system coordinates of the individual muscles.
Interestingly, Kuperstein (1988) demonstrated that this cross-domain problem could be solved without invoking a common representation format but rather by constructing a direct mapping from sensory to motor system coordinates. Can an analogous mapping strategy be used for cross-domain thinking? In response to this question, I will illustrate a form of transformation processing for the mapping of grammatical structure in language to conceptual structure and then will demonstrate how this mapping capability extends to generalized cross-domain mapping, making this point with analogical reasoning.

Commentary/Carruthers: The cognitive functions of language BEHAVIORAL AND BRAIN SCIENCES (2002) 25:6 683

A central function of language is communicating "who did what to whom," or thematic role assignment. In this context, consider the two sentences in which the open class words are labeled.

(a) John(1) hit(2) the ball(3).
(b) It was the ball(3) that John(1) hit(2).

Both of these sentences correspond to the meaning encoded by the predicate hit(agent, object), instantiated with labels as hit(2)(John(1), ball(3)). For each sentence, the structural mapping from open class words onto event and thematic role structure in the meaning is straightforward (123–213, and 312–213 for sentences (a) and (b), respectively). The difficulty is that the particular mapping is different for different sentence types. This difficulty is resolved by the property that different sentence types have different patterns of grammatical function words (or morphemes) that can thus identify and indicate the appropriate sentence-to-meaning mapping for each sentence type. Based on this mapping/transformation characterization, we suggested that nonlinguistic cognitive sequencing tasks that require application of systematic transformations guided by "function" markers would engage language-related mapping processes.
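The idea that the pattern of function words selects the sentence-to-meaning mapping can be made concrete with a small Python sketch. It is an illustration under stated assumptions, not the model tested in the cited studies: the `FUNCTION_WORDS` set, the `CONSTRUCTIONS` table, and the function `sentence_to_meaning` are hypothetical, and the table covers only the two sentence types above. Each sentence is reduced to its closed-class pattern, and that pattern indexes the reordering of open class words into (predicate, agent, object).

```python
# Hypothetical closed-class vocabulary, sufficient for the two example sentences.
FUNCTION_WORDS = {"the", "it", "was", "that"}

# Hypothetical construction table: closed-class pattern -> positions of
# (predicate, agent, object) among the open class words, in sentence order.
CONSTRUCTIONS = {
    "_ _ the _": (1, 0, 2),              # active: John hit the ball
    "it was the _ that _ _": (2, 1, 0),  # cleft: It was the ball that John hit
}


def sentence_to_meaning(sentence):
    """Map a sentence onto (predicate, agent, object) using its function-word pattern."""
    words = sentence.lower().rstrip(".").split()
    open_class = [w for w in words if w not in FUNCTION_WORDS]
    # Replace each open class word by a slot marker to expose the pattern.
    pattern = " ".join(w if w in FUNCTION_WORDS else "_" for w in words)
    pred_i, agent_i, obj_i = CONSTRUCTIONS[pattern]
    return open_class[pred_i], open_class[agent_i], open_class[obj_i]


active = sentence_to_meaning("John hit the ball.")
cleft = sentence_to_meaning("It was the ball that John hit.")
```

Both calls yield the same triple, corresponding to hit(John, ball), although the open class words arrive in different orders. A sentence whose pattern is absent from the table raises a `KeyError`, which reflects the point that the mapping must be acquired separately for each sentence type.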
Indeed, in these tasks we observed (1) language-related ERP profiles in response to the function markers (Hoen & Dominey 2000), (2) correlations between linguistic and nonlinguistic transformation processing in aphasics (Dominey et al. 2003), and (3) transfer of training across these domains (Hoen et al. 2002). These data argue for the existence of a generalized transformation processing capability that can extend across domains and is thus a candidate for a cross-domain thinking mechanism. Within this structural mapping context, Holyoak and colleagues (Gick & Holyoak 1983) have studied the process of analogical mapping in reasoning. A classic example involves the "convergence" schema. Consider: A general must attack a fortress at the center of a town. His army is too large to approach the fort by any one of the many paths that converge on the fort. He divides his army into small groups, each converging simultaneously on the fort. Now consider: A doctor must eliminate a tumor in a patient's thorax. The doctor has a radiation beam that can destroy the tumor, but at full strength, it will destroy the intervening tissue as well. Gick and Holyoak (1983) demonstrated that subjects could use the analogical mapping to solve the radiation problem. This analogical reasoning process does not appear to rely on translation into language or a propositional representation. Rather, we can consider that it is based on mapping of the target problem onto a nonpropositional spatial image schema of the analog problem. Thus, we can consider that not all cross-modular thinking is propositional. Similarly, when physics students discover that resistor-capacitor circuits behave like the physical mass-spring systems they have studied, an analogical mapping process is triggered that yields a number of new insights. These can potentially be expressed in language, but they do not originate in any language-related format.
On the contrary, the structural properties required for analogical mapping would likely be lost in the LF conversion. A related form of nonpropositional cross-domain reasoning has been well explored in the mental models paradigm by Johnson-Laird (1996). In summary, I have the impression that Carruthers has overextended the original function of LF as an interface between language and conceptual systems. It appears implicit in Carruthers's theory that all cross-domain thinking must be propositional (or must be of the type that can be realized by the language faculty). "All" is a strong word. The cross-domain analogical mapping examples above define cases where cross-domain interaction cannot occur via a propositional LF-like data structure. Rather, these cases require mapping processes that establish the cross-domain correspondences, independent of a neutral common representation.