Chen Yu

Indiana University Bloomington · Department of Psychological and Brain Sciences

PhD

About

190
Publications
33,469
Reads
6,774
Citations


Publications (190)
Article
Toddlers learn words in the context of speech from adult social partners. The present studies quantitatively describe the temporal context of parent speech to toddlers about objects in individual real-world interactions. We show that at the temporal scale of a single play episode, parent talk to toddlers about individual objects is predominantly, b...
Preprint
Early language learning relies on statistical regularities that exist across timescales in infants’ lives. Two types of these statistical regularities are the routine activities that make up their day, such as mealtime and play, and the real-time repeated behaviors that make up the moment-by-moment dynamics of those routines. These two types of reg...
Article
Children’s ability to share attention with another person (i.e., achieve joint attention) is critical for learning about their environments in general [1-3] and supporting language and object word learning in particular [1,4-14]. While joint attention (JA) as it pertains to autism spectrum disorder (ASD) is often more...
Article
Full-text available
This study demonstrates evidence for a foundational process underlying active vision in older infants during object play. Using head-mounted eye-tracking and motion capture, looks to an object are shown to be tightly linked to and synchronous with a stilled head, regardless of the duration of gaze, for infants 12 to 24 months of age. Despite being...
Article
This research takes a dyadic approach to study early word learning and focuses on toddlers’ (N = 20, age: 17–23 months) information seeking and parents’ information providing behaviors and the ways the two are coupled in real‐time parent–child interactions. Using head‐mounted eye tracking, this study provides the first detailed comparison of childr...
Preprint
Infants learn the meaning of words from accumulated experiences of real-time interactions with their caregivers. To study the effects of visual sensory input on word learning, we recorded infants' view of the world using head-mounted eye trackers during free-flowing play with a caregiver. While playing, infants were exposed to novel label-object ma...
Article
Full-text available
Multimodal exploration of objects during toy play is important for a child’s development and is suggested to be abnormal in children with autism spectrum disorder (ASD) due to either atypical attention or atypical action. However, little is known about how children with ASD coordinate their visual attention and manual actions during toy play. The c...
Article
Social interactions provide a crucial context for early learning and cognitive development during infancy. Action prediction—the ability to anticipate an observed action—facilitates successful, coordinated interaction and is an important social-cognitive skill in early development. However, current knowledge about infant action prediction comes lar...
Conference Paper
Full-text available
To acquire the meaning of a verb, language learners not only need to find the correct mapping between a specific verb and an action or event in the world, but also infer the underlying relational meaning that the verb encodes. Most verb naming instances in naturalistic contexts are highly ambiguous as many possible actions can be embedded in the sa...
Conference Paper
Full-text available
People have foveated vision and thus are generally able to attend to just a single object within their field of view at a time. Our goal is to learn a model that can automatically identify which object is being attended, given a person's field of view captured by a first person camera. This problem is different from traditional salient object detec...
Chapter
In this chapter, we introduce recent research using head-mounted eye-trackers to record sensory-motor behaviors at a high resolution and examine parent-child interactions at a micro-level. We focus on one important research topic in early social and cognitive development: how young children and their parents coordinate their visual attention in soc...
Preprint
Full-text available
Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences. Researchers in cognitive science and developmental psychology have built formal models that implement in-principle learning algorithms, and then used pre-selected and pre-cleaned datasets to test the abi...
Preprint
Statistical learning is an active process wherein information is actively selected from the learning environment. As current information is integrated with existing knowledge, it shapes attention in subsequent learning, placing biases on which new information will be sampled. One statistical learning task that has been studied recently is cross-sit...
Conference Paper
Full-text available
Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences. Researchers in cognitive science and developmental psychology have built formal models that implement in-principle learning algorithms, and then used pre-selected and pre-cleaned datasets to test the abi...
Conference Paper
Full-text available
Human infants have the remarkable ability to learn the associations between object names and visual objects from inherently ambiguous experiences. Researchers in cognitive science and developmental psychology have built formal models that implement in-principle learning algorithms, and then used preselected and pre-cleaned datasets to test the abil...
Conference Paper
Sustained attention (SA) is a critical skill in which a child is able to maintain visual attention to an object or stimulus. The current study employs head-mounted eye trackers to study the cognitive processes underlying SA by analyzing micro-level behaviors during parent-child social interactions in both typically and atypically developing childre...
Article
Coordinated attention between children and their parents plays an important role in their social, language, and cognitive development. The current study used head‐mounted eye‐trackers to investigate the effects of children's prelingual hearing loss on how they achieve coordinated attention with their hearing parents during free‐flowing object play....
Preprint
Full-text available
Due to the foveated nature of the human vision system, people can focus their visual attention on a small region of their visual field at a time, which usually contains only a single object. Estimating this object of attention in first-person (egocentric) videos is useful for many human-centered real-world applications such as augmented reality app...
Conference Paper
Full-text available
Toddlers and their parents achieve joint attention in many different social contexts. In some contexts, parents follow toddlers’ attention; in other contexts, toddlers follow parents. Using a dual head-mounted eye-tracking paradigm and microlevel analyses of behavior, we examined the sensorimotor properties of parent-toddler joint attention both in...
Article
Full-text available
Researchers from the complex dynamical systems perspective seek their explanations of human behavior and development in the dynamical interactions across many levels in an active, situated individual. That is to say, behavior and development are both constraining and constrained by the continuous exchange between a myriad of processes distributed a...
Conference Paper
Developmental theory considers action prediction as one of several processes involved in determining how infants come to perceive and understand social events (Gredebäck & Daum, 2015). Action prediction is observed from early in life and is considered an important social-cognitive skill. However, knowledge about infant action prediction is limited...
Preprint
Full-text available
Inspired by the remarkable ability of the infant visual learning system, a recent study collected first-person images from children to analyze the 'training data' that they receive. We conduct a follow-up study that investigates two additional directions. First, given that infants can quickly learn to recognize a new object without much supervision...
Article
Children's attentional state during parent-child interactions is important for word learning. The current study examines the real-time attentional patterns of toddlers with and without hearing loss (N = 15, age range: 12–37 months) in parent-child interactions. High-density gaze data recorded from head-mounted eye-trackers were used to investigate t...
Preprint
What we learn about the world is affected by the input we receive. Many extant category learning studies use uniform distributions as input in which each exemplar in a category is presented the same number of times. Another common assumption on input used in previous studies is that exemplars from the same category form a roughly normal distributio...
Article
The present research studied children in the second year of life (N = 29, M age = 21.14 months, SD = 2.64 months) using experimental manipulations within and between subjects to show that responsive parental influence helps children have more frequent sustained object holds with fewer switches between objects compared to when parents are either not...
Article
Full-text available
Understanding communications across disciplines is critical to the promotion of interdisciplinary innovation. Using research into the psychological concept of joint attention (JA) as an example of shared interest, this paper focuses on studies of JA across three different domains (child psychology, robotics, and human–computer interaction) and anal...
Article
Parent–child interactions are multimodal, often involving coordinated exchanges of visual and auditory information between the two partners. The current work focuses on the effect of children's hearing loss on parent–child interactions when parents and their toddlers jointly played with a set of toy objects. We compared the linguistic input receive...
Article
Object names are a major component of early vocabularies and learning object names depends on being able to visually recognize objects in the world. However, the fundamental visual challenge of the moment‐to‐moment variations in object appearances that learners must resolve has received little attention in word learning research. Here we provide th...
Chapter
Word learning happens in everyday contexts with many words and many potential referents for those words in view at the same time. It is challenging for young learners to find the correct referent upon hearing an unknown word at the moment. This problem of referential uncertainty has been deemed the crux of early word learning (Quine, 1960). Rece...
Article
The uncertainty of reference has long been considered a key challenge for young word learners. Recent studies of head camera wearing toddlers and their parents during object play have revealed that from toddlers' views, the referents of parents' object naming are often visually quite clear. Although these studies have promising theoretical implicat...
Conference Paper
Full-text available
Real-world learning systems have practical limitations on the quality and quantity of the training datasets that they can collect and consider. How should a system go about choosing a subset of the possible training examples that still allows for learning accurate, generalizable models? To help address this question, we draw inspiration from a high...
Article
Full-text available
Parents support and scaffold more mature behaviors in their infants. Recent research suggests that parent–infant joint visual attention may scaffold the development of sustained attention by extending the duration of an infant’s attention to an object. The open question concerns the parent behaviors that occur within joint-attention episodes and su...
Article
Young children's visual environments are dynamic, changing moment-by-moment as children physically and visually explore spaces and objects and interact with people around them. Head-mounted eye tracking offers a unique opportunity to capture children's dynamic egocentric views and how they allocate visual attention within those views. This protocol...
Article
Full-text available
Human infants interact with the environment through a growing and changing body and their manual actions provide new opportunities for exploration and learning. In the current study, a dynamical systems approach was used to quantify and characterize the early motor development of limb effectors during bouts of manual activity. Many contemporary the...
Conference Paper
The recent availability of lightweight, wearable cameras allows for collecting video data from a "first-person" perspective, capturing the visual world of the wearer in everyday interactive contexts. In this paper, we investigate how to exploit egocentric vision to infer multimodal behaviors from people wearing head-mounted cameras. More specifical...
Article
Full-text available
Vocabulary differences early in development are highly predictive of later language learning as well as achievement in school. Early word learning emerges in the context of tightly coupled social interactions between the early learner and a mature partner. In the present study, we develop and apply a novel paradigm—dual head‐mounted eye tracking—to...
Article
Understanding the formation of interdisciplinary research (IDF) is critically important for the promotion of interdisciplinary development. In this paper, we adopt extracted keywords to investigate the features of interdisciplinarity development, as well as the distinct roles that different participating domains play in various periods, and detect...
Article
Full-text available
New efforts are using head cameras and eye-trackers worn by infants to capture everyday visual environments from the point of view of the infant learner. From this vantage point, the training sets for statistical learning develop as the sensorimotor abilities of the infant develop, yielding a series of ordered datasets for visual learning that diff...
Conference Paper
Full-text available
Recent advances in wearable camera technology have led many cognitive psychologists to study the development of the human visual system by recording the field of view of infants and toddlers. Meanwhile, the vast success of deep learning in computer vision is driving researchers in both disciplines to aim to benefit from each other's understanding....
Conference Paper
Full-text available
Toddlers quickly learn to recognize thousands of everyday objects despite the seemingly suboptimal training conditions of a visually cluttered world. One reason for this success may be that toddlers do not just passively perceive visual information, but actively explore and manipulate objects around them. The work in this paper is based on the idea...
Article
Being able to learn word meanings across multiple scenes consisting of multiple words and referents (i.e., cross-situationally) is thought to be important for language acquisition. The ability has been studied in infants, children, and adults, and yet there is much debate about the basic storage and retrieval mechanisms that operate during cross-si...
Conference Paper
Full-text available
Sustained visual attention is crucial to many developmental outcomes. We demonstrate that, consistent with the developmental systems view, sustained visual attention emerges from and is tightly tied to sensory motor coordination. We examined whether changes in manual behavior alter toddlers' eye gaze by giving one group of children heavy toys that...
Conference Paper
Full-text available
Human interaction involves the organization of a collection of sensorimotor systems across space and time. The study of how coordination develops in child-parent interaction has primarily focused on understanding the development of specific coordination patterns from individual modalities. However, less work has taken a systems view and investigate...
Article
Objects in the world usually have names at different hierarchical levels (e.g., beagle, dog, animal). This research investigates adults' ability to use cross-situational statistics to simultaneously learn object labels at individual and category levels. The results revealed that adults were able to use co-occurrence information to learn hierarchica...
Article
Full-text available
The present article shows that infant and dyad differences in hand-eye coordination predict dyad differences in joint attention (JA). In the study reported here, 51 toddlers ranging in age from 11 to 24 months and their parents wore head-mounted eye trackers as they played with objects together. We found that physically active toddlers aligned thei...
Article
Toddlers learn object names in sensory rich contexts. Many argue that this multisensory experience facilitates learning. Here, we examine how toddlers’ multisensory experience is linked to another aspect of their experience associated with better learning: the temporally extended nature of verbal discourse. We observed parent–toddler dyads as they...
Article
Full-text available
We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head camera video captured by 8.5- to 10.5-month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered w...
Article
Forensic evidence often involves an evaluation of whether two impressions were made by the same source, such as whether a fingerprint from a crime scene has detail in agreement with an impression taken from a suspect. Human experts currently outperform computer-based comparison systems, but the strength of the evidence exemplified by the observed d...
Article
Natural language environments usually provide structured contexts for learning. This study examined the effects of semantically themed contexts—in both learning and retrieval phases—on statistical word learning. Results from 2 experiments consistently showed that participants had higher performance in semantically themed learning contexts. In contr...
Article
Two experiments were conducted to examine adult learners' ability to extract multiple statistics in simultaneously presented visual and auditory input. Experiment 1 used a cross-situational learning paradigm to test whether English speakers were able to use co-occurrences to learn word-to-object mappings and concurrently form object categories base...
Conference Paper
Infants learn their first object names by linking heard names to scenes. A core theoretical problem is how infants select the right referent from cluttered and ambiguous scenes. Here we show how the distributional properties of objects in young infants' visual experiences may help solve this core problem in early word learning. Infant perspective s...
Conference Paper
Full-text available
During early visual development, the infant's body and actions both create and constrain the experiences on which the visual system grows. Evidence on early motor development suggests a bias for acting on objects with the eyes, head, trunk, hands, and object aligned at midline. Because these sensory-motor bodies structure visual input, they may als...
Conference Paper
Many previous studies have shown that both infants and adults are skilled statistical learners. Because statistical learning is affected by attention, learners' ability to manage their attention can play a large role in what they learn. However, it is still unclear how learners allocate their attention in order to gain information in a visual envir...
Conference Paper
Full-text available
Humans, as social beings, are capable of employing various behavioral cues, such as gaze, speech, manual action, and body posture, in everyday communication. However, extracting fine-grained interaction patterns in social contexts has presented methodological challenges. Cross-Recurrence Plot Quantification Analysis (CRQA) is an analysis...
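The core object of CRQA as described in this abstract can be illustrated with a minimal sketch: a binary cross-recurrence matrix marks every pair of time points at which two categorical behavior streams match (e.g., both partners attending to the same object). The function and variable names below are illustrative assumptions, not the paper's implementation, and real CRQA toolkits compute many further measures (diagonal-line statistics, lagged profiles) on top of this matrix.

```python
def cross_recurrence(a, b):
    """Binary cross-recurrence matrix for two categorical time series.

    R[i][j] = 1 when series `a` at time i matches series `b` at time j,
    e.g. child and parent gazing at the same toy.
    """
    return [[1 if ai == bj else 0 for bj in b] for ai in a]

def recurrence_rate(r):
    """Fraction of matching cells: overall coupling between the series."""
    return sum(sum(row) for row in r) / (len(r) * len(r[0]))

# Hypothetical gaze streams coded frame-by-frame.
child  = ["toy1", "toy1", "face", "toy2"]
parent = ["toy1", "face", "face", "toy2"]
r = cross_recurrence(child, parent)
print(recurrence_rate(r))  # → 0.3125
```

The diagonal of `r` captures simultaneous matching (joint attention at the same moment), while off-diagonal structure captures one partner leading or following the other at a lag.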
Conference Paper
Child-directed speech is often temporally organized such that successive utterances refer to the same topic. This type of extended discourse on the same referent has been shown to possess several verbal signatures that could facilitate learning. Here, we reveal multiple non-verbal correlates to extended discourse that could also aid learning. Multi...
Conference Paper
Full-text available
Early visual object recognition in a world full of cluttered visual information is a complicated task at which toddlers are incredibly efficient. In their everyday lives, toddlers constantly create learning experiences by actively manipulating objects and thus self-selecting object views for visual learning. The work in this paper is based on the h...
Article
Full-text available
We focus on a fundamental looking behavior in human-robot interactions-gazing at each other's face. Eye contact and mutual gaze between two social partners are critical in smooth human-human interactions. Therefore, investigating at what moments and in what ways a robot should look at a human user's face as a response to the human's gaze behavior i...
Article
The ability to sustain attention is a major achievement in human development and is generally believed to be the developmental product of increasing self-regulatory and endogenous (i.e., internal, top-down, voluntary) control over one's attention and cognitive systems [1-5]. Because sustained attention in late infancy is predictive of future develo...
Article
Joint attention has been extensively studied in the developmental literature because of overwhelming evidence that the ability to socially coordinate visual attention to an object is essential to healthy developmental outcomes, including language learning. The goal of this study was to understand the complex system of sensory-motor behaviors that m...
Article
Full-text available
Prior research has shown that people can learn many nouns (i.e., word-object mappings) from a short series of ambiguous situations containing multiple words and objects. For successful cross-situational learning, people must approximately track which words and referents co-occur most frequently. This study investigates the effects of allowing some...
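The co-occurrence tracking that this abstract describes (learners approximately tallying which words and referents appear together across ambiguous situations) can be sketched as a simple associative baseline. This is an illustrative toy model under assumed data structures, not the study's actual learning model or its experimental materials.

```python
from collections import defaultdict

def cross_situational_learn(situations):
    """Tally word-object co-occurrences across ambiguous situations.

    Each situation is a (words, objects) pair; the learner does not know
    which word maps to which object, so every pairing is counted.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for words, objects in situations:
        for w in words:
            for o in objects:
                counts[w][o] += 1
    # A word's best guess is the object it co-occurred with most often.
    return {w: max(objs, key=objs.get) for w, objs in counts.items()}

# Three hypothetical scenes, each with two words and two referents.
situations = [
    (["ball", "dog"], ["BALL", "DOG"]),
    (["ball", "cup"], ["BALL", "CUP"]),
    (["dog", "cup"], ["DOG", "CUP"]),
]
print(cross_situational_learn(situations))
# → {'ball': 'BALL', 'dog': 'DOG', 'cup': 'CUP'}
```

No single scene disambiguates any word, yet the correct mappings emerge from aggregate co-occurrence frequencies, which is the basic logic the cross-situational learning literature tests.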
Conference Paper
Full-text available
Hands appear very often in egocentric video, and their appearance and pose give important cues about what people are doing and what they are paying attention to. But existing work in hand detection has made strong assumptions that work well in only simple scenarios, such as with limited interaction with other people or in lab settings. We develop m...
Conference Paper
Full-text available
Wearable devices are becoming part of everyday life, from first-person cameras (GoPro, Google Glass), to smart watches (Apple Watch), to activity trackers (FitBit). These devices are often equipped with advanced sensors that gather data about the wearer and the environment. These sensors enable new ways of recognizing and analyzing the wearer's eve...
Conference Paper
Natural linguistic environments usually provide structured input, in that words that are semantically-related are likely to occur in the same situation. The current study examined whether this kind of semantically-themed structure facilitated cross-situational word learning. Results from two experiments consistently showed that participants had hig...
Article
Full-text available
In the word-learning domain, both adults and young children are able to find the correct referent of a word from highly ambiguous contexts that involve many words and objects by computing distributional statistics across the co-occurrences of words and referents at multiple naming moments (Yu & Smith, 2007; Smith & Yu, 2008). However, there is stil...
Conference Paper
Full-text available
The current study investigated eye-hand coordination in natural reaching. We asked whether the speed of reaching related to the quality of visual information obtained by young children and adults. Participants played with objects on a table while their eye and hand movements were recorded. We developed new techniques to find reaching events in natu...
Article
An understanding of human collaboration requires a level of analysis that concentrates on sensorimotor behaviors in which the behaviors of social partners continually adjust to and influence each other. A suite of individual differences in partners' ability to both read the social cues of others and to send effective behavioral cues to others creat...
Article
Full-text available
Head-mounted video cameras (with and without an eye camera to track gaze direction) are being increasingly used to study infants’ and young children's visual environments and provide new and often unexpected insights about the visual world from a child's point of view. The challenge in using head cameras is principally conceptual and concerns the m...
Conference Paper
Full-text available
Cross-situational learning, the ability to learn word meanings across multiple scenes consisting of multiple words and referents, is thought to be an important tool for language acquisition. The ability has been studied in infants, children, and adults, and yet there is much debate about the basic storage and retrieval mechanisms that operate durin...
Conference Paper
Full-text available
The process of learning a language requires that long-term memory stores the meanings of thousands of words encountered across a variety of situations. These word meanings form a network of associations that, influenced by environmental factors such as word frequency and contextual diversity, cause behavioral effects on measures such as lexical...
Conference Paper
Full-text available
Understanding visual attention in children could yield insight into how the visual system develops during formative years and how children's overt attention plays a role in development and learning. We are particularly interested in the role of hands and hand activities in children's visual attention. We use head-mounted cameras to collect egocentr...
Conference Paper
Full-text available
Learning and interaction are viewed as two related but distinct topics in developmental robotics. Many studies focus solely on either building a robot that can acquire new knowledge and learn to perform new tasks, or designing smooth human-robot interactions with pre-acquired knowledge and skills. The present paper focuses on linking language learn...
Conference Paper
Full-text available
Egocentric cameras are becoming more popular, introducing increasing volumes of video in which the biases and framing of traditional photography are replaced with those of natural viewing tendencies. This paradigm enables new applications, including novel studies of social interaction and human development. Recent work has focused on identifying th...
Article
Full-text available
ExpertEyes is a low-cost, open-source package of hardware and software that is designed to provide portable high-definition eyetracking. The project involves several technological innovations, including portability, high-definition video recording, and multiplatform software support. It was designed for challenging recording environments, and all p...