Applied Psycholinguistics, page 1 of 16, 2016
Early productive vocabulary predicts
academic achievement 10 years later
Aarhus University
University of Southern Denmark
University of New Mexico
Aarhus University
University of Southern Denmark
Received: February 18, 2015 Accepted for publication: December 24, 2015
Dorthe Bleses, TrygFonden’s Centre for Child Research and School of Communication and Culture,
Fuglesangs Allé 4, Building 2630, Aarhus 8210, Denmark. E-mail:
We use a longitudinal design to examine associations for a diverse sample of 2,120 Danish 16- to
30-month-old children between early expressive vocabulary and later reading and math outcomes in
the sixth grade. Educational outcomes, in particular decoding and reading comprehension, can be
predicted from an early vocabulary measure as early as 16 months with effect sizes (in proportion of
variance accounted for) comparable to 1 year’s mean growth in reading scores. The findings confirm
in a relatively large population-based study that late talkers are at risk for low educational attainment
because the majority of children experiencing early language delay obtain scores below average in
measures of reading in the sixth grade. Low scores have the greatest predictive power, indicating that
children with early delays have elevated risk for later reading problems.
The development of reliable and valid parent-report measures of child language
has facilitated many advances in basic research and clinical practice. The two
instruments of this type that are most widely used are Rescorla’s Language Development
Survey (Rescorla, 1989, 2013) and the MacArthur–Bates Communicative
Development Inventories (CDI; Fenson et al., 2007). They take advantage of the
much greater evidence available to parents and the fact that it is more representative
of the child’s full experience and language competence. Both of these instruments,
but especially the CDI, have been adapted to numerous languages as diverse as
© Cambridge University Press 2016 0142-7164/16
Applied Psycholinguistics 2
Bleses et al.: Early vocabulary and later achievement in school
Spanish, Danish, Russian, Chinese, American Sign Language, and others (Dale,
Penfold, & Fenson, 2011; see
These measures have many potential clinical and research uses, such as screen-
ing for early delay, identifying children with unusual profiles, and examination of
the influence of other biological and behavioral variables on language develop-
ment (Fenson et al., 2007). Almost all of the potential uses of these early language
measures rest on the assumption that individual differences have a long-term
significance. For that reason, they can help identify children who are at elevated risk
later in life and therefore merit early intervention; within the normal range,
they provide illuminating evidence for the sources of later individual differences
in a range of skills and therefore contribute to theoretical understanding of those skills.
The focus of the present paper is on predictions from measures of vocabulary
obtained between 16 and 30 months of age to language, literacy, and mathematics
outcomes in the later school years. The term “predictions” here refers both to
correlational and regression examination of relationships across the full range of
ability and to the outcomes of early delay. It is only fairly recently that we have had
valid measures for this early period that can be obtained for large samples of young
children and that enable examination of potential prediction from such early ages.
This study’s restriction to school-age outcomes reflects the fact that there is al-
ready a substantial body of research on longitudinal relations within the preschool
period (for reviews, see Fenson et al., 2007; Law & Roy, 2008; Lee, 2011). For
example, Lee (2011) conducted a secondary analysis of data from the National
Institute of Child Health and Human Development Early Child Care Research
Network (2005). Parent-reported 24-month vocabulary on the CDI: Words and
Sentences was a moderately strong predictor (r = .36) of the Preschool Language
Scales auditory comprehension and expressive communication subscales
at 54 months. Hayiou-Thomas, Dale, and Plomin (2012) analyzed parent-report
measures (short-form adaptations of the CDI, providing a composite vocabulary–
grammar measure) obtained at 2, 3, and 4 years in the Twins Early Development
Study. Predictions from 2 to 3, from 3 to 4, and from 2 to 4, were .59, .63, and
.48, respectively. Bornstein, Hahn, Putnick, and Suwalsky (2014) observed corre-
lations in a similar range (.25–.49) from 20-month vocabulary to several 48-month
language measures. With respect to the outcomes of early delay, defined by low
scores relative to norms, a substantial body of research (see Rescorla & Dale, 2013,
for several reviews of this literature) has confirmed great variability in outcome.
Although these “late talkers” (child’s vocabulary falling within the lowest 10%
of children at that age) are at considerable risk for later problems, many of them,
perhaps half, will have moved into the normal range by the end of the preschool
period, at least on measures of vocabulary and grammar. However, performance
often remains below the mean even after this “catch-up,” and those children are
therefore at continuing risk, though no greater than other children in the low normal
range at the same age (Dale, McMillan, Hayiou-Thomas, & Plomin, 2014).
In contrast, much less evidence is available for predictions from early measures
into the school years. Three studies have reported correlations across this time
span for individual differences in oral language with large, representative English-
speaking samples. Lee (2011) found that 24-month vocabulary predicted Grades
1, 3, and 5 Woodcock–Johnson Psycho-Educational—Revised (WJ-R) picture vo-
cabulary scores at .27, .27, and .25, respectively. Hayiou-Thomas et al. (2012)
found that their age 2 vocabulary–grammar composite predicted 10-year vocabu-
lary and 12-year vocabulary, syntax, figurative language, and making inferences at
.21, .18, .17, .20, and .12, respectively. Predictions from age 3 were only slightly
higher. Bornstein et al. (2014) found correlations in the range of .20–.33 from
20-month vocabulary to age 10 verbal intelligence measures. In addition, Hohm,
Jennen-Steinmetz, Schmidt, and Laucht (2007) found in a small sample of German
children that scores on the Receptive–Expressive Emergent Language Scale at 10
months were significantly related to several measures of language performance
(including test of vocabulary, sound blending, spelling, and a sentence repetition
test) and grades at age 11 years, but almost exclusively for girls (median r = .46;
for boys, median r = .10). The sample for this study was constructed to have high
variability in risk.
One of the most important reasons for concern about early language is its poten-
tial significance for the acquisition of reading, which in turn is the most important
determinant of educational success. Reading skills are substantially based on oral
language skills, especially but not only vocabulary (National Early Literacy Panel,
2008). However, as noted by Lee (2011), few studies of the emergence of literacy
have extended beyond the period from preschool or kindergarten through second
grade. In Lee’s study, early vocabulary predicted third grade WJ-R letter–word
identification, word attack, and passage comprehension at .22, .12, and .28, respec-
tively; at fifth grade, the correlations for letter–word identification and passage
comprehension were .23 and .25. Within the Twins Early Development Study,
Harlaar, Hayiou-Thomas, Dale, and Plomin (2008) examined predictions from
early language measures to teacher ratings of reading ability, using a nationwide
evaluation rubric based on the UK National Curriculum goals for literacy. Lan-
guage scores at 2 years predicted reading at ages 7, 9, and 10 at .23, .26, and .29,
respectively. Again, prediction from age 3 was only slightly higher.
Few studies of late talkers have followed children into the school years, and
they are all studies of small samples. The most comprehensive long-term follow-up
of children identified as late talkers below the age of 2 is Rescorla’s (2013)
longitudinal study. Children identified as late talkers at 24–31 months were com-
pared at later ages with a sample matched for age, socioeconomic status (SES),
and nonverbal ability (N=40 and 39, respectively). They were assessed at ages
8, 9, 13, and 17 on language, cognitive, and academic achievement. In most cases,
the two groups differed significantly, with effect sizes generally in the range of
0.5 to 1.0, though sometimes even larger. Of note, later differences were observed
for all aspects of language assessed: vocabulary, grammar, narratives, and verbal
Rescorla’s study also provides the only long-term outcome information on
literacy in late talkers. Late talkers scored lower at age 8 on a composite of WJ-
R letter–word identification and passage comprehension, and on a measure of
reading comprehension but not decoding at age 13. There were no differences
on the reading measures at age 17 despite the differences that remained on oral
language measures. Rescorla notes that very few of the late talkers scored below
the 10th percentile on any of the language or reading measures at age 17.
Taken together, the limited body of evidence presented above suggests that mea-
sures as early as the third year of life have continuing predictive significance into
the second decade, though there are some exceptions to this result. Nevertheless,
there are some specific and important gaps in the present knowledge base. First,
correlational studies of the entire distribution are, to our knowledge and with the
exception of Hohm et al.’s study of German children, available only for English, and
mainly for White US, UK, or Australian populations, with exceptions such as
the National Institute of Child Health and Human Development study (National
Institute of Child Health and Human Development Early Child Care Research
Network, 2005). Languages with different structures, such as greater versus lesser
morphological complexity, might show differences in patterns of prediction, as
might societies with differing amounts of SES variability or different daycare or
school systems. Second, the primary analytic methods used have been correla-
tions and regressions, which assume a linear relationship for the prediction. It
would be useful to know if the early measures are more sensitive predictors of
outcomes in particular parts of the scale. Studies of late talkers are most often
based on relatively small clinically referred samples, or children parent referred as
late talkers from the general population. This recruitment process can miss many
children, perhaps especially those with a different profile. Another limitation to
this design is that there is generally not an explicit comparison with the outcomes
of other groups, such as the next 10%, the median 10%, or the top 10%. A few
studies, however, such as Hayiou-Thomas et al. (2012), identify late talkers using
a statistical criterion (usually the lowest 10%) within a population-representative
sample. Such a design has the power to overcome these limitations. Unlike the
overall correlational studies, late talker research has been conducted in several
languages, but none of them extend beyond the preschool period.
Studies of individual differences in early language development have consis-
tently found small gender differences favoring girls (Eriksson et al., 2012). Little
is known about gender differences in the stability of language and literacy, given
the absence of longitudinal studies with adequate sample size to examine gender differences.
The present study is designed to address some of these limitations by obtaining
follow-up information in sixth grade (approximately 12 years of age) on a large and
diverse sample of Danish children who were assessed in the Danish CDI norming
study (Bleses et al., 2008). The Danish National Test system was implemented in
the public schools in Denmark in 2010. The test system includes 10 mandatory tests
in six subjects, administered individually to children from second grade through
eighth grade, and offers a unique opportunity to follow academic achievement of
Danish children through the period of compulsory schooling.
The Danish context is a particularly interesting one for the study of longi-
tudinal relationships. More than 95% of 3- to 5-year-olds are in daycare (http:// Childcare in Denmark (the
term preschool is generally not used) has traditionally, and uniformly, focused
on supporting children’s social skills rather than early academic skills. These
facts can be seen as specific aspects of the lower degree of SES variability in
Denmark compared to the United States overall (Organization for Economic
Cooperation and Development [OECD], 2014). According to 2015 data from
the OECD Centre for Opportunity and Equality (
inequality.htm), Denmark has the lowest Gini coefficient among OECD countries,
and the lowest income difference ratio of 5.2 between the richest 10% and
poorest 10% (cf. OECD mean = 9.6; United States = 18.8). This lower degree of
SES variability may entail less environmental variability and influence on individ-
ual differences overall during childhood. It is possible that longitudinal prediction
might be stronger in this context. However, there have been no published longitu-
dinal studies of Danish children from the early stages into the school years.
Finally, Danish is for other reasons a good comparison to the comprehensive
English literature because Danish and English are similar in important ways. Both
Danish and English are Germanic languages. In addition, like English, Danish has
a somewhat deep orthography, mainly due to substantial changes in the spoken
language that date back to the early middle ages and to a substantial influx of
foreign words and their spelling from other languages (Elbro, 2006). Furthermore,
initial reading in Danish is traditionally taught by means of a variety of instructional
methods, such as the whole-word approach, contextual cues, phonics, and easy
book reading (Elbro, 2006).
We addressed the following research questions:
1. How well does early vocabulary predict later educational outcomes, including
language, the closely related skills of literacy, and the most distantly related skills
of mathematics?
2. What is the role of SES? (a) Does prediction from early vocabulary actually
reflect the influence of SES and (b) do SES differences moderate the prediction
so that the correlations vary by SES levels?
3. How do those predictions vary with the age and gender of the child at the initial assessment?
4. Is there a nonlinearity in sensitivity, that is, regions where small CDI differences
make larger differences in prediction than in other regions?
5. What are the educational outcomes of the delayed group, and how do they com-
pare with the outcomes for the median 10% and the top 10% of early vocabulary?
Children in the current study were part of the norming study of the Danish version
of the CDI (Bleses et al., 2008), conducted in 2002 and 2003. Inclusion criteria
were Danish citizenship; born in Denmark; attaining the exact age of 8, 9,
10, ..., or 30 months in the first week of June 2002; and no reported speech,
hearing, or other serious (chronic) health problems. Parents of 3,714 children be-
tween 16 months and 36 months, all native speakers of Danish, completed the CDI
report and a background questionnaire addressing child and parent characteristics.
The sample was balanced with respect to gender (51% girls, 49% boys), but the
educational and occupational level of the parents was somewhat higher than the
national average (Bleses et al., 2008). For this paper, we restricted the sample to
children between 16 and 30 months (N=2,863), both for comparability to other
CDI projects, which generally target that age range, and because the sample was
smaller at the ages above 30 months.
The Danish Ministry of Education has made National Danish test data avail-
able for researchers. In the CDI norming study, we had not registered the civil
registration number (CPR number) that all children are assigned at birth by the
Central Office of Civil Registration. However, by using child and parent names,
with the help of the Central Office of Civil Registration, via CPR numbers, we
were able to link the original CDI data for 2,120 children to Danish National Test
scores in reading and math. Even though there is some attrition in the National
Test data set (between 2.4% and 3.4% for the National Test in reading and math;
cf. Beuchert & Nandrup, 2014), the attrition in our study was due to insufficient
information on child and parent name, which made it impossible to match child
name with the CPR. The resulting sample is balanced with respect to gender (51%
girls and 49% boys), and the imbalance of SES measures of the original sample is
maintained. Compared to all Danish children 1–3 years of age based on a sample
obtained from Statistics Denmark in 2003, our sample includes a lower percentage
of parents having obtained only a basic education (primary school or high school;
4% vs. 28%) or a short vocational education (e.g., electrician; 45% vs. 52%).
Conversely, a higher percentage of parents in our sample had a medium education
(e.g., teacher; 30% vs. 11%) or a high education such as an advanced university
education (21% vs. 9%).
Assessing early expressive vocabulary. Expressive vocabulary skills were as-
sessed using the Danish adaptation (Bleses et al., 2008) of the CDI (Fenson
et al., 2007). The Danish CDI: Words and Sentences includes a 725-word vocabulary
checklist, which is organized into 22 semantic categories such as sound
effects and animal sounds, nouns (animals, vehicles, and toys), verbs, function
words, and so on. Parents mark which words they have heard their child produce.
The vocabulary checklist was used to generate a measure of the total number
of early words produced by a child. The reliability and validity of the original
CDI have been demonstrated in numerous studies, summarized in Fenson
et al. (2007) and Law and Roy (2008). Measures of internal consistency and
test–retest reliability for the CDI: Words and Sentences are generally above 0.9,
and measures of concurrent validity with examiner-administered measures are
generally between 0.7 and 0.9. For the validity of the Danish CDI, see Bleses
et al. (2008). Consistent with most other research on late talkers (Rescorla & Dale,
2013), we have defined early delay as the child’s vocabulary falling within the
lowest 10% of children at that age.
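The delay criterion above can be illustrated with a minimal sketch, assuming that the cutoff is the 10th percentile of vocabulary scores computed among children of the same age in months. The function name, the toy data, and the use of an inclusive cutoff are assumptions made for illustration; the study itself may have applied the criterion through the published norms.

```python
import numpy as np

def flag_early_delay(ages_months, vocab_scores, cutoff_pct=10.0):
    """Mark children at or below the cutoff percentile of vocabulary
    among children of the same age (hypothetical helper)."""
    ages = np.asarray(ages_months)
    scores = np.asarray(vocab_scores, dtype=float)
    delayed = np.zeros(scores.shape, dtype=bool)
    for age in np.unique(ages):
        same_age = ages == age
        # Percentile is computed within the age group, not across the sample.
        threshold = np.percentile(scores[same_age], cutoff_pct)
        delayed[same_age] = scores[same_age] <= threshold
    return delayed

# Hypothetical toy data: 20 children at each of two ages.
ages = np.repeat([16, 24], 20)
scores = np.concatenate([np.arange(1, 21), np.arange(101, 121) * 2])
delayed = flag_early_delay(ages, scores)
```

Because the threshold is recomputed at each age, a 24-month-old is compared only with other 24-month-olds, which keeps the delayed group at roughly 10% within every age band.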
Assessing reading and math skills. The Danish National Tests, administered at
the end of the school year, are web based (i.e., students take the test online at a
computer), objective (i.e., the system chooses the items and calculates the results),
and adaptive (i.e., based on a Rasch model, each student is presented with a se-
quence of items of different difficulty levels based on performance on the previous
items where the difficulty level of the items and the pupil ability is measured on the
same logit scale). The items are multiple-choice questions with a varying number
of options. The Danish National Tests in reading evaluate students’ ability within
three areas: language comprehension (semantics of individual words, homonyms,
language use, and idioms), decoding (word identification in concatenated words
and word reading), and reading comprehension (comprehension of written texts).
All tasks within the three areas require some text reading, but for language com-
prehension, a picture task without text was used as well. The Danish National
Tests in math evaluate students’ ability in numbers and algebra, geometry, and
mathematics in use, but we analyzed only the numbers and algebra score in the
present paper because these are symbolic systems, which seem more comparable
to language than geometry or mathematics in use. This test evaluates students’
knowledge of integers, decimals, and fractions, and their skills in arithmetic,
percentage calculations, and ability to read graphs.
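The adaptive principle described above can be sketched as follows. Under a Rasch model, the probability of a correct response depends only on the difference between pupil ability and item difficulty on the common logit scale. The item selection rule shown here (choosing the item closest in difficulty to the current ability estimate) is an assumption made for illustration and is not documented as the rule used by the Danish National Tests.

```python
import math

def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model,
    with ability and difficulty on the same logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def pick_next_item(ability_estimate, item_difficulties):
    """Hypothetical selection rule: the item whose difficulty is
    closest to the current ability estimate."""
    return min(item_difficulties, key=lambda b: abs(b - ability_estimate))

# A pupil whose ability equals the item difficulty succeeds half the time.
p_matched = rasch_probability(0.0, 0.0)
next_item = pick_next_item(0.4, [-1.0, 0.5, 2.0])
```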
The Danish National Tests provide a valid estimate of student abilities (Beuchert
& Nandrup, 2014). For example, lower scores are associated with low birth weight,
special education needs, and lower SES; all together, student and parental back-
ground explain approximately 13%–21% of the variance in performance. Ninth-
grade school examinations provided an external measure of validity; when they
were regressed on the same-subject National Test data, the earlier national test
results explained 48%–49% of the Danish and math examination marks (Beuchert
& Nandrup, 2014).
Analytic strategy
Multiple regression analysis was used to determine the prediction from the early
vocabulary measure, together with age, gender, and SES, to the four educational
outcomes of language comprehension, decoding, reading comprehension, and
numbers and algebra, taken from the national sixth-grade test. SES was mea-
sured by grouping children into four categories based on their parents’ educa-
tional status as described earlier (see Participants). The regression analysis de-
termined the effect of age, gender, and SES on mean levels of performance, as
well as evaluating the unique variance contributed by vocabulary and SES. To
address the question of whether prediction was more accurate for some subgroups
than others, supplementary correlations were calculated by age, gender, and SES
To explore the nature and possible nonlinearity of the vocabulary-outcome
measure relationships, we obtained quantile curves in the following way. For each
value on a grid between –2.1 and 2.1 with a spacing of 0.025, we considered the
500 nearest neighbors, that is, the 500 children closest with their CDI score to the
given value. Within these 500 children, we determined the empirical 10%, 25%,
50%, 75%, and 90% quantile of the results from the National Test. For each of the
five quantile levels, we obtained in this way a curve of quantile values versus CDI
score values. This curve was smoothed by a Lowess smoother with a bandwidth of

Table 1. Hierarchical regression with age and gender in Step 1, CDI vocabulary
or SES in Step 2, and age, gender, SES, CDI in Step 3 for the four National Test

                          Step 1          Step 2          Step 2          Step 3
                          Age & Gender    CDI             SES             All
                          R      R²       R      R²       R      R²       R      R²
Danish tests
  Language comprehension  .072   .005     .217   .047     .221   .049     .297   .088
  Decoding                .145   .021     .256   .065     .252   .063     .324   .105
  Reading comprehension   .109   .012     .222   .049     .275   .076     .332   .110
Math test
  Numbers & algebra       .019   .000     .138   .019     .237   .056     .270   .073

Note: CDI, MacArthur–Bates Communicative Development Inventories; SES, socioeconomic
status; R, standardized regression coefficient.

0.2. In the corresponding scatterplots, extreme values of the National Test results
were set to the upper and lower boundaries of the range shown on the y-axis and
extreme values of the z score were set to –3 or 3, respectively. The middle (median)
curves in each figure are the most revealing of the relationship.
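The quantile-curve procedure can be sketched as follows, under stated assumptions. The nearest-neighbor step follows the description above (a grid from –2.1 to 2.1 in steps of 0.025, 500 neighbors, five quantile levels). The smoother is a simplified stand-in for Lowess, a tricube-weighted local linear fit without robustness iterations, and the reading of "bandwidth of 0.2" as the fraction of grid points used in each local fit is an assumption. Function names and the demonstration data are hypothetical.

```python
import numpy as np

def nn_quantile_curves(z, y, grid, k=500, probs=(0.10, 0.25, 0.50, 0.75, 0.90)):
    """For each grid value, take the k children whose CDI z score is
    closest and return the empirical quantiles of their outcomes."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    k = min(k, z.size)
    curves = np.empty((len(grid), len(probs)))
    for i, g in enumerate(grid):
        nearest = np.argpartition(np.abs(z - g), k - 1)[:k]
        curves[i] = np.quantile(y[nearest], probs)
    return curves

def smooth_local_linear(x, y, frac=0.2):
    """Simplified Lowess-style smoother: tricube-weighted local linear
    fit over the nearest frac of points, with no robustness iterations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    span = max(int(np.ceil(frac * x.size)), 3)
    smoothed = np.empty_like(y)
    for i in range(x.size):
        dist = np.abs(x - x[i])
        width = np.sort(dist)[span - 1]
        weights = np.clip(1.0 - (dist / width) ** 3, 0.0, None) ** 3
        total = weights.sum()
        x_mean = (weights * x).sum() / total
        y_mean = (weights * y).sum() / total
        sxx = (weights * (x - x_mean) ** 2).sum()
        sxy = (weights * (x - x_mean) * (y - y_mean)).sum()
        slope = sxy / sxx if sxx > 0 else 0.0
        smoothed[i] = y_mean + slope * (x[i] - x_mean)
    return smoothed

# Hypothetical noise-free data, for illustration only.
z_demo = np.linspace(-3.0, 3.0, 2001)
y_demo = 2.0 * z_demo
grid = np.linspace(-2.1, 2.1, 169)  # spacing of 0.025
curves = nn_quantile_curves(z_demo, y_demo, grid)
median_curve = smooth_local_linear(grid, curves[:, 2])
```

With a fixed number of neighbors rather than a fixed window width, each quantile estimate rests on the same number of children, so the curves are as stable in the sparse tails of the CDI distribution as in the middle, at the cost of averaging over a wider score range there.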
To provide a visual comparison of the distribution of outcomes on each of the
sixth-grade language/literacy measures for children with early vocabulary delay
(bottom 10%, N = 209) with those for children in the median 10% (45th–55th
percentile, N = 215) and the top 10% (N = 212), histograms for the three groups
on that measure were superimposed.
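The group comparison can be sketched as a tabulation of outcome deciles within each early-vocabulary group. This is a minimal illustration, assuming that groups are defined by percentile bounds on the early score across the whole sample and that outcome deciles are assigned by rank; the helper names and toy data are hypothetical, and the study may have formed the groups within age bands.

```python
import numpy as np

def decile_of(values):
    """Assign each value to a decile from 1 to 10 by rank."""
    values = np.asarray(values, dtype=float)
    ranks = values.argsort().argsort()  # 0 .. n-1
    return ranks * 10 // values.size + 1

def outcome_decile_shares(early_scores, outcome_scores, lo_pct, hi_pct):
    """Percentage of the selected early-vocabulary group that falls
    in each outcome decile (index 0 is the lowest decile)."""
    early = np.asarray(early_scores, dtype=float)
    lo, hi = np.percentile(early, [lo_pct, hi_pct])
    in_group = (early >= lo) & (early <= hi)
    deciles = decile_of(outcome_scores)[in_group]
    counts = np.bincount(deciles, minlength=11)[1:]
    return 100.0 * counts / in_group.sum()

# Hypothetical toy data in which the outcome tracks the early score perfectly.
early = np.arange(100)
outcome = np.arange(100)
bottom_shares = outcome_decile_shares(early, outcome, 0, 10)
```

If early vocabulary carried no information about the outcome, each decile would hold about 10% of every group, so departures from a flat profile show where prediction is concentrated.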
To examine the first research question, how well early vocabulary predicts lan-
guage, literacy, and mathematical outcomes, multiple regression analyses were
performed, and summarized in Table 1. When CDI vocabulary was added as a
predictor, it accounted for an additional 4.2%, 4.4%, and 3.7% of the variance in
the language and literacy measures, but only 1.9% of the mathematics measure.
When SES was added prior to CDI vocabulary, the figures were comparable, but
somewhat higher for reading comprehension and mathematics. When both were
added, the total variance accounted for was nearly as large as the sum of the two
individual variances described above. These results, suggesting largely unique
effects for vocabulary and SES are consistent with the findings (not included
in the table) that SES is almost completely uncorrelated with CDI vocabulary
(r = .032), although it is correlated with the later measures: with language
comprehension (r = .209), with decoding (r = .205), with reading comprehension
(r = .252), and with numbers and algebra (r = .236).
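The incremental variance figures quoted above follow from Table 1 by subtraction: the increment for a predictor is the R² of the step in which it enters minus the R² of Step 1. The short sketch below reproduces that arithmetic from the tabled values; the variable and function names are chosen for illustration only.

```python
# R-squared values copied from Table 1:
# (Step 1 age & gender, Step 2 CDI, Step 2 SES, Step 3 all).
r_squared = {
    "language comprehension": (0.005, 0.047, 0.049, 0.088),
    "decoding": (0.021, 0.065, 0.063, 0.105),
    "reading comprehension": (0.012, 0.049, 0.076, 0.110),
    "numbers and algebra": (0.000, 0.019, 0.056, 0.073),
}

def increments(step1, step2_cdi, step2_ses, step3):
    """Variance added beyond age and gender by CDI alone, SES alone, and both."""
    return {
        "cdi": round(step2_cdi - step1, 3),
        "ses": round(step2_ses - step1, 3),
        "both": round(step3 - step1, 3),
    }

summary = {name: increments(*values) for name, values in r_squared.items()}
for name, inc in summary.items():
    print(f"{name}: CDI +{inc['cdi']:.3f}, SES +{inc['ses']:.3f}, both +{inc['both']:.3f}")
```

For language comprehension, for example, the separate increments are .042 and .044, which sum to .086, while the joint increment is .083. The small shortfall is what the text means by the total being nearly as large as the sum of the two individual contributions.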
Table 2. Longitudinal prediction of r (N) values for four National Test scores by CDI
vocabulary for selected subgroups

                    Danish Language/Literacy Tests
Subgroup            Language Comp.   Decoding        Reading Comp.   Numbers & Algebra
Age (months)
  16–20             .186** (632)     .162** (632)    .129** (632)    .114** (635)
  21–25             .188** (707)     .246** (707)    .200** (632)    .120** (703)
  26–30             .242** (646)     .226** (646)    .249** (646)    .180** (650)
Gender
  Girls             .185** (1025)    .198** (1025)   .182** (1025)   .149** (1028)
  Boys              .229** (960)     .229** (960)    .209** (960)    .124** (960)
SES (par. educ.)
  Basic             .320** (85)      .203 (85)       .376** (85)     .149 (83)
  Shorter           .190** (896)     .202** (896)    .188** (896)    .116** (896)
  Medium            .236** (590)     .238** (590)    .205** (590)    .178** (593)
  High              .156** (414)     .183** (414)    .140** (414)    .094 (416)

Note: CDI, MacArthur–Bates Communicative Development Inventories; SES, socioeconomic
status.

To address the second part of research question two as well as question three,
supplementary correlation analyses were conducted. To explore potential changes
in the degree of prediction as children grow from 16 to 30 months, correlations
were computed within each of three age ranges: 16–20, 21–25, and 26–30 months
(see Table 2). The table also includes correlations separately for boys and girls, and
stratified by the four SES categories. The correlations generally grow modestly
across the age groups, with the single exception of decoding between 21 and 25
months and 26 and 30 months. For the language and literacy measures, but not
the mathematics measure, the correlations are higher for boys than for girls. With
respect to SES, the lowest correlations were observed in families with high parental
education, but there was little other pattern.
To investigate the fourth research question, whether the predictive discrimination
of small differences varies with the level of CDI vocabulary score, we inspected
smoothed quantile curves of the National Test outcomes in language and literacy
as a function of the z score (Figure 1). We restricted this analysis to children of 21
months and above, due to the very weak (“trivial” in Cohen’s system) prediction
from vocabulary below this age.
The curves, especially that for Language comprehension, appear to change in
slope over the range of the predictor. However, the apparently flat slopes at the
extremes of the curves are based on small numbers of subjects. Spline regression
analyses were performed to determine if there were inflection points (“knots”)
within the main body of the data (±2SD) where the slope of a linear regression
line changed significantly. For Language comprehension, there was a significant
decrease in the slope at approximately z = 0.01, showing greater sensitivity in
Figure 1. Scatterplots of three sixth-grade National Test results versus CDI z scores obtained at 21–30 months and smoothed 10%,
25%, 50%, 75%, and 90% quantile curves.
the below-average range than in the above-average range. For both decoding and
reading comprehension, no such inflection points were detected.
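A spline regression with a single knot can be sketched as an ordinary least-squares fit with a hinge term, so that the coefficient on the hinge is the change in slope above the knot. The sketch below is a minimal illustration on hypothetical noise-free data; it takes the knot location as given and omits both the search over candidate knots and the significance test that the analysis above relies on.

```python
import numpy as np

def fit_linear_spline(z, y, knot):
    """Least-squares fit of y = b0 + b1*z + b2*max(0, z - knot).
    b1 is the slope below the knot; b2 is the change in slope above it."""
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones_like(z), z, np.maximum(0.0, z - knot)])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical data: slope 0.30 below z = 0 and 0.10 above it.
z = np.linspace(-2.0, 2.0, 81)
y = 0.30 * z - 0.20 * np.maximum(0.0, z)
intercept, slope_below, slope_change = fit_linear_spline(z, y, knot=0.0)
```

A negative value of the slope-change coefficient corresponds to the pattern reported for Language comprehension: a steeper relationship below the knot than above it.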
Finally, examining the fifth research question, concerning outcomes for the
delayed group (lowest 10%) in comparison to outcomes for a median and top
group, Figure 2 presents histograms for the three groups for each of the language
and literacy measures.
Consistent with the correlations, children in the lowest group were more likely
to show low scores later, while verbally precocious children (top 10%) were more
likely to show high scores later. The results for the median 10% show a more
uniform distribution across the range.
Children in the early delay group had an elevated chance (22%, 19%) of remain-
ing in the lowest 10% later on language comprehension and decoding, respectively,
though not for reading comprehension (10.5%). However, a subgroup of these chil-
dren showed good development later; 40%, 34%, and 36% were in the upper half
on Language comprehension, decoding, and reading comprehension, a result that
is consistent with much other research on outcomes of very early language delay
(Rescorla & Dale, 2013). Examination of the prediction for children in the top
group shows that obtaining a very high vocabulary score early does not necessarily
imply high performance in school later. The probability of remaining in the initial
high category is even lower than for children with early delay remaining in the
lowest group: 7.5%, 8%, and 5%, respectively. However, it is very unlikely that
these children will experience problems and end up in the bottom 10%: 3.5%, 5%,
and 1.5% for the three measures, respectively.
To address the first research question of how well early vocabulary predicts later
educational outcomes, we conducted regression and correlation analyses predicting
sixth-grade outcomes in Danish language and reading as well as math from
vocabulary scores at 16–30 months of age. The r² values from the regression and the
simple correlations both indicate a very modest degree of prediction, at best reaching
the level of “weak” in terms of Cohen’s guidelines. The interval between early
vocabulary assessment and outcome measures constituted more than three quar-
ters of the children’s lives, and undoubtedly comprised a host of family, school,
health, and other sources of variability. As expected, the prediction was stronger
for language and literacy outcomes than for the math measure. To our knowledge,
this is the first documentation of academic achievement prediction from language
measured at this young an age in a large and diverse sample. It is also notable that the
prediction is significant across the board for both boys and girls (cf. Hohm et al., 2007).
It is also illuminating to consider these results in the context of other research
on early language and emergent literacy measures and their prediction to later
academic outcomes. A comprehensive meta-analysis was conducted as part of
the report of the National Early Literacy Panel (2008; see chap. 2). Consid-
erably stronger predictions, in the range of 0.4 to 0.6, were found for some
measures, particularly aspects of print awareness and phonological awareness.
Measures of oral language yielded weaker correlations, typically around .3–.35.
Figure 2. Charts showing the percentage of children in deciles 1–10 on the Danish National
Test in sixth grade for three groups of children differing in CDI score as 16- to 30-month-olds,
namely, bottom 10%, median 10% (45%–55%), and top 10% on the CDI.
Furthermore, within the domain of oral language, vocabulary was less predic-
tive than other measures such as grammar. Lonigan and Shanahan (2010) argue
that these findings show that oral language is necessary for literacy, but far from
sufficient. It should also be kept in mind that the overwhelming majority of the
studies summarized by the National Early Literacy Panel included children 4
years of age and older. Though there was little evidence that the predictions
from 4-year-olds and those from kindergartners were different, it is likely that
prediction from 2-year-olds would be different. The correlations obtained here
(.13 to .25) are similar to those obtained in more recent research in English following
the National Early Literacy Panel report and discussed earlier, namely, by Lee
(2011), Harlaar et al. (2008), and Hayiou-Thomas et al. (2012), even
though we examine a different language and consider a longer time span. These
results thus also suggest that the relatively smaller SES variability in Denmark
compared to the United States and Danish children’s almost 100% daycare attendance
are not enough to eliminate long-term effects of variability in early language.
What is the theoretical and applied significance of these correlations? From a
theoretical perspective, it is important to acknowledge that these correlations are
not necessarily evidence of causality. Early vocabulary differences could causally
initiate a cascade of differences in grammar, pragmatics, metalinguistic awareness,
literacy, and intellectual development. However, early vocabulary differences and
later academic differences could be independent manifestations of underlying
abilities or of general neuropsychological functioning (see, e.g., Scarborough,
2005).
From an applied perspective, these weak correlations are not strong enough
to justify clinical decision making on their own. At present, late talking in
children below 2 years 6 months is not sufficient in itself to justify speech–
language therapy (Rescorla & Dale, 2013). However, based on promising results
of early low-cost interventions in homes (e.g., van Steensel, McElvany, Kurvers,
& Herppich, 2011) and of increased instructional quality in daycares (e.g.,
Keys et al., 2013; National Institute of Child Health and Human Development
Early Child Care Research Network, 2005), the results do warrant that late talkers
(in particular if late talking is combined with other indices of increased risk,
such as family history of language/literacy learning difficulties, low SES, and
poor comprehension) receive some kind of increased educational support in their
homes or in daycares.
A further perspective on the magnitude of these predictive correlations, weak
but significant given the large sample size, is provided by a comparison with other
variables that are related to educational outcomes at this stage of development,
particularly reading. Our observed correlations in the range of .13 to .25 are
comparable in size, based on percentage of variance accounted for, to a Cohen’s
d effect size for an intervention of 0.26 to 0.52, that is, the small to
moderate range. Utilizing summary statistics provided by Lipsey et al. (2012)
based on American studies, we can say that this effect size is comparable to 1
year’s mean growth in reading scores (Grades 5–6, d = 0.32; Grades 6–7, d =
0.23). This is somewhat smaller than the eligible–ineligible for free/reduced lunch
difference on the National Assessment of Educational Progress scores (d = –0.45
to –.74), and smaller still than ethnic differences (Black/Hispanic/White) on the
National Assessment of Educational Progress (d = 0.53 to –.83). These statistics
are a reminder of how little we can explain of individual differences in reading
overall at present.
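The correspondence between the observed correlations and the Cohen’s d range can be checked with a short calculation. The sketch below assumes the common conversion for two equal-sized groups, d = 2r / sqrt(1 − r²); the article does not state which formula was used, so this choice and the function names are illustrative assumptions. Under that assumption, the endpoints r = .13 and r = .25 give values that round to 0.26 and 0.52.

```python
import math


def r_to_d(r: float) -> float:
    """Convert a correlation r to Cohen's d, assuming two equal-sized groups.

    Assumed conversion (not stated in the article): d = 2r / sqrt(1 - r^2).
    """
    return 2 * r / math.sqrt(1 - r * r)


def variance_explained(r: float) -> float:
    """Proportion of variance accounted for by a simple correlation."""
    return r * r


# Endpoints of the correlation range reported in the text.
for r in (0.13, 0.25):
    print(f"r = {r:.2f}  r^2 = {variance_explained(r):.4f}  d = {r_to_d(r):.2f}")
```

The squared correlations (about 2% to 6% of variance) make the same point in the other metric: the prediction is reliable but accounts for a small share of individual differences.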
For our second research question, we wanted to investigate the role of SES in
predicting later educational outcomes. Three aspects of the results suggest that
SES and CDI vocabulary play largely independent roles. CDI vocabulary itself
is not significantly correlated with SES. Adding SES as a predictor (Table 1)
after CDI vocabulary had nearly as large an effect on the prediction as it had
when it was entered prior to vocabulary. Finally, examination of prediction from
vocabulary within the four SES categories showed little variation. Together, these
results indicate that early vocabulary is an SES-independent predictor. However,
the generalizability of this conclusion is unclear because it rests upon the lack of a
correlation between SES and early CDI vocabulary scores. In another Danish study
with a more representative sample in terms of parental education, substantial SES-
related differences in child outcomes were observed (Bleses, Højen, Jørgensen,
Jensen, & Vach, 2010); that is, the lack of SES differences in the current study
could be due to the biased nature of the sample.
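The order-of-entry argument can be illustrated with the closed-form R² for two standardized predictors. The sketch below is a minimal illustration, not the analysis reported in Table 1: the correlation values are hypothetical, and the function names are introduced here for the example only. It shows that when the two predictors are uncorrelated, a predictor adds the same amount of explained variance whether it is entered first or second, whereas a correlation between predictors shrinks the gain of whichever is entered second.

```python
def r2_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """R^2 for an outcome regressed on two standardized predictors.

    r_y1, r_y2: correlations of each predictor with the outcome.
    r_12: correlation between the two predictors.
    """
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def gain_entered_first(r_y_new: float) -> float:
    """Variance explained by a predictor entered alone, at the first step."""
    return r_y_new**2


def gain_entered_second(r_y_new: float, r_y_first: float, r_12: float) -> float:
    """Increase in R^2 when a predictor is added after another one."""
    return r2_two_predictors(r_y_first, r_y_new, r_12) - r_y_first**2


# Hypothetical correlations with the outcome (NOT values from the study).
R_VOCAB, R_SES = 0.20, 0.25

for r_12 in (0.0, 0.3):
    first = gain_entered_first(R_SES)
    second = gain_entered_second(R_SES, R_VOCAB, r_12)
    print(f"r(SES, vocab) = {r_12:.1f}: SES first = {first:.4f}, SES second = {second:.4f}")
```

This is why the conclusion of SES-independent prediction depends on the near-zero correlation between SES and CDI vocabulary in this sample: in a sample where the two are correlated, part of the variance would be shared and the increments would differ by order of entry.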
The third research question concerned how these predictions vary with the age at
first measurement and gender. There is a growing consensus that early vocabulary
measures are credible sources of information about child language development. Our
results show that a vocabulary measure taken as early as the second year of life is
a significant predictor of language and literacy and, less so, of math skills. With
respect to gender, we found higher correlations for boys than for girls for all three
Danish language outcome measures. We had made no predictions regarding this question,
and the result is opposite to that of Hohm et al. (2007); further research is needed
to clarify the reasons for this difference.
Examination of scatter plots and smoothed quantile curves investigating the
fourth research question suggested that the lower range of z scores is more sensitive
with respect to long-term prediction, but spline regression analysis confirmed this
statistically only for language comprehension.
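As an illustration of how a spline regression can test whether prediction differs between the lower and upper range of z scores, the sketch below fits a linear spline with a single knot by ordinary least squares. The knot at z = 0, the single-knot linear form, the synthetic noise-free data, and all function names are assumptions made for this example; the article does not specify these details here. A nonzero coefficient on the hinge term indicates that the slope changes at the knot.

```python
def hinge(z: float, knot: float) -> float:
    """Linear-spline basis term: zero below the knot, z - knot above it."""
    return max(0.0, z - knot)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_spline(zs, ys, knot):
    """OLS fit of y = b0 + b1*z + b2*max(0, z - knot); returns [b0, b1, b2]."""
    rows = [[1.0, z, hinge(z, knot)] for z in zs]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(xtx, xty)


# Synthetic, noise-free data (illustrative only): steeper slope below the knot.
zs = [i / 10 for i in range(-20, 21)]
ys = [0.5 * z - 0.4 * hinge(z, 0.0) for z in zs]

b0, b1, b2 = fit_linear_spline(zs, ys, knot=0.0)
slope_below, slope_above = b1, b1 + b2
print(f"slope below knot = {slope_below:.3f}, slope above knot = {slope_above:.3f}")
```

With real data, the hinge coefficient would be accompanied by a standard error and significance test; the sketch only shows how the model separates the two slopes.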
Finally, addressing the fifth research question, we examined the outcomes of
early delay. This is one of the very first studies of the long-term academic outcomes
of a large and diverse sample of late talkers. In contrast to the findings of Rescorla
(2013), children with an early delay were more likely to continue to have problems
(overrepresentation in both the bottom 10% and the bottom half of the distribution),
although it was also true that about a third of the late talkers moved into the upper
half of the distribution. Precocious children were not as likely to remain in the top
10% group; however, their chances of ending up with reading problems were low.
Limitations and future directions for research
All the children in this study were monolingual Danish speakers at the time of
initial assessment. As societies become increasingly multilingual, there is a strong
need for similar kinds of longitudinal information for children who are growing
up in a bilingual context. For these children, it is even more challenging to make
clinical decisions about language impairment or risk of language impairment.
The main finding of this study is that educational outcomes can be predicted from
a very early (below 2 years) measure of language development, namely, the size
of the parent-reported productive vocabulary with effect sizes (in proportion of
variance accounted for) comparable to 1 year’s mean growth in reading scores.
However, there is great variability in outcome. Supplementation with other rel-
evant information, such as family history of learning difficulties, low SES, and
poor comprehension, may somewhat improve the prediction. Intervention studies,
however, have demonstrated that children’s language development can be acceler-
ated by parent-based interventions in the toddler and early preschool range or by
increasing the instructional quality of daycares. Such low-cost, and ideally low-
stress, interventions may be highly appropriate responses to late talking. However,
more basic research on the mechanism by which early vocabulary differences re-
sult in later academic differences is needed. Understanding which associations are
truly causal versus reflecting common underlying factors is essential for selecting
optimal targets for intervention.
Acknowledgments
We extend warm thanks to parents for their contribution to the original CDI study. The
MBCDI study was funded by a grant from the Carlsberg Foundation and the Research
Council for the Humanities.
References
Beuchert, L. V., & Nandrup, A. B. (2014). The Danish National Tests—A practical guide. Aarhus,
Denmark: Aarhus University, Business and Social Sciences Department of Economics and
Bleses, D., Højen, A., Jørgensen, R. N., Jensen, K. Ø., & Vach, W. (2010). Sprogvurdering af
3-årige: Karakteristika og risikofaktorer. Working papers, Center for Child Language, e-prints,
Bleses, D., Vach, W., Slott, M., Wehberg, S., Thomsen, P., Madsen, T., et al. (2008). The Danish
Communicative Development Inventories: Validity and main developmental trends. Journal of
Child Language, 35, 651–669.
Bornstein, M. H., Hahn, C. S., Putnick, D. L., & Suwalsky, J. T. (2014). Stability of core language
skill from early childhood to adolescence: A latent variable approach. Child Development, 85,
Dale, P. S., McMillan, A. J., Hayiou-Thomas, M. E., & Plomin, R. (2014). Illusory recovery: Are
recovered children with early language delay at continuing elevated risk? American Journal of
Speech–Language Pathology, 23, 437–447.
Dale, P. S., Penfold, M. J., & Fenson, L. (2011). Adaptations of the MacArthur–Bates Communicative
Development Inventories into other languages: A 2011 update. Paper presented at the 12th
International Congress for the Study of Child Language, July, Montreal.
Elbro, C. (2006). Literacy acquisition in Danish: A deep orthography in cross-linguistic light. In
R. Malatesha Joshi & P. G. Aaron (Eds.), Handbook of orthography and literacy (pp. 31–45).
Mahwah, NJ: Erlbaum.
Eriksson, M., Marschik, P. B., Tulviste, T., Almgren, M., Pereira, M. P., Wehberg, S., et al. (2012).
Differences between girls and boys in emerging language skills: Evidence from 10 language
communities. British Journal of Developmental Psychology, 30, 326–343.
Fenson, L., Marchman, V. A., Thal, D. J., Dale, P. S., Reznick, J. S., & Bates, E. (2007). MacArthur–
Bates Communicative Development Inventories: Users guide and technical manual (2nd ed.).
Baltimore, MD: Brookes.
Harlaar, N., Hayiou-Thomas, M. E., Dale, P. S., & Plomin, R. (2008). Why do preschool language
abilities correlate with later reading? A twin study. Journal of Speech, Language, and Hearing
Research, 51, 688–705.
Hayiou-Thomas, M. E., Dale, P. S., & Plomin, R. (2012). The etiology of variation in language
skills changes with development: A longitudinal twin study of language from 2 to 12 years.
Developmental Science, 15, 1–17.
Hohm, E., Jennen-Steinmetz, C., Schmidt, M. H., & Laucht, M. (2007). Language development at ten
months. European Child & Adolescent Psychiatry, 16, 149–156.
Keys, T. D., Farkas, G., Burchinal, M. R., Duncan, G. J., Vandell, D. L., Li, W., et al. (2013). Preschool
center quality and school readiness: Quality effects and variation by demographic and child
characteristics. Child Development, 84, 1171–1190.
Law, J., & Roy, P. (2008). Parental report of infant language skills: A review of the development
and application of the communicative development inventories. Child and Adolescent Mental
Health, 13, 198–206.
Lee, J. (2011). Size matters: Early vocabulary as a predictor of language and literacy competence.
Applied Psycholinguistics, 32, 69–92.
Lipsey, M. W., Puzio, K., Yun, C., Hebert, M. A., Steinka-Fry, K., Cole, M. W., et al. (2012).
Translating the statistical representation of the effects of education interventions into more
readily interpretable forms. New York: National Center for Special Education Research.
Lonigan, C. J., & Shanahan, T. (2010). Developing early literacy skills: Things we know we know and
things we know we don’t. Educational Researcher, 39, 7.
National Early Literacy Panel. (2008). Developing early literacy: A scientific synthesis of early literacy
development and implications for intervention. New York: National Institute for Literacy.
National Institute of Child Health and Human Development Early Child Care Research Network.
(2005). Pathways to reading: The role of oral language in the transition to reading. Develop-
mental Psychology, 41, 428–442.
OECD. (2014). Society at a glance 2014: OECD social indicators. Paris: Author. Retrieved from
Rescorla, L. A. (1989). The Language Development Survey: A screening tool for delayed language in
toddlers. Journal of Speech and Hearing Disorders, 54, 587–599.
Rescorla, L. A. (2013). Late-talking toddlers: A 15-year follow-up. In L. A. Rescorla & P. S. Dale
(Eds.), Late talkers: Language development, interventions and outcomes (pp. 219–239). Balti-
more, MD: Brookes.
Rescorla, L. A., & Dale, P. S. (Eds.) (2013). Late talkers: Language development, interventions, and
outcomes. Baltimore, MD: Brookes.
Scarborough, H. S. (2005). Developmental relationships between language and reading: Reconciling
a beautiful hypothesis with some ugly facts. In H. W. Catts & A. G. Kamhi (Eds.), The
connections between language and reading disabilities (pp. 3–24). Mahwah, NJ: Erlbaum.
van Steensel, R., McElvany, N., Kurvers, J., & Herppich, S. (2011). How effective are family literacy
programs? Results of a meta-analysis. Review of Educational Research, 81, 69–96.
... Vocabulary size (Bleses et al., 2016;Friend et al., 2012;Friend et al., 2018;Friend et al., 2019;Reilly et al., 2010) and speed of word processing (Fernald & Marchman, 2012; in toddlerhood are associated with language and cognitive development later in childhood. Despite many papers demonstrating this link, there has been little research into the mechanisms underlying this relation and the conditions under which it emerges. ...
... Evidence for the predictive value of vocabulary knowledge comes from larger studies using parent report and smaller studies using directly assessed vocabulary. Parent-reported vocabulary between 16 and 30 months of age predicts language, reading, and math achievement between 3 and 12 years of age (Bleses et al., 2016;Duff et al., 2015;Lee, 2011;Morgan et al., 2015). Late-talker status (below the 10th percentile in expressive vocabulary) at age two predicts language outcomes at age four beyond other risk factors (Reilly et al., 2010). ...
... The impetus for this work was threefold. First, there is a body of evidence indicating that both vocabulary and speed of processing significantly predict downstream language and cognition and are therefore foundational skills for later learning (Bleses et al., 2016;Fernald & Marchman, 2012;Friend et al., 2012;Friend et al., 2018;Friend et al., 2019;Morgan et al., 2015). However, there has been less work investigating the conditions under which this relation emerges. ...
Full-text available
Toddler vocabulary knowledge and speed of word processing are associated with downstream language and cognition. Here, we investigate whether these associations differ across measures. At age two, 101 participants (55 monolingual French-speaking and 46 monolingual English-speaking children) completed a two-alternative forced choice task, yielding measures of decontextualized vocabulary (number of correct responses) and haptic speed of word processing (latency of correct responses). At ages three, four, and five children completed a battery of language assessments and an executive function task. Growth curve models revealed that age-two vocabulary significantly predicted age-three performance (but not growth from age three to four or four to five) across all language assessments but speed of processing did not predict language outcomes in final models. Finally, speed of processing was correlated with executive function at age three whereas vocabulary was not. Results suggest that vocabulary is associated with a range of downstream language abilities whereas haptic speed of processing may be associated with executive control.
... Moreover, the few studies that do report prediction of later academic outcomes are based on correlations across the full range; for example, vocabulary at 24 months predicted vocabulary scores in Grades 1, 3, and 5 (r = .25-.27; Lee, 2011). Bleses et al. (2016) examined the prediction from socioeconomic status (SES; parental education and occupation) and early expressive vocabulary to Danish National Test scores for language, reading, and mathematics at the age of 12 years. The total R 2 values for language and reading ranged from 8.8% to 11.0% but for math was only 7.3%. ...
... Typically studies end by age 11 or 12 years, although they still find significant predictions. Bleses et al. (2016) is one of these. As a second example, in the context of a twin study, Hayiou-Thomas et al. (2012) report significant phenotypic correlations from an age 2 parent report measure adapted from the MacArthur-Bates Communicative Development Inventories (CDI) to four webadministered language measures at age 12. Finally, Bornstein et al. (2016) administered multiple tests of language at 15 and 25 months and 5 and 11 years. ...
... The outcome measures for the present study were scores on the Danish Upper Secondary School Leaving Exam (below referred to as USSLE) in four academic domains taken in the final year of compulsory education, Grade 9 (approximately age 15), which includes Danish, English, Mathematics, and Science subtests. The study includes several important extensions beyond the previous report (Bleses et al., 2016). First, the outcomes are measured approximately 3 years later, at the last possible assessment point at which a relatively population-representative sample is available. ...
Prediction from early development to later achievement has the potential to improve clinical and educational service delivery as well as to inform developmental theory. In this longitudinal study, we asked how well can educational achievement measured in the final year (Grade 9, age 15) of compulsory education—both overall and for outcomes in the lowest 20%—be predicted from information available in the first 3 years of life, particularly early expressive vocabulary? Measures for 2,767 children (1,345 males, 1,422 females) aged 16 to 30 months on early expressive vocabulary, along with family socioeconomic status (parental education, occupation, and household income), other demographic information (gender, birth order, parental age, social benefits, etc.), timing and nature of early child care, and early home literacy experience, were used to predict performance on Danish Upper Secondary School Leaving Exam (USSLE) in Danish, English, Math, and Science. A cross-validated combination of Lasso (Least absolute shrinkage and selection operator) and ordinary least squares regression was the primary analysis for continuous outcomes and cross-validated Lasso and logistic regression for categorical outcomes. With respect to continuous outcome measures, the patterns of prediction varied with specific domain; R ² ranged from 9.4% to 21.4%. With respect to low USSLE performance, area under the curve statistics ranged from 64.1% to 72.2%. In all domains, early childhood expressive vocabulary made a significant unique contribution to the outcome when measured over the full range. The prediction was also significant for vocabulary to low Danish and English scores although not for Math and Science. Although the predictions were not strong enough for clinical diagnosis on their own, they demonstrate that low early vocabulary is an important and measurable risk condition that can direct early intervention and thus contribute to later educational attainment.
... They were developed by the authors of this Element and their colleagues for use in early childhood education and care (ECEC) centers, to be administered by the children's usual ECEC educators. The background for this governmental commission was a heightened awareness of (1) the great variation in language development already apparent by the end of infancy (Fenson et al., 2007), (2) the association between early language development and later educational achievement (Bleses et al., 2016) and, further downstream, many other life outcomes, and (3) the realization that the provision of publicly subsidized early childhood care from about age 1 along with subsidized public ECEC centers, as well as free regular schools later, did little to close the achievement gap between advantaged and disadvantaged children. ...
... Drawing on both types of functioning, language can serve as a crucial tool for joint efforts and other forms of cooperation. The oral language skills developed in early childhood are also an essential foundation for literacy (Bleses et al., 2016;Harlaar et al., 2008), which will extend all the just-mentioned functions. ...
... All of them are foundational for literacy development, but lexicon and pragmatics are especially important both theoretically and for assessment. There is a large body of evidence that early vocabulary size is a predictor of later literacy (Bleses et al., 2016;. One reason for the prediction is that for early readers, it is necessary to know a word in order to be able to read it; another is that the knowledge of other words in a sentence can facilitate the reading of each individual word and eventually learning the meaning of new words. ...
This Element has two main purposes. Firstly, it discusses purposes, advantages, and disadvantages as well as the challenges of different formats of language assessment, concluding with a focus on educator-administered language assessment in early childhood and education programs. It addresses the selection of assessment domains, the trade-off between brevity and precision, the challenge of assessing bilinguals, and accommodating the requirements of funders (e.g., government agencies) and users (e.g., educators and schools). It draws on lessons learned from developing two instruments for a national Danish-language and preliteracy assessment program. Secondly, it introduces those two educator-administered instruments-Language Assessment 3-6 (LA 3-6) and Language Assessment 2-year-olds (LA 2)-with respect to content, norming, gender and socioeconomic influences as well as psychometric qualities. The intention is that this experience can help enable the extension of the educator-based approach to other languages and contexts, while simultaneously acknowledging that linguistic and cultural adaptations are crucial.
... One particularly important component of language learning is children's vocabulary. This shared symbolic system is critical for children's development, contributing to later cognitive (Kuhn et al., 2014;Müller et al., 2009), socioemotional (Armstrong et al., 2017Longobardi et al., 2015;Longoria et al., 2009;Morgan et al., 2015), and academic outcomes (Bleses et al., 2016;Masrai & Milton, 2017;Roche & Harrington, 2013;Townsend et al., 2012). Indeed, vocabulary knowledge is critical for academic success in several domains. ...
... For the sake of this review, we would like to highlight that vocabulary size predicts overall academic achievement (e.g., degree level obtained; Milton & Treffers-Daller, 2013) and academic achievement within specific domains (e.g., mathematics; Purpura et al., 2017). This pattern is evident longitudinally and starting at an early age; children's productive vocabulary in infancy and toddlerhood predicts their 6th grade scores on standardized tests of language, literacy, and mathematics (Bleses et al., 2016). Low productive vocabulary scores had the most predictive power, indicating that infants and toddlers with language delays may be especially at risk of struggling once they enter the classroom (Bleses et al., 2016). ...
... This pattern is evident longitudinally and starting at an early age; children's productive vocabulary in infancy and toddlerhood predicts their 6th grade scores on standardized tests of language, literacy, and mathematics (Bleses et al., 2016). Low productive vocabulary scores had the most predictive power, indicating that infants and toddlers with language delays may be especially at risk of struggling once they enter the classroom (Bleses et al., 2016). Furthermore, the same pattern exists throughout the college years. ...
Researchers have historically focused on characterizing vocabulary development in infants and toddlers. However, less is known about the vocabulary composition of children entering formal schooling. The authors propose that a critical next step in understanding school readiness is to characterize the academic vocabulary of children entering kindergarten. These assessments should identify knowledge of words used in general academic discourse and specific domains (e.g., science). This chapter outlines initial steps taken by the authors to identify children's science vocabulary around the age of school entry. Furthermore, the complexity of vocabulary assessment is illustrated via a discussion of vocabulary development in dual-language learners. Understanding the words that children can produce at various stages of development will help determine whether children are ready for school and inform interventions that target word knowledge. Indeed, focusing on children's vocabulary presents an exciting new opportunity to integrate developmental science with real-world educational settings.
... Numerous studies suggest that early linguistic challenges in the form of a language delay during the preschool years forms a significant risk factor for chronic language impairment with associated academic failure and learning disabilities (Scarborough and Dobrich, 1990;Nash and Donaldson, 2005;Wise et al., 2007;Desmarais et al., 2008;Rescorla, 2009;Petinou et al., 2011;Bleses et al., 2016). Late talkers might be at risk of developing a language disorder compared to typically developing children (Rescorla and Schwartz, 1990;Rescorla et al., 2000b;Parizi et al., 2013). ...
Full-text available
The aim of this study was to evaluate the psychometric properties of the adapted Cyprus Greek Lexical List a-CYLEX (GR) in a sample of 194 Greek toddlers from the island of Crete with Standard Modern Greek (SMG) as their primary language. The a-CYLEX (GR) is a parental report checklist for assessing the receptive and expressive vocabulary skills of children aged 12 months to 3:6 years. Concurrent validity of the instrument was tested via correlations with the adapted Greek version of the Receptive One-Word Picture Vocabulary Test-II (ROWPVT-II), which was administered to 124 SMG-speaking children between the ages of 2 and 3:6 years. Test–retest reliability was tested by administering the instrument two times within a 2-week interval to 59 parents (30.41% of the total sample). Statistical analyses provided strong evidence for the high internal consistency and test–retest reliability of the a-CYLEX (GR). The role of the demographic variables in vocabulary performance and the frequency of each a-CYLEX (GR) word category by age were also investigated. In conclusion, the a-CYLEX (GR) is a parental report checklist that can be used by clinicians who are interested in assessing receptive and expressive vocabulary of children during toddlerhood.
... Children learn how to solve problems and master their behavior through the internalization of adult modeled speech (Vygotsky, 1978). Several studies have demonstrated the importance of expressive vocabulary for children's later decoding, spelling, reading, mathematics, and executive function, predicting also fewer externalizing and internalizing problems (Bleses et al., 2016;Kuhn et al., 2014;National Early Literacy Panel, 2008). Moreover, toddlers with expressive vocabulary difficulties have a higher chance of falling behind their peers during adolescence on language related academic measures, such as vocabulary and grammar (Rescorla, 2005). ...
Full-text available
The interplay between self-regulation related skills and language is well recognized in dynamic theories, but few empirical studies have tested it, especially in toddlers. The current study examines the bidirectional links between self-regulation related skills and expressive vocabulary in a longitudinal study during toddlerhood. Participants were 268 toddlers (Mage = 29.6 months, SD = 4.2; 52% boys), mostly of Portuguese nationality, with medium to high sociocultural and economic status, attending private for-profit and nonprofit facilities in Portugal. Self-regulation (executive function and effortful control) and expressive vocabulary were assessed across three assessment waves. Results from cross-lagged panel models suggested bidirectional links between self-regulation and expressive vocabulary across the three assessment waves. These findings add to previous research by taking a first step into establishing the early onset of the intertwined development of these two foundational skills. (PsycInfo Database Record (c) 2022 APA, all rights reserved).
... In a recent meta-analysis performed by Snowling & Melby-Lervåg (2016) which examined the connection between oral language abilities and dyslexia, a strong connection was found between low vocabulary skills and reading difficulties. In addition, a longitudinal study which examined the contribution of vocabulary to decoding and comprehension abilities found that vocabulary size from the age of 16 months is a strong predictor of reading until the sixth grade (Bleses et al., 2016). This association has to do with the fact that meaning helps with decoding and also with the fact that children's vocabulary size is associated with their phonological knowledge (Lonigan et al., 2000). ...
Full-text available
Background Many studies have examined which kindergarteners’ skills best predict reading acquisition later at school. Most of these studies focused on emergent literacy skills such as letter knowledge, phonological awareness, and oral language abilities as the basis for reading acquisition. Additionally, several studies have also found cognitive skills such as memory, executive functions, and visual processing to be related to reading abilities. Although much research has been devoted to the connection between these two types of skills and reading, the relationship between the emergent literacy and cognitive skills has not been widely investigated. Objective The current study aimed to examine the contribution of different cognitive skills to emergent literacy and to explore the cognitive profile of children with low and high emergent literacy skills. Methods The study was conducted among 125 Hebrew-speaking kindergarten children. Cognitive measures including memory, speed of processing (SOP), executive functions, visual perception, and attention were collected, as well as literacy measures such as phonological awareness, orthographic knowledge, and vocabulary. Results The research findings indicated a very strong association between the cognitive measures and the literacy measures. Children with low emergent literacy skills exhibited lower cognitive abilities. In addition, a significant association was found between visual perception, rapid naming, inhibition, and emergent literacy. Conclusions These findings suggest that literacy knowledge associated with basic cognitive skills, which play an important role in its development.
The ability to rapidly recognize words and link them to referents is central to children’s early language development. This ability, often called word recognition in the developmental literature, is typically studied in the looking-while-listening paradigm, which measures infants’ fixation on a target object (vs. a distractor) after hearing a target label. We present a large-scale, open database of infant and toddler eye-tracking data from looking-while-listening tasks. The goal of this effort is to address theoretical and methodological challenges in measuring vocabulary development. We first present how we created the database, its features and structure, and associated tools for processing and accessing infant eye-tracking datasets. Using these tools, we then work through two illustrative examples to show how researchers can use Peekbank to interrogate theoretical and methodological questions about children’s developing word recognition ability.
Children's early life experiences of language and parenting are thought to have pervasive, long‐term influence on their cognitive and behavioural development. However, studies are scarce that collected naturalistic observations to broadly assess children's early life experiences and test their associations with developmental outcomes in middle childhood. Here, we used digital audio‐recorders to collect three full days of naturalistic observations from 107 British families with children (46 boys) aged 2–4 years, of whom 89 participated in a follow‐up assessment four years later when the children were 5–8 years old. We found that children's early life experiences of language and parenting were not significantly associated with their later language ability, academic performance and behavioural outcomes. We explore differences in methodology, sample characteristics and the role of developmental periods as possible explanations for the discrepancy in findings between the current and previous studies.
Background: Type-Token Ratio (TTR), given its relatively simple hand computation, is one of the few LSA measures calculated by clinicians in everyday practice. However, it has significant well-documented shortcomings; these include instability as a function of sample size, and absence of clear developmental profiles over early childhood. A variety of alternative measures of lexical diversity have been proposed; some, such as Number of Different Words/100 (NDW), can also be computed by hand. However, others, such as Vocabulary Diversity (VocD) and the Moving Average Type Token Ratio (MATTR), rely on complex resampling algorithms that cannot be conducted by hand. To date, no large-scale study of all four measures has evaluated how well any of them captures typical developmental trends over early childhood, or whether any reliably distinguishes typical from atypical profiles of expressive child language ability. Materials and methods: We conducted linear and non-linear regression analyses for TTR, NDW, VocD, and MATTR scores for samples taken from 946 corpora from typically developing preschool children (ages 2-6 years), engaged in adult-child toy play, from the Child Language Data Exchange System (CHILDES). These were contrasted with 504 samples from children known to have delayed expressive language skills (total n = 1,454 samples). We also conducted a separate sub-analysis which examined possible contextual effects of sampling environment on lexical diversity. Results: Only VocD showed significantly different mean scores between the typically developing children and the children with delayed expressive language. Using TTR would actually misdiagnose typical children and miss children with known language impairment. However, the effect of toy interactions on VocD was significant, which is a further caution against using lexical diversity as a valid proxy index of children's expressive vocabulary skill.
Discussion: This large scale statistical comparison of computer-implemented algorithms for expressive lexical profiles in young children with traditional, hand-calculated measures showed that only VocD met criteria for evidence-based use in LSA. However, VocD was impacted by sample elicitation context, suggesting that non-linguistic factors, such as engagement with elicitation props, contaminate estimates of spoken lexical skill in young children. Implications and suggested directions are discussed.
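The sample-size instability of TTR mentioned in the abstract above can be illustrated with a minimal sketch. It assumes whitespace-tokenized input and uses the textbook definitions of TTR (types divided by tokens) and MATTR (mean TTR over every contiguous fixed-size window). The function names, the default window of 50 tokens, and the fallback for short samples are illustrative assumptions, not the procedure used in the study.

```python
from typing import List


def type_token_ratio(tokens: List[str]) -> float:
    """Number of distinct word types divided by the total number of tokens."""
    if not tokens:
        raise ValueError("tokens must not be empty")
    return len(set(tokens)) / len(tokens)


def moving_average_ttr(tokens: List[str], window: int = 50) -> float:
    """Mean TTR over every contiguous window of `window` tokens."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        # Assumption: fall back to plain TTR when the sample is shorter
        # than one window; other conventions are possible.
        return type_token_ratio(tokens)
    ratios = [
        type_token_ratio(tokens[i:i + window])
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    sample = "the dog saw the cat".split()
    doubled = sample * 2  # same vocabulary, twice as many tokens
    print(type_token_ratio(sample), type_token_ratio(doubled))
    print(moving_average_ttr(doubled, window=5))
```

In this toy case, repeating the same five-word utterance halves the plain TTR even though the vocabulary is unchanged, while the windowed measure stays at the single-utterance value. This is only an illustration of why fixed-window measures are less sensitive to sample length.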
Purpose To examine the later development of language and literacy of children who had delayed language at age 2 but were in the normal range at age 4. Method Longitudinal data were analyzed from 3,598 pairs of twins participating in the Twins Early Development Study (TEDS). Six hundred thirty-three twins (8.8%) were delayed at age 2 based on parent-reported expressive vocabulary, and of these, 373 (59.0%) were classified as recovered based on 4-year measures. Each recovered 4-year-old was matched on vocabulary, gender, and zygosity to another 4-year-old without a history of early delay. Results Although the recovered group was below the mean for the total TEDS sample on measures of language at ages 7 and 12, there were no significant differences between the recovered and matched groups. Within the recovered group, it was not possible to predict outcome at better than a chance level. Conclusions Children who appear to have recovered by age 4 from early delay are at modest risk for continuing difficulties, but this appears to be no higher than the risk for other 4-year-olds with equivalent scores, reflecting the continuing variability in longitudinal outcome after age 4. All children in the low normal range at age 4 merit continuing monitoring.
This meta-analysis examines the effects of family literacy programs on children’s literacy development. It analyzes the results of 30 recent effect studies (1990–2010), covering 47 samples, and distinguishes between effects in two domains: comprehension-related skills and code-related skills. A small but significant mean effect emerged (d = 0.18). There was only a minor difference between comprehension- and code-related effect measures (d = 0.22 vs. d = 0.17). Moderator analyses revealed no statistically significant effects of the program, sample, and study characteristics inferred from the reviewed publications. The results highlight the need for further research into how programs are carried out by parents and children, how program activities are incorporated into existing family literacy practices, and how program contents are transferred to parents.
What is the role of oral language in reading competence during the transition to school? Is oral language in preschool best conceptualized as vocabulary knowledge or as more comprehensive language including grammar, vocabulary, and semantics? These questions were examined longitudinally using 1,137 children from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development. Children were followed from age 3 through 3rd grade, and the results suggest that oral language conceptualized broadly plays both a direct and an indirect role in word recognition during the transition to school and serves as a better foundation for early reading skill than does vocabulary alone. Implications of these findings are discussed in terms of both theoretical models of early reading and practical implications for policy and assessment.
This paper reports data from four studies using the Language Development Survey (LDS), a vocabulary checklist designed for use as a screening tool for the identification of language delay in 2-year-old children. A survey completed by the parent in about 10 min, the LDS displayed excellent reliability as assessed by Cronbach's alpha and test-retest techniques. Total vocabulary score as reported on the LDS was highly correlated with performance on Bayley, Reynell, and Preschool Language Scale expressive vocabulary items. The LDS was found to have excellent sensitivity and specificity for the identification of language delay, with a criterion of fewer than 50 words or no word combinations at 2 years yielding very low false positive and false negative rates. Data from three of these studies demonstrate the utility of the LDS as a screening tool for children attending public and private pediatric practices. Prevalence data using the LDS are reported comparing three different severity cutoffs for more than 500 children in seven survey samples.
This four-wave prospective longitudinal study evaluated stability of language in 324 children from early childhood to adolescence. Structural equation modeling supported loadings of multiple age-appropriate multisource measures of child language on single-factor core language skills at 20 months and 4, 10, and 14 years. Large stability coefficients (standardized indirect effect = .46) were obtained between language latent variables from early childhood to adolescence even when accounting for child nonverbal intelligence and social competence and maternal verbal intelligence, education, speech, and social desirability. Stability coefficients were similar for girls and boys. Stability of core language skill was stronger from 4 to 10 to 14 years than from 20 months to 4 years, so early intervention to improve lagging language is recommended.
This article examines associations between observed quality in preschool center classrooms for approximately 6,250 three- to five-year-olds and their school readiness skills at kindergarten entry. Secondary analyses were conducted using data from four large-scale studies to estimate the effects of preschool center quality and interactions between quality and demographic characteristics and child entry skills and behaviors. Findings were summarized across studies using meta-analytic methods. Results indicate small, but statistically significant associations for preschool center quality main effects on language and mathematics outcomes with little evidence of moderation by demographic characteristics or child entry skills and behaviors. Preschool center quality was not reliably related to socioemotional outcomes. The authors discuss possible explanations for the small effect sizes and lack of differential effects.
This paper investigated the predictive ability of expressive vocabulary size and lexical composition at age 2 on later language and literacy skills from ages 3 through 11. Multivariate analysis of covariance was performed to compare 16 language and literacy outcomes between children with large expressive vocabulary size at 24 months (N = 1,073) and those with smaller expressive vocabulary size. Comparisons between large and small verb size groups as a measure of lexical composition were also conducted. Our findings indicate that, after controlling for gender, birth order, ethnicity and socioeconomic status, total vocabulary size at age 2 can significantly predict subsequent language and literacy achievement up to fifth grade. Moreover, vocabulary size is a better predictor of later language ability than lexical composition.