We use a longitudinal design to examine associations between early expressive vocabulary, measured at 16–30 months of age, and reading and math outcomes in the sixth grade for a large and diverse sample of 2,120 Danish children. Educational outcomes, in particular decoding and reading comprehension, can be predicted from a vocabulary measure taken as early as 16 months, with effect sizes (in proportion of variance accounted for) comparable to one year’s mean growth in reading scores. The findings confirm, in a relatively large population-based study, that late talkers are at risk for low educational attainment, as the majority of children experiencing early language delay obtain below-average scores on measures of reading in the sixth grade. Low scores have the greatest predictive power, indicating that children with early delays have an elevated risk for later reading problems.
Early productive vocabulary predicts
academic achievement 10 years later
Aarhus University
University of Southern Denmark
University of New Mexico
Aarhus University
University of Southern Denmark
Received: February 18, 2015 Accepted for publication: December 24, 2015
Dorthe Bleses, TrygFonden's Centre for Child Research and School of Communication and Culture, Aarhus University,
Fuglesangs Allé 4, Building 2630, Aarhus 8210, Denmark. E-mail:
The development of reliable and valid parent-report measures of child language
has facilitated many advances in basic research and clinical practice. The two
instruments of this type that are most widely used are Rescorla’s Language Development
Survey (Rescorla, 1989, 2013) and the MacArthur–Bates Communicative
Development Inventories (CDI; Fenson et al., 2007). They take advantage of the
much greater evidence available to parents and the fact that it is more representative
of the child’s full experience and language competence. Both of these instruments,
but especially the CDI, have been adapted to numerous languages as diverse as
Spanish, Danish, Russian, Chinese, American Sign Language, and others (Dale,
Penfold, & Fenson, 2011; see
These measures have many potential clinical and research uses, such as screen-
ing for early delay, identifying children with unusual profiles, and examination of
the influence of other biological and behavioral variables on language develop-
ment (Fenson et al., 2007). Almost all of the potential uses of these early language
measures rest on the assumption that individual differences have long-term significance.
On that assumption, the measures can help identify children who are at elevated risk
later in life and therefore merit early intervention; within the normal range,
they can provide illuminating evidence on the sources of later individual differences
in a range of skills and therefore contribute to theoretical understanding of those skills.
The focus of the present paper is on predictions from measures of vocabulary
obtained between 16 and 30 months of age to language, literacy, and mathematics
outcomes in the later school years. The term “predictions” here refers both to
correlational and regression examination of relationships across the full range of
ability and to the outcomes of early delay. It is only fairly recently that we have had
valid measures for this early period that can be obtained for large samples of young
children and that enable examination of potential prediction from such early ages.
This study’s restriction to school-age outcomes reflects the fact that there is al-
ready a substantial body of research on longitudinal relations within the preschool
period (for reviews, see Fenson et al., 2007; Law & Roy, 2008; Lee, 2011). For
example, Lee (2011) conducted a secondary analysis of data from the National
Institute of Child Health and Human Development Early Child Care Research
Network (2005). Parent-reported 24-month vocabulary on the CDI: Words and
Sentences was a moderately strong predictor (r = .36) of the Preschool Language
Scales auditory comprehension and expressive communication subscales
at 54 months. Hayiou-Thomas, Dale, and Plomin (2012) analyzed parent-report
measures (short-form adaptations of the CDI, providing a composite vocabulary–
grammar measure) obtained at 2, 3, and 4 years in the Twins Early Development
Study. Predictions from 2 to 3, from 3 to 4, and from 2 to 4, were .59, .63, and
.48, respectively. Bornstein, Hahn, Putnick, and Suwalsky (2014) observed corre-
lations in a similar range (.25–.49) from 20-month vocabulary to several 48-month
language measures. With respect to the outcomes of early delay, defined by low
scores relative to norms, a substantial body of research (see Rescorla & Dale, 2013,
for several reviews of this literature) has confirmed great variability in outcome.
Although these “late talkers” (child’s vocabulary falling within the lowest 10%
of children at that age) are at considerable risk for later problems, many of them,
perhaps half, will have moved into the normal range by the end of the preschool
period, at least on measures of vocabulary and grammar. However, performance
often remains below the mean even after this “catch-up,” and those children are
therefore at continuing risk, though no greater than other children in the low normal
range at the same age (Dale, McMillan, Hayiou-Thomas, & Plomin, 2014).
In contrast, much less evidence is available for predictions from early measures
into the school years. Three studies have reported correlations across this time
span for individual differences in oral language with large, representative English-
speaking samples. Lee (2011) found that 24-month vocabulary predicted Grades
1, 3, and 5 Woodcock–Johnson Psycho-Educational—Revised (WJ-R) picture vo-
cabulary scores at .27, .27, and .25, respectively. Hayiou-Thomas et al. (2012)
found that their age 2 vocabulary–grammar composite predicted 10-year vocabu-
lary and 12-year vocabulary, syntax, figurative language, and making inferences at
.21, .18, .17, .20, and .12, respectively. Predictions from age 3 were only slightly
higher. Bornstein et al. (2014) found correlations in the range of .20–.33 from
20-month vocabulary to age 10 verbal intelligence measures. In addition, Hohm,
Jennen-Steinmetz, Schmidt, and Laucht (2007) found in a small sample of German
children that scores on the Receptive–Expressive Emergent Language Scale at 10
months were significantly related to several measures of language performance
(including tests of vocabulary, sound blending, and spelling, and a sentence repetition
test) and grades at age 11 years, but almost exclusively for girls (median r = .46;
for boys, median r = .10). The sample for this study was constructed to have high
variability in risk.
One of the most important reasons for concern about early language is its poten-
tial significance for the acquisition of reading, which in turn is the most important
determinant of educational success. Reading skills are substantially based on oral
language skills, especially but not only vocabulary (National Early Literacy Panel,
2008). However, as noted by Lee (2011), few studies of the emergence of literacy
have extended beyond the period from preschool or kindergarten through second
grade. In Lee’s study, early vocabulary predicted third grade WJ-R letter–word
identification, word attack, and passage comprehension at .22, .12, and .28, respec-
tively; at fifth grade, the correlations for letter–word identification and passage
comprehension were .23 and .25. Within the Twins Early Development Study,
Harlaar, Hayiou-Thomas, Dale, and Plomin (2008) examined predictions from
early language measures to teacher ratings of reading ability, using a nationwide
evaluation rubric based on the UK National Curriculum goals for literacy. Lan-
guage scores at 2 years predicted reading at ages 7, 9, and 10 at .23, .26, and .29,
respectively. Again, prediction from age 3 was only slightly higher.
Few studies of late talkers have followed children into the school years, and
they are all studies of small samples. The most comprehensive long-term follow-up
of children identified as late talkers below the age of 2 is Rescorla’s (2013)
longitudinal study. Children identified as late talkers at 24–31 months were com-
pared at later ages with a sample matched for age, socioeconomic status (SES),
and nonverbal ability (N = 40 and 39, respectively). They were assessed at ages
8, 9, 13, and 17 on language, cognitive, and academic achievement. In most cases,
the two groups differed significantly, with effect sizes generally in the range of
0.5 to 1.0, though sometimes even larger. Of note, later differences were observed
for all aspects of language assessed: vocabulary, grammar, narratives, and verbal
Rescorla’s study also provides the only long-term outcome information on
literacy in late talkers. Late talkers scored lower at age 8 on a composite of WJ-
R letter–word identification and passage comprehension, and on a measure of
reading comprehension but not decoding at age 13. There were no differences
on the reading measures at age 17 despite the differences that remained on oral
language measures. Rescorla notes that very few of the late talkers scored below
the 10th percentile on any of the language or reading measures at age 17.
Taken together, the limited body of evidence presented above suggests that mea-
sures as early as the third year of life have continuing predictive significance into
the second decade, though there are some exceptions to this result. Nevertheless,
there are some specific and important gaps in the present knowledge base. First,
correlational studies of the entire distribution are, to our knowledge, available
only for English (with the exception of Hohm et al.’s study of German children), and
mainly for White US, UK, or Australian populations, with exceptions such as
the National Institute of Child Health and Human Development study (National
Institute of Child Health and Human Development Early Child Care Research
Network, 2005). Languages with different structures, such as greater versus lesser
morphological complexity, might show differences in patterns of prediction, as
might societies with differing amounts of SES variability or different daycare or
school systems. Second, the primary analytic methods used have been correla-
tions and regressions, which assume a linear relationship for the prediction. It
would be useful to know if the early measures are more sensitive predictors of
outcomes in particular parts of the scale. Third, studies of late talkers are most often
based on relatively small clinically referred samples, or children parent referred as
late talkers from the general population. This recruitment process can miss many
children, perhaps especially those with a different profile. Another limitation to
this design is that there is generally not an explicit comparison with the outcomes
of other groups, such as the next 10%, the median 10%, or the top 10%. A few
studies, however, such as Hayiou-Thomas et al. (2012), identify late talkers using
a statistical criterion (usually the lowest 10%) within a population-representative
sample. Such a design has the power to overcome these limitations. Unlike the
overall correlational studies, late talker research has been conducted in several
languages, but none of them extend beyond the preschool period.
Studies of individual differences in early language development have consis-
tently found small gender differences favoring girls (Eriksson et al., 2012). Little
is known about gender differences in the stability of language and literacy, given
the absence of longitudinal studies with adequate sample size to examine gender
The present study is designed to address some of these limitations by obtaining
follow-up information in sixth grade (approximately 12 years of age) on a large and
diverse sample of Danish children who were assessed in the Danish CDI norming
study (Bleses et al., 2008). The Danish National Test system was implemented in
the public schools in Denmark in 2010. The test system includes 10 mandatory tests
in six subjects, administered individually to children from second grade through
eighth grade, and offers a unique opportunity to follow academic achievement of
Danish children through the period of compulsory schooling.
The Danish context is a particularly interesting one for the study of longi-
tudinal relationships. More than 95% of 3- to 5-year-olds are in daycare (http:// Childcare in Denmark (the
term preschool is generally not used) has traditionally, and uniformly, focused
on supporting children’s social skills rather than early academic skills. These
facts can be seen as specific aspects of the lower degree of SES variability in
Denmark compared to the United States overall (Organization for Economic
Cooperation and Development [OECD], 2014). According to 2015 data from
the OECD Centre for Opportunity and Equality (
inequality.htm), Denmark has the lowest Gini coefficient among OECD countries,
and the lowest income difference ratio of 5.2 between the richest 10% and
poorest 10% (cf. OECD mean = 9.6; United States = 18.8). This lower degree of
SES variability may entail less environmental variability and influence on individ-
ual differences overall during childhood. It is possible that longitudinal prediction
might be stronger in this context. However, there have been no published longitu-
dinal studies of Danish children from the early stages into the school years.
Finally, Danish is for other reasons a good comparison to the comprehensive
English literature because Danish and English are similar in important ways. Both
Danish and English are Germanic languages. In addition, like English, Danish has
a somewhat deep orthography, mainly due to substantial changes in the spoken
language that date back to the early Middle Ages and to a substantial influx of
foreign words and their spelling from other languages (Elbro, 2006). Furthermore,
initial reading in Danish is traditionally taught by means of a variety of instructional
methods, such as the whole-word approach, contextual cues, phonics, and easy
book reading (Elbro, 2006).
We addressed the following research questions:
1. How well does early vocabulary predict later educational outcomes, including
language, the closely related skills of literacy, and the most distantly related skills
of mathematics?
2. What is the role of SES? (a) Does prediction from early vocabulary actually
reflect the influence of SES and (b) do SES differences moderate the prediction
so that the correlations vary by SES levels?
3. How do those predictions vary with the age and gender of the child at the initial
4. Is there a nonlinearity in sensitivity, that is, regions where small CDI differences
make larger differences in prediction than in other regions?
5. What are the educational outcomes of the delayed group, and how do they com-
pare with the outcomes for the median 10% and the top 10% of early vocabulary?
Children in the current study were part of the norming study of the Danish version
of the CDI (Bleses et al., 2008), conducted in 2002 and 2003. Inclusion crite-
ria were Danish citizenship; born in Denmark; attaining the exact age of 8, 9,
10, ..., or 30 months in the first week of June 2002; and no reported speech,
hearing, or other serious (chronic) health problems. Parents of 3,714 children be-
tween 16 months and 36 months, all native speakers of Danish, completed the CDI
report and a background questionnaire addressing child and parent characteristics.
The sample was balanced with respect to gender (51% girls, 49% boys), but the
educational and occupational level of the parents was somewhat higher than the
national average (Bleses et al., 2008). For this paper, we restricted the sample to
children between 16 and 30 months (N = 2,863), both for comparability to other
CDI projects, which generally target that age range, and because the sample was
smaller at the ages above 30 months.
The Danish Ministry of Education has made National Danish test data avail-
able for researchers. In the CDI norming study, we had not registered the civil
registration number (CPR number) that all children are assigned at birth by the
Central Office of Civil Registration. However, by using child and parent names,
with the help of the Central Office of Civil Registration, via CPR numbers, we
were able to link the original CDI data for 2,120 children to Danish National Test
scores in reading and math. Even though there is some attrition in the National
Test data set (between 2.4% and 3.4% for the National Test in reading and math;
cf. Beuchert & Nandrup, 2014), the attrition in our study was due to insufficient
information on child and parent name, which made it impossible to match child
name with the CPR. The resulting sample is balanced with respect to gender (51%
girls and 49% boys), and the imbalance of SES measures of the original sample is
maintained. Compared to all Danish children 1–3 years of age based on a sample
obtained from Statistics Denmark in 2003, our sample includes a lower percentage
of parents having obtained only a basic education (primary school or high school;
4% vs. 28%) or a short vocational education (e.g., electrician; 45% vs. 52%).
Conversely, a higher percentage of parents in our sample had a medium education
(e.g., teacher; 30% vs. 11%) or a high education such as an advanced university
education (21% vs. 9%).
Assessing early expressive vocabulary. Expressive vocabulary skills were as-
sessed using the Danish adaptation (Bleses et al., 2008) of the CDI (Fenson
et al., 2007). The Danish CDI: Words and Sentences includes a 725-word vocabulary
checklist, which is organized into 22 semantic categories such as sound
effects and animal sounds, nouns (animals, vehicles, and toys), verbs, function
words, and so on. Parents mark which words they have heard their child produce.
The vocabulary checklist was used to generate a measure of the total number
of early words produced by a child. The reliability and validity of the original
CDI have been demonstrated in numerous studies, summarized in Fenson
et al. (2007) and Law and Roy (2008). Measures of internal consistency and
test–retest reliability for the CDI: Words and Sentences are generally above 0.9,
and measures of concurrent validity with examiner-administered measures are
generally between 0.7 and 0.9. For the validity of the Danish CDI, see Bleses
et al. (2008). Consistent with most other research on late talkers (Rescorla & Dale,
2013), we have defined early delay as the child’s vocabulary falling within the
lowest 10% of children at that age.
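The delay criterion can be illustrated with a minimal sketch that flags, within each age group, the children whose vocabulary count falls in the lowest 10%. The function name, the record layout, and the tie-breaking rule are assumptions made for illustration; they are not taken from the study's own procedures.

```python
from collections import defaultdict


def flag_late_talkers(records, cutoff=0.10):
    """Return ids of children in the lowest `cutoff` fraction of their age group.

    records: iterable of (child_id, age_months, words_produced) tuples.
    Ties at the cutoff are broken arbitrarily (by child_id); this is an
    assumption of the sketch, not a rule taken from the study.
    """
    by_age = defaultdict(list)
    for child_id, age_months, words in records:
        by_age[age_months].append((words, child_id))

    flagged = set()
    for group in by_age.values():
        group.sort()  # ascending by words produced
        n_flag = int(len(group) * cutoff)  # size of the lowest-10% group
        flagged.update(child_id for _, child_id in group[:n_flag])
    return flagged


# Hypothetical toy data: 20 children aged 24 months and 10 aged 18 months.
toy = [(f"a{i}", 24, 10 * i) for i in range(20)]
toy += [(f"b{i}", 18, 5 * i) for i in range(10)]
late = flag_late_talkers(toy)
print(sorted(late))
```

Because the cutoff is applied separately at each age, a child is compared only with same-age peers, which mirrors the phrase "at that age" in the definition.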
Assessing reading and math skills. The Danish National Tests, administered at
the end of the school year, are web based (i.e., students take the test online at a
computer), objective (i.e., the system chooses the items and calculates the results),
and adaptive (i.e., based on a Rasch model, each student is presented with a se-
quence of items of different difficulty levels based on performance on the previous
items where the difficulty level of the items and the pupil ability is measured on the
same logit scale). The items are multiple-choice questions with a varying number
of options. The Danish National Tests in reading evaluate students’ ability within
three areas: language comprehension (semantics of individual words, homonyms,
language use, and idioms), decoding (word identification in concatenated words
and word reading), and reading comprehension (comprehension of written texts).
All tasks within the three areas require some text reading, but for language comprehension,
a picture task without text was used as well. The Danish National
Tests in math evaluate students’ ability in numbers and algebra, geometry, and
mathematics in use, but we analyzed only the numbers and algebra score in the
present paper because these are symbolic systems, which seem more comparable
to language than geometry or mathematics in use. This test evaluates students’
knowledge of integers, decimals, and fractions, and their skills in arithmetic,
percentage calculations, and ability to read graphs.
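For readers unfamiliar with Rasch-based adaptive testing, the following sketch shows the standard Rasch item response function, in which pupil ability and item difficulty lie on the same logit scale. The item-selection rule shown (choose the unused item whose difficulty is closest to the current ability estimate) is a common simplification and is assumed here for illustration only; it is not a description of the algorithm used by the Danish National Tests.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the Rasch model (logit scale)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def pick_next_item(ability_estimate, item_difficulties, used):
    """Illustrative selection rule: the unused item closest in difficulty
    to the current ability estimate (an assumption of this sketch)."""
    candidates = [i for i in range(len(item_difficulties)) if i not in used]
    return min(candidates, key=lambda i: abs(item_difficulties[i] - ability_estimate))


# Hypothetical item bank with difficulties in logits.
bank = [-1.0, 0.0, 0.5, 2.0]
first = pick_next_item(0.3, bank, used=set())
second = pick_next_item(0.3, bank, used={first})
print(first, second, round(rasch_probability(0.3, bank[first]), 3))
```

When ability equals difficulty, the model gives a success probability of .5, which is why items near the current estimate are the most informative under this model.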
The Danish National Tests provide a valid estimate of student abilities (Beuchert
& Nandrup, 2014). For example, lower scores are associated with low birth weight,
special education needs, and lower SES; all together, student and parental back-
ground explain approximately 13%–21% of the variance in performance. Ninth-
grade school examinations provided an external measure of validity; when they
were regressed on the same-subject National Test data, the earlier national test
results explained 48%–49% of the Danish and math examination marks (Beuchert
& Nandrup, 2014).
Analytic strategy
Multiple regression analysis was used to determine the prediction from the early
vocabulary measure, together with age, gender, and SES, to the four educational
outcomes of language comprehension, decoding, reading comprehension, and
numbers and algebra, taken from the national sixth-grade test. SES was mea-
sured by grouping children into four categories based on their parents’ educa-
tional status as described earlier (see Participants). The regression analysis de-
termined the effect of age, gender, and SES on mean levels of performance, as
well as evaluating the unique variance contributed by vocabulary and SES. To
address the question of whether prediction was more accurate for some subgroups
than others, supplementary correlations were calculated by age, gender, and SES
To explore the nature and possible nonlinearity of the vocabulary-outcome
measure relationships, we obtained quantile curves in the following way. For each
value on a grid between –2.1 and 2.1 with a spacing of 0.025, we considered the
500 nearest neighbors, that is, the 500 children closest with their CDI score to the
given value. Within these 500 children, we determined the empirical 10%, 25%,
50%, 75%, and 90% quantile of the results from the National Test. For each of the
five quantile levels, we obtained in this way a curve of quantile values versus CDI
score values. This curve was smoothed by a Lowess smoother with a bandwidth of
0.2. In the corresponding scatterplots, extreme values of the National Test results
were set to the upper and lower boundaries of the range shown on the y-axis and
extreme values of the z score were set to –3 or 3, respectively. The middle (median)
curves in each figure are the most revealing of the relationship.
Table 1. Hierarchical regression with age and gender in Step 1, CDI vocabulary
or SES in Step 2, and age, gender, SES, CDI in Step 3 for the four National Test

                           Step 1:         Step 2:         Step 2:         Step 3:
                           Age & Gender    CDI             SES             All
                           R      R²       R      R²       R      R²       R      R²
Danish tests
  Language comprehension   .072   .005     .217   .047     .221   .049     .297   .088
  Decoding                 .145   .021     .256   .065     .252   .063     .324   .105
  Reading comprehension    .109   .012     .222   .049     .275   .076     .332   .110
Math test
  Numbers & algebra        .019   .000     .138   .019     .237   .056     .270   .073

Note: CDI, MacArthur–Bates Communicative Development Inventories; SES, socioeconomic
status; R, standardized regression coefficient.
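The nearest-neighbor quantile curves described in the analytic strategy can be sketched as follows. The grid (–2.1 to 2.1 in steps of 0.025), the 500 nearest neighbors, and the five quantile levels follow the text; the function names, the interpolation rule for empirical quantiles, and the synthetic data are assumptions made for illustration. The final Lowess smoothing step is omitted; it could be applied to each curve afterwards with any Lowess routine (for example, the one in statsmodels).

```python
import random


def empirical_quantile(sorted_vals, q):
    """Empirical quantile of a sorted list, with linear interpolation."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def knn_quantile_curves(z, y, k=500, grid_lo=-2.1, grid_hi=2.1, step=0.025,
                        levels=(0.10, 0.25, 0.50, 0.75, 0.90)):
    """For each grid value, take the k children whose CDI z score is closest
    and compute quantiles of their outcome scores (unsmoothed curves)."""
    pairs = list(zip(z, y))
    n_steps = round((grid_hi - grid_lo) / step)
    grid = [grid_lo + i * step for i in range(n_steps + 1)]
    curves = {q: [] for q in levels}
    for g in grid:
        nearest = sorted(pairs, key=lambda p: abs(p[0] - g))[:k]
        outcomes = sorted(p[1] for p in nearest)
        for q in levels:
            curves[q].append(empirical_quantile(outcomes, q))
    return grid, curves


# Synthetic stand-in data (not the study data): outcome weakly related to z.
random.seed(1)
z_scores = [random.gauss(0.0, 1.0) for _ in range(2000)]
outcome = [0.25 * zi + random.gauss(0.0, 1.0) for zi in z_scores]
grid, curves = knn_quantile_curves(z_scores, outcome)
print(len(grid), round(curves[0.50][0], 2), round(curves[0.50][-1], 2))
```

Note that near the extremes of the grid the 500 nearest neighbors are necessarily drawn from one side only, which is one reason the curves flatten there.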
To provide a visual comparison of the distribution of outcomes on each of the
sixth-grade language/literacy measures for children with early vocabulary delay
(bottom 10%, N = 209) with those for children in the median 10% (45th–55th
percentile, N = 215) and the top 10% (N = 212), histograms for the three groups
on that measure were superimposed.
To examine the first research question, how well early vocabulary predicts lan-
guage, literacy, and mathematical outcomes, multiple regression analyses were
performed, and summarized in Table 1. When CDI vocabulary was added as a
predictor, it accounted for an additional 4.2%, 4.4%, and 3.7% of the variance in
the language and literacy measures, but only 1.9% of the mathematics measure.
When SES was added prior to CDI vocabulary, the figures were comparable, but
somewhat higher for reading comprehension and mathematics. When both were
added, the total variance accounted for was nearly as large as the sum of the two
individual variances described above. These results, suggesting largely unique
effects for vocabulary and SES, are consistent with the findings (not included
in the table) that SES is almost completely uncorrelated with CDI vocabulary
(r = .032), although it is correlated with the later measures: with language comprehension
(r = .209), with decoding (r = .205), with reading comprehension
(r = .252), and with numbers and algebra (r = .236).
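The incremental variance figures reported above can be reproduced directly from the R² values in Table 1 by subtracting the Step 1 value from each later step. The short sketch below does this arithmetic; the dictionary layout and function name are illustrative only.

```python
# R-squared values transcribed from Table 1:
# (Step 1 age & gender, Step 2 CDI, Step 2 SES, Step 3 all predictors).
r_squared = {
    "language comprehension": (0.005, 0.047, 0.049, 0.088),
    "decoding": (0.021, 0.065, 0.063, 0.105),
    "reading comprehension": (0.012, 0.049, 0.076, 0.110),
    "numbers and algebra": (0.000, 0.019, 0.056, 0.073),
}


def increments(step1, cdi, ses, full):
    """Variance added beyond age and gender by CDI, by SES, and by both."""
    return {"cdi": cdi - step1, "ses": ses - step1, "both": full - step1}


for outcome_name, values in r_squared.items():
    inc = increments(*values)
    print(
        f"{outcome_name}: CDI +{inc['cdi']:.3f}, SES +{inc['ses']:.3f}, "
        f"sum {inc['cdi'] + inc['ses']:.3f} vs both {inc['both']:.3f}"
    )
```

For language comprehension, for example, CDI adds .042 and SES adds .044, which sum to .086, while the two together add .083; the small gap is what indicates largely non-overlapping contributions.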
Table 2. Longitudinal prediction of r (N) values for four National Test scores by CDI
vocabulary for selected subgroups

                     Danish Language/Literacy Tests
                     Language                       Reading         Numbers &
Subgroup             Comp.          Decoding        Comp.           Algebra
Age (months)
  16–20              .186** (632)   .162** (632)    .129** (632)    .114** (635)
  21–25              .188** (707)   .246** (707)    .200** (632)    .120** (703)
  26–30              .242** (646)   .226** (646)    .249** (646)    .180** (650)
Gender
  Girls              .185** (1025)  .198** (1025)   .182** (1025)   .149** (1028)
  Boys               .229** (960)   .229** (960)    .209** (960)    .124** (960)
SES (par. educ.)
  Basic              .320** (85)    .203 (85)       .376** (85)     .149 (83)
  Shorter            .190** (896)   .202** (896)    .188** (896)    .116** (896)
  Medium             .236** (590)   .238** (590)    .205** (590)    .178** (593)
  High               .156** (414)   .183** (414)    .140** (414)    .094 (416)

Note: CDI, MacArthur–Bates Communicative Development Inventories; SES, socioeconomic
status.
To address the second part of research question two as well as question three,
supplementary correlation analyses were conducted. To explore potential changes
in the degree of prediction as children grow from 16 to 30 months, correlations
were computed within each of three age ranges: 16–20, 21–25, and 26–30 months
(see Table 2). The table also includes correlations separately for boys and girls, and
stratified by the four SES categories. The correlations generally grow modestly
across the age groups, with the single exception of decoding between the 21–25
and 26–30 month groups. For the language and literacy measures, but not
the mathematics measure, the correlations are higher for boys than for girls. With
respect to SES, the lowest correlations were observed in families with high parental
education, but there was little other pattern.
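Subgroup correlations of the kind summarized in Table 2 can be computed by splitting the sample on a grouping variable and calculating Pearson's r within each stratum. The sketch below is a minimal illustration; the record layout, field names, and toy values are hypothetical.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlations_by_subgroup(rows, key):
    """Return {subgroup: (r, N)} for the CDI-outcome correlation in each stratum.

    rows: list of dicts with 'cdi', 'outcome', and the grouping field `key`.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append((row["cdi"], row["outcome"]))
    return {
        name: (pearson_r([c for c, _ in pairs], [o for _, o in pairs]), len(pairs))
        for name, pairs in groups.items()
    }


# Hypothetical toy records, chosen so the expected correlations are obvious.
toy_rows = [
    {"gender": "girl", "cdi": 1, "outcome": 2},
    {"gender": "girl", "cdi": 2, "outcome": 4},
    {"gender": "girl", "cdi": 3, "outcome": 6},
    {"gender": "boy", "cdi": 1, "outcome": 3},
    {"gender": "boy", "cdi": 2, "outcome": 2},
    {"gender": "boy", "cdi": 3, "outcome": 1},
]
by_gender = correlations_by_subgroup(toy_rows, "gender")
print(by_gender)
```

Reporting N alongside each r, as in Table 2, matters because small strata (such as the basic-education group) yield much less precise estimates.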
To investigate the fourth research question, whether the predictive discrimination
of small differences varies with the level of CDI vocabulary score, we inspected
smoothed quantile curves of the National Test outcomes in language and literacy
as a function of the z score (Figure 1). We restricted this analysis to children of 21
months and above, due to the very weak (“trivial” in Cohen’s system) prediction
from vocabulary below this age.
The curves, especially that for Language comprehension, appear to change in
slope over the range of the predictor. However, the apparently flat slopes at the
extremes of the curves are based on small numbers of subjects. Spline regression
analyses were performed to determine if there were inflection points (“knots”)
within the main body of the data (±2SD) where the slope of a linear regression
line changed significantly. For Language comprehension, there was a significant
decrease in the slope at approximately z = 0.01, showing greater sensitivity in
the below-average range than in the above-average range. For both decoding and
reading comprehension, no such inflection points were detected.
Figure 1. Scatterplots of three sixth-grade National Test results versus CDI z scores obtained at 21–30 months and smoothed 10%,
25%, 50%, 75%, and 90% quantile curves.
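A linear spline with a single knot, of the kind used to look for a change in slope, can be written as y = b0 + b1·z + b2·max(0, z − knot), where b2 is the change in slope at the knot. The sketch below fits this model by ordinary least squares using only the standard library. It estimates the slope change for a given knot but does not search for the knot or test significance, both of which the analysis reported here would additionally require; all names and the noiseless toy data are assumptions for illustration.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_spline(z, y, knot):
    """Least-squares fit of y = b0 + b1*z + b2*max(0, z - knot).

    Returns [b0, b1, b2]; b1 is the slope below the knot and b1 + b2
    the slope above it.
    """
    rows = [[1.0, zi, max(0.0, zi - knot)] for zi in z]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Noiseless toy data with a known decrease in slope at z = 0.
z_grid = [i / 10 for i in range(-20, 21)]
y_toy = [1.0 + 0.4 * zi - 0.3 * max(0.0, zi) for zi in z_grid]
b0, b1, b2 = fit_linear_spline(z_grid, y_toy, knot=0.0)
print(round(b0, 3), round(b1, 3), round(b2, 3))
```

A negative b2, as in the toy data, corresponds to the pattern described for language comprehension: a steeper relationship below the knot than above it.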
Finally, examining the fifth research question, concerning outcomes for the
delayed group (lowest 10%) in comparison to outcomes for a median and top
group, Figure 2 presents histograms for the three groups for each of the language
and literacy measures.
Consistent with the correlations, children in the lowest group were more likely
to show low scores later, while verbally precocious children (top 10%) were more
likely to show high scores later. The results for the median 10% show a more
uniform distribution across the range.
Children in the early delay group had an elevated chance (22%, 19%) of remain-
ing in the lowest 10% later on language comprehension and decoding, respectively,
though not for reading comprehension (10.5%). However, a subgroup of these chil-
dren showed good development later; 40%, 34%, and 36% were in the upper half
on Language comprehension, decoding, and reading comprehension, a result that
is consistent with much other research on outcomes of very early language delay
(Rescorla & Dale, 2013). Examination of the prediction for children in the top
group shows that obtaining a very high vocabulary score early does not necessarily
imply high performance in school later. The probability of remaining in the initial
high category is even lower than for children with early delay remaining in the
lowest group: 7.5%, 8%, and 5%, respectively. However, it is very unlikely that
these children will experience problems and end up in the bottom 10%: 3.5%, 5%,
and 1.5% for the three measures, respectively.
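The group comparisons above amount to a cross-tabulation: children are grouped by their early vocabulary percentile, and the share of each group falling into each decile of the later outcome is computed. A minimal sketch follows; the function names, the percentile-rank convention, and the synthetic data are assumptions for illustration and do not reproduce the study's data.

```python
import random


def percentile_ranks(values):
    """Percentile rank (0 <= p < 100) of each value within the sample.
    Ties are broken by original order (an assumption of this sketch)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for position, i in enumerate(order):
        ranks[i] = 100.0 * position / len(values)
    return ranks


def outcome_decile_distribution(early, outcome, lo_pct, hi_pct):
    """Percentage of the children whose early percentile lies in
    [lo_pct, hi_pct) that fall in each outcome decile (index 0 = lowest)."""
    early_pct = percentile_ranks(early)
    outcome_pct = percentile_ranks(outcome)
    counts = [0] * 10
    for e, o in zip(early_pct, outcome_pct):
        if lo_pct <= e < hi_pct:
            counts[int(o // 10)] += 1
    total = sum(counts)
    return [100.0 * c / total for c in counts]


# Synthetic stand-in data with a weak early-to-late relationship.
random.seed(2)
early_scores = [random.gauss(0.0, 1.0) for _ in range(2000)]
late_scores = [0.22 * e + random.gauss(0.0, 1.0) for e in early_scores]
bottom = outcome_decile_distribution(early_scores, late_scores, 0, 10)
median = outcome_decile_distribution(early_scores, late_scores, 45, 55)
top = outcome_decile_distribution(early_scores, late_scores, 90, 100)
print([round(p, 1) for p in bottom])
```

If early vocabulary carried no information about the outcome, each entry would be close to 10%; departures from that uniform baseline are what the histograms in Figure 2 display.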
To address the first research question of how well early vocabulary predicts later
educational outcomes, we conducted regression and correlation analyses predicting
sixth-grade outcomes in Danish language and reading as well as math from
vocabulary scores at 16–30 months of age. The r² values from the regression and the
simple correlations both indicate a very modest degree of prediction, at best reaching
the level of “weak” in terms of Cohen’s guidelines. The interval between early
vocabulary assessment and outcome measures constituted more than three quar-
ters of the children’s lives, and undoubtedly comprised a host of family, school,
health, and other sources of variability. As expected, the prediction was stronger
for language and literacy outcomes than for the math measure. To our knowledge,
this is the first documentation of academic achievement prediction from language
measured at this young an age in a large and diverse sample. It is also notable that the
prediction is significant across the board for both boys and girls (cf. Hohm et al.,
2007).
It is also illuminating to consider these results in the context of other research
on early language and emergent literacy measures and their prediction to later
academic outcomes. A comprehensive meta-analysis was conducted as part of
the report of the National Early Literacy Panel (2008; see chap. 2). Consid-
erably stronger predictions, in the range of 0.4 to 0.6, were found for some
measures, particularly aspects of print awareness and phonological awareness.
Measures of oral language yielded weaker correlations, typically around .3–.35.
Figure 2. Charts showing the percentage of children in deciles 1–10 on the Danish National
Test in sixth grade for three groups of children differing in CDI score as 16- to 30-month-olds,
namely, bottom 10%, median 10% (45%–55%), and top 10% on the CDI.
Furthermore, within the domain of oral language, vocabulary was less predic-
tive than other measures such as grammar. Lonigan and Shanahan (2010) argue
that these findings show that oral language is necessary for literacy, but far from
sufficient. It should also be kept in mind that the overwhelming majority of the
studies summarized by the National Early Literacy Panel included children 4
years of age and older. Though there was little evidence that the predictions
from 4-year-olds and those from kindergartners were different, it is likely that
prediction from 2-year-olds would be different. The correlations obtained here
(.13 to .25) are similar to those obtained in more recent research in English fol-
lowing the National Early Literacy Panel report, and discussed earlier by Lee
(2011), Harlaar et al. (2008), and Hayiou-Thomas et al. (2012), even
though we examine a different language and consider a longer time span. These
results thus also suggest that the relatively smaller SES variability in Denmark
compared to the United States and Danish children’s almost 100% daycare attendance
are not enough to eliminate long-term effects of variability in early language
What is the theoretical and applied significance of these correlations? From a
theoretical perspective, it is important to acknowledge that these correlations are
not necessarily evidence of causality. Early vocabulary differences could causally
initiate a cascade of differences in grammar, pragmatics, metalinguistic awareness,
literacy, and intellectual development. However, early vocabulary differences and
later academic differences could be independent manifestations of underlying
abilities or of general neuropsychological functioning (see, e.g., Scarborough,
From an applied perspective, these weak correlations are not strong enough
to justify clinical decision making on their own. At present, late talking in
children below 2 years 6 months is not sufficient in itself to justify speech–
language therapy (Rescorla & Dale, 2013). However, based on promising results
of early low-cost interventions in homes (e.g., van Steensel, McElvany, Kurvers,
& Herppich, 2010) and by increasing the instructional quality in daycares (e.g.,
Keys et al., 2013; National Institute of Child Health and Human Development
Early Child Care Research Network, 2005), the results do warrant that late talk-
ers (in particular if late talking is combined with other indices of increased risk,
such as family history of language/literacy learning difficulties, low SES, and
poor comprehension) receive some kind of increased educational support in their
homes or in daycares.
A further perspective on the magnitude of these predictive correlations, weak
but significant given the large sample size, is provided by a comparison with other
variables that are related to educational outcomes at this stage of development,
particularly reading. Our observed correlations in the range of .13 to .25 are
comparable in size, based on percentage of variance accounted for, to a Cohen’s
d effect size measure for an intervention of 0.26 to 0.52, that is, the small to
moderate range. Utilizing summary statistics provided by Lipsey et al. (2012)
based on American studies, we can say that this effect size is comparable to 1
year’s mean growth in reading scores (Grades 5–6, d = 0.32; Grades 6–7, d =
0.23). This is somewhat smaller than the eligible–ineligible for free/reduced lunch
difference on the National Assessment of Educational Progress scores (d = –0.45
to –0.74), and smaller still than ethnic differences (Black/Hispanic/White) on the
National Assessment of Educational Progress (d = –0.53 to –0.83). These statistics
are a reminder of how little we can explain of individual differences in reading
overall at present.
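The correspondence between the correlations of .13 to .25 and the d values of 0.26 to 0.52 can be checked with a short calculation. The sketch below assumes the standard conversion d = 2r / sqrt(1 - r^2), which holds for two groups of equal size; the text does not state which conversion was used, so this formula and the function name r_to_d are assumptions made for illustration only.

```python
import math


def r_to_d(r: float) -> float:
    """Convert a correlation r to Cohen's d.

    Assumes the standard equal-group-size conversion
    d = 2r / sqrt(1 - r^2); this is an assumption, not a
    formula quoted from the article.
    """
    return 2 * r / math.sqrt(1 - r ** 2)


# The two ends of the correlation range reported in the text.
for r in (0.13, 0.25):
    variance_explained = r ** 2  # proportion of variance accounted for
    print(f"r = {r:.2f}, r^2 = {variance_explained:.4f}, d = {r_to_d(r):.2f}")
```

Under this assumption, the two ends of the reported correlation range map onto d values of 0.26 and 0.52, matching the range given in the text.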
For our second research question, we wanted to investigate the role of SES in
predicting later educational outcomes. Three aspects of the results suggest that
SES and CDI vocabulary play largely independent roles. First, CDI vocabulary itself
is not significantly correlated with SES. Second, adding SES as a predictor (Table 1)
after CDI vocabulary had nearly as large an effect on the prediction as it had
when it was entered prior to vocabulary. Finally, examination of prediction from
vocabulary within the four SES categories showed little variation. Together, these
results indicate that early vocabulary is an SES-independent predictor. However,
the generalizability of this conclusion is unclear because it rests upon the lack of a
correlation between SES and early CDI vocabulary scores. In another Danish study
with a more representative sample in terms of parental education, substantial SES-
related differences in child outcomes were observed (Bleses, Højen, Jørgensen,
Jensen, & Vach, 2010); that is, the lack of SES differences in the current study
could be due to the biased nature of the sample.
The third research question concerned how these predictions vary with the age at
first measurement and gender. There is a growing consensus that early vocabulary
measures are credible sources of information about child language development. Our
results show that a vocabulary measure taken as early as the second year of life is
a significant predictor of language and literacy and, to a lesser extent, of math skills. With
respect to gender, we found higher correlations for boys than for girls for all three
Danish language outcome measures. We had made no predictions on this question,
and the result is opposite to that of Hohm et al. (2007); further research is needed
to clarify the reasons for this difference.
Examination of scatter plots and smoothed quantile curves investigating the
fourth research question suggested that the lower range of z scores is more sensitive
with respect to long-term prediction, but spline regression analysis confirmed this
statistically only for language comprehension.
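The idea of testing whether prediction is stronger in the lower score range can be illustrated with a linear spline that has a single knot. The sketch below is only one possible form of such an analysis: the knot at z = 0, the function names, and the synthetic, noise-free data are all assumptions made for illustration and are not taken from the study.

```python
def solve(a, b):
    """Solve a small linear system a*x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_spline(x, y, knot=0.0):
    """Least-squares fit of y = b0 + b1*x + b2*max(x - knot, 0).

    Returns (slope below the knot, slope above the knot).
    The knot location is a hypothetical choice, not the study's.
    """
    rows = [[1.0, xi, max(xi - knot, 0.0)] for xi in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    _, slope_below, change = solve(xtx, xty)
    return slope_below, slope_below + change


# Synthetic illustration: a steeper slope below z = 0 than above it.
z = [i / 10 for i in range(-20, 21)]
outcome = [0.4 * zi if zi < 0 else 0.1 * zi for zi in z]
low, high = fit_linear_spline(z, outcome)
print(f"slope below knot: {low:.2f}, slope above knot: {high:.2f}")
```

In a fit of this kind, the coefficient on the max(x - knot, 0) term measures the change in slope at the knot, so a test of whether it differs from zero is what would confirm or fail to confirm a difference in prediction between the lower and upper ranges.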
Finally, addressing the fifth research question, we examined the outcomes of
early delay. This is one of the very first studies of the long-term academic outcomes
of a large and diverse sample of late talkers. In contrast to the findings of Rescorla
(2013), children with an early delay were more likely to continue to have problems
(overrepresentation in both the bottom 10% and the bottom half of the distribution),
although it was also true that about a third of the late talkers moved into the upper
half of the distribution. Precocious children were not as likely to remain in the top
10% group; however, their chances of ending up with reading problems were low.
Limitations and future directions for research
All the children in this study were monolingual Danish speakers at the time of
initial assessment. As societies become increasingly multilingual, there is a strong
need for similar kinds of longitudinal information for children who are growing
up in a bilingual context. For these children, it is even more challenging to make
clinical decisions about language impairment or risk of language impairment.
The main finding of this study is that educational outcomes can be predicted from
a very early (below 2 years) measure of language development, namely, the size
of the parent-reported productive vocabulary, with effect sizes (in proportion of
variance accounted for) comparable to 1 year’s mean growth in reading scores.
However, there is great variability in outcome. Supplementation with other rel-
evant information, such as family history of learning difficulties, low SES, and
poor comprehension, may somewhat improve the prediction. At the same time,
intervention studies have demonstrated that children’s language development can
be accelerated by parent-based interventions in the toddler and early preschool
range or by increasing the instructional quality of daycares. Such low-cost, and
ideally low-stress, interventions may be highly appropriate responses to late talking.
Nevertheless, more basic research on the mechanisms by which early vocabulary
differences result in later academic differences is needed. Understanding which associations are
truly causal versus reflecting common underlying factors is essential for selecting
optimal targets for intervention.
We extend warm thanks to parents for their contribution to the original CDI study. The
MBCDI study was funded by a grant from the Carlsberg Foundation and the Research
Council for the Humanities.
