Language, cognition, and short-term memory in individuals with Down syndrome

The developmentally emerging phenotype of language and cognition in individuals with Down syndrome is summarized on the basis of the project's prior work. Identified are a) the emerging divergence of expressive and receptive language, b) the emerging divergence of lexical and syntactic knowledge in each process, and c) the emerging divergence within cognitive skills of auditory short-term memory and visuospatial short-term memory from other visuospatial skills. Expressive syntax and auditory short-term memory are identified as areas of particular difficulty. Evidence for the continued acquisition of language skills in adolescence is presented. The role of the two components of working memory, auditory and visual, in language development is investigated in studies of narrative and longitudinal change in language skills. Predictors of individual differences during six years of language development are evaluated through hierarchical linear modelling. Chronological age, visuospatial short-term memory, and auditory short-term memory are identified as key predictors of performance at study entry, but not of individual change over time, for expressive syntax. The same predictors account for variation in comprehension skill at study outset, and change over the six years can be predicted by chronological age and the change in visuospatial short-term memory skills. (Research funded by US National Institutes of Health Grant R01-HD23352 with contributions from the National Down Syndrome Society.)


Chapman, R, and Hesketh, L. (2001) Language, cognition, and short-term memory in individuals with Down syndrome. Down Syndrome Research and Practice, 7(1), 1-7. doi:10.3104/reviews.108


Children with Down syndrome have been described as having phenotypically distinct behavioral patterns in language and cognition, and the specific patterns appear complementary to patterns observed in Williams syndrome: high expressive language and auditory short-term memory skills in individuals with Williams syndrome, relative to nonverbal cognition; and low expressive language and auditory short-term memory skills in individuals with Down syndrome, relative to nonverbal cognitive skills. These differing patterns implicate auditory short-term memory as a possible causal correlate of expressive language skill, when language production and comprehension diverge; or, alternatively, as a consequence of the expressive language skill.

Here we explore these questions more fully in summarizing our studies with children, adolescents, and young adults with Down syndrome, asking:

  1. What is the specific behavioral phenotype, in more detail?
  2. Is auditory short-term memory as delayed as expressive language skill?
  3. Is this delay, following the Baddeley model of working memory (1992), attributable to a slower speaking rate, or to poorer serial memory skill than in individuals matched for auditory memory?
  4. Does auditory short-term memory predict expressive language skill?
  5. Does auditory memory continue to predict language skill when other variables that we know are related to the development of language skills are added to the models, such as chronological age, hearing, and spatial cognitive skills (Chapman, Schwartz & Kay-Raining Bird, 1991; Chapman, Seung, Schwartz, & Kay-Raining Bird, 2000)?

The behavioral phenotype in Down syndrome

In recent reviews of the literature (Chapman & Hesketh, 2000; Chapman, 1995, 1997a, 1997b), we have summarized our own and others' evidence for a specific behavioral phenotype associated with Down syndrome, or Trisomy 21, that emerges developmentally. Thus, the chronological age of participants will be a crucial factor in the pattern of language and cognitive skills that is observed; Table 1 summarizes this emergence by age.



  • Delayed development of:
      • inhibitory processes in learning
      • sensory-motor cognition
      • canonical babbling
  • Fewer nonverbal requests than mental age controls
  • Slowed acquisition of spoken vocabulary relative to comprehension

  • Interest in face-to-face social interaction
  • Gestural communication
  • Visual memory

  • Deficits in:
      • auditory short-term memory compared to mental age
      • communication skills relative to activities of daily living and socialisation
      • emergence of spoken sentences compared to mental age
  • More omission of:
      • grammatical function words compared to language production level
      • verbs compared to language production level
  • Language input less likely to contain:
      • internal state words
  • Errors of sound production more variable
  • Intelligibility problems

  • Comprehension keeps up with nonverbal cognition

  • Deficits in:
      • working memory: executive functions in verbal and visual, traceable to problems in inhibiting currently active schemas; auditory and visual short-term stores
  • Expressive language delay compared to mental age and comprehension levels
  • Syntax comprehension delay emerges, compared to mental age and vocabulary comprehension
  • Sentence structure more delayed than vocabulary in both comprehension and production

  • Comprehension vocabulary can exceed nonverbal cognition, with experience
  • Language learning continues throughout late adolescence and young adulthood in both comprehension and production
  • Almost half of individuals studied acquire literacy
  • Intelligibility improves with chronological age and continued therapy work

Table 1 | Summary of the Developmentally Emerging Phenotype of Communication in Down Syndrome (based on Chapman and Hesketh, 2000). Note that degrees of delay vary widely.

Infancy and Toddler Years

During infancy and the toddler years, cognitive learning delays accelerate with age, the transition from babbling to speech is slower, and intelligibility is poorer when speech emerges. Nonverbal requesting is less frequent than one would expect, given the level of cognitive skills; work reported by Moore and Oates (2000) suggests, however, that requesting emerges at the expected time. First single words and first two-word combinations may also emerge at expected cognitive stages, but expressive language development is slowed thereafter, in both cumulative vocabulary and development of syntax as indicated by mean utterance length. Cumulative vocabulary is delayed even when signing skills are taken into account. (Parenthetically: sign language is often used in language therapy with these children to bridge the period when children are attempting communication but lack intelligibility.) Comprehension skills, in contrast, are commensurate with nonverbal mental age.


In childhood, specific deficits in verbal short-term memory become apparent. Speech development includes a longer period of phonological errors and more variability, as well as poorer intelligibility, which is associated in part with hearing status. Expressive language delays continue relative to comprehension and cognition. In terms of adaptive behavior, individuals with Down syndrome show fewer behavior problems than control groups with other cognitive disabilities. Problems that do occur, such as anxiety, depression, and withdrawal, may become more evident with increasing age.


In adolescence, the specific deficits in verbal short-term memory continue, and specific working memory deficits become evident as individuals are able to do the tasks that reveal them, such as backward digit recall. Further, tasks of spatial cognition that involve delayed recall and reconstruction of visual sequences, as in the Bead Memory subtest of the Stanford-Binet (Thorndike, Hagen & Sattler, 1986), become selectively harder than tasks requiring visual analysis and reproduction of spatial models, as in the Pattern Analysis subtest of the Stanford-Binet. Thus a problem arises in matching the group cognitively, as one might want to do in comparisons with other syndromes. A match made on the basis of Pattern Analysis, in adolescence, would lead to a mismatch on the basis of Bead Memory; and, of course, to the extent that cognitive tasks relied on working memory and verbal memory, additional mismatch could arise.

In speech, more variability in fundamental frequency, rate control, and placement of sentential stress is seen. Intelligibility, in our studies, improves with chronological age and hearing status (Chapman et al., 2000). In expressive language, narrative language syntax and vocabulary continue to be delayed relative to cognition, with the deficit in syntax greater than the deficit in the lexicon (Chapman, Seung, Schwartz & Kay-Raining Bird, 1998). Comprehension of words is typically more advanced than cognitive level would predict, however, and better than syntax comprehension (Chapman, Schwartz & Kay-Raining Bird, 1991). We've interpreted this difference to reflect the greater life experience, and opportunities for vocabulary learning, in the group with Down syndrome. Syntax comprehension, in contrast, begins to lag behind nonverbal cognition, when the mean of the two Stanford-Binet subtests is used as an index of cognition (Chapman et al., 1991).

Continued Learning in Adolescence

Despite the divergence of comprehension and production skills, language learning continues in both processes for older adolescents and young adults (Chapman et al., 1991; 1998). In particular, language production skills do not stop developing with the onset of adolescence, or plateau at simple sentence structure, as studies with small numbers have suggested (Fowler, Gelman & Gleitman, 1994). The view that language development plateaued arose both because of the wide individual differences in adolescents' rate of progress in expressive language, and because language samples did not always include the narrative tasks that reveal higher levels of complex sentence construction (Chapman, 1999; Chapman et al., 1998).

Cognitive-Language Interactions

We have learned, in recent work, that expressive language in adolescents' narratives differs in some important ways from that of typically developing children matched for language level by Mean Length of Utterance (MLU). In sentence structure, grammatical words are more often omitted (Chapman et al., 1998), and verbs are more likely to be omitted than other content words (Hesketh & Chapman, 1998). Complex sentence structure is as likely to be found as in the control group narratives, however (Thordardottir, Chapman, & Wagner, in press). In story content, a very different picture emerges; in recounting a 6-minute wordless film of The Pear Story, individuals with Down syndrome include as many episodes as the control group matched for comprehension skills, despite the more limited language in which events are described (Boudreau & Chapman, 2000). And in narrating Frog, Where Are You?, a wordless picture book, adolescents with Down syndrome report elements of plot and search theme as frequently as the comprehension-matched group, and more frequently than the MLU-matched controls (Miles & Chapman, in preparation).

Content versus Form

Thus, the content that children with Down syndrome are working to express is more similar to their language comprehension levels than to the formal aspects of their expressive language. [Parenthetically, Bellugi and her colleagues (Bellugi, Lichtenberger, Jones, Lai, & St George, 2000) have suggested that the content of the more elaborated language of children with Williams syndrome is more in line with cognitive levels than language levels.] These observations underscore the multiple factors that contribute to children's talk, and the potential divergence of content and form within a process domain. We might expect that auditory memory would correlate with form, rather than content, in these divergences.

Auditory Short-Term Memory and Expressive Language Deficits

What is the evidence that auditory short-term memory skills are commensurate with expressive language deficits, as indicated by MLU? We have found that typically-developing children matched for MLU to a Down syndrome group had equivalent auditory short-term memory spans (Seung & Chapman, 2000), as indexed by the Digit Span task of the Illinois Test of Psycholinguistic Abilities (Kirk, McCarthy & Kirk, 1968). Thus, the delays in expressive language and auditory short-term memory are of similar magnitude.

Auditory Short-Term Memory and Speaking Rate

In Baddeley's model of working memory (1992), an articulatory loop refreshes a phonological store. The amount of material in this rehearsal loop corresponds roughly to a 2-second window of speaking; as speaking rate increases with chronological age, the amount that can be rehearsed increases. Thus, it might be the case that auditory short-term memory deficits arise because individuals with Down syndrome are speaking not only in shorter sentences, but more slowly, than one would expect. Seung and Chapman (2000) investigated this possibility in acoustic measurements of speaking rate in the digit span task, and showed that speaking rates did not account for differences in auditory short-term memory span; indeed, they did not differ across comparison groups, although latency of response did.
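The rehearsal-window account described above can be made concrete with a small worked relation. This is an illustrative sketch only: the symbols r (speaking rate) and T (window length) and the example rates are assumptions introduced here, not values reported in our studies.

```latex
% Span as the amount of speech that fits in the rehearsal window.
% r = speaking rate (items per second), T = window length (roughly 2 s).
\[
  \text{span} \approx r \times T, \qquad T \approx 2\ \text{s}
\]
% Hypothetical rates, for illustration only:
\[
  r = 2\ \text{items/s} \;\Rightarrow\; \text{span} \approx 4\ \text{items},
  \qquad
  r = 3\ \text{items/s} \;\Rightarrow\; \text{span} \approx 6\ \text{items}
\]
```

On this account, a shorter span should be accompanied by a slower speaking rate. That is the prediction the acoustic measurements failed to support, since speaking rates did not differ across comparison groups.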

Auditory Short-Term Memory and Serial Processing

Rosin, Swift, Bless, and Vetter (1988) proposed that problems in language and speech reflected a more general problem with serial processing. Kay-Raining Bird and Chapman (1994) investigated this proposal by asking whether memory for order of items was selectively impaired, for digit span matched groups, in tasks of digit span recall, story recall, and the Stanford-Binet Bead Memory task of recreating a string of beads after removal of the model and a short delay. The answer was no: there was no evidence of special difficulty in remembering the order of items, apart from item memory itself.

Auditory Short-Term Memory as a Predictor of Vocabulary Language Learning

Thus, we have evidence that auditory short-term memory is as delayed as expressive syntax, and that the delay is not explainable by slower speaking rate or generally impaired memory for serial order. Do auditory memory skills predict language learning? And do they predict it selectively for production rather than comprehension? To answer these questions, we investigated the predictors of fast mapping skill in the learning of novel words (Chapman, Miller, Sindberg & Seung, 1996). This task presents novel words for novel referents once, twice, or a few times in an incidental learning situation. When only a single novel word is presented, little difference is seen in the performance of individuals with Down syndrome and mental age controls (Chapman et al., 1990). With additional items, deficits appeared in learning for the individuals with Down syndrome (Chapman et al., 1996). Both comprehension and production of novel words were tested and scored, and stepwise regression was used to evaluate five predictors of the two measures of word learning skill. The predictors selected were age, cognition, auditory memory, syntax comprehension (the Test for Auditory Comprehension of Language - Revised or TACL-R, Carrow-Woolfolk, 1985), and syntax production, or MLU. Of these, the comprehension test predicted fast mapping in comprehension, and auditory memory predicted fast mapping in production, indicating a particular tie between auditory memory and expressive vocabulary learning for both individuals with Down syndrome and children who were typically developing.

Auditory Short-Term Memory as a Predictor of Expressive Language: Longitudinal Hierarchical Linear Modelling

Most recently, we have used Hierarchical Linear Modelling (HLM) (Raudenbush, Bryk, Cheung & Congdon, 2000) to ask about the relationship of auditory short-term memory to syntactic language skills in 31 participants with Down syndrome who took part in a six-year longitudinal study (Chapman & Hesketh, in prep.). Language skills were assessed at two-year intervals, for 4 test times; Auditory Memory was assessed only for the latter three. In two preliminary HLM analyses, we carried out a first-level analysis fitting individuals' dependent variable scores, production (MLU) or comprehension (TACL-R), to linear equations that predicted individuals' growth trajectories as a function of time in the study, plus a constant. This modelling yielded an intercept and slope for each individual, accounting for 65% of the variance in MLU, and 73% of the variance in TACL-R, when compared to a model containing the intercept alone. At a second level of analysis, we then evaluated models predicting the intercepts and slopes derived from fitting the person-level variables at level 1. Predictors examined for intercept and slope included: Chronological age at start of study, Gender, Mean Bead Memory across 4 times, Mean Pattern Analysis across 4 times, Mean Auditory Short-Term Memory across 3 times, Hearing Status, and a rate index, computed as Time 4 minus Time 1 for Bead Memory and Pattern Analysis, and as Time 4 minus Time 2 (T4-T2) for Auditory Short-Term Memory.
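The two levels of analysis just described can be written out in a conventional form. The notation below is an assumption based on the usual presentation of two-level linear growth models, and is not taken from the analyses themselves: Y is the language score (MLU or TACL-R) of person i at occasion t, a is time in the study, and X stands for the person-level predictors (for example age at study entry, or mean Bead Memory).

```latex
% Level 1: each person's scores follow a straight line over time in study.
\[
  Y_{ti} = \pi_{0i} + \pi_{1i}\, a_{ti} + e_{ti}
\]
% Level 2: the person-specific intercept and slope are themselves
% modelled from person-level predictors X_{qi}, q = 1, \dots, Q.
\[
  \pi_{0i} = \beta_{00} + \sum_{q=1}^{Q} \beta_{0q} X_{qi} + r_{0i},
  \qquad
  \pi_{1i} = \beta_{10} + \sum_{q=1}^{Q} \beta_{1q} X_{qi} + r_{1i}
\]
```

Here the intercept term corresponds to status at study start and the slope term to rate of change, so a predictor can matter for one and not the other.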

Best Models Predicting Growth Trajectories

The best models predicted 72% of the variance in level 1 parameters for MLU, syntax production; and 79% of the variance in level 1 parameters for TACL-R, syntax comprehension. For MLU growth trajectories, the best model of individual status at study start included the following significant predictors: chronological age at start of study, mean Bead Memory, and mean Auditory Short-Term Memory. Gender was the only variable predicting slope, but it was correlated with chronological age. For syntax comprehension, the best model of individual status at study start included the same predictors: Age, mean Bead Memory, and mean Auditory Short-Term Memory. Two significant predictors of individual growth trajectories were found: chronological age at start of study and Bead Memory difference scores.

Thus, auditory short-term memory is related to individuals' performance at study outset (intercepts), but not to subsequent rate of change (slopes), for both production and comprehension skills. Rate of change in production skills is predicted only by gender. In contrast, rate of change in syntax comprehension is predicted both by chronological age (a slowing with age) and by difference scores in the visual short-term memory task of Bead Memory. We conclude that auditory short-term memory plays a significant role in predicting both expressive and receptive language, but age and visual short-term memory also contribute to the explained variance, and the latter two, unlike auditory short-term memory, contribute to predictions of rate of change in comprehension.
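The distinction between predicting intercepts and predicting slopes can be illustrated with a simplified two-stage fit. This is a minimal sketch under stated assumptions, not the HLM procedure used in the study: it fits an ordinary least-squares line per person and then regresses the resulting intercepts on person-level predictors, whereas HLM estimates both levels jointly. All function names, variable names, and data values are hypothetical; only the design (31 participants, four assessments two years apart) follows the text.

```python
import numpy as np


def fit_growth_trajectories(times, scores):
    """Level 1: fit score = intercept + slope * time separately per person.

    times  : 1-D sequence of assessment times (years since study entry)
    scores : 2-D array-like, one row per person, one column per assessment
    Returns (intercepts, slopes), each with one entry per person.
    """
    times = np.asarray(times, dtype=float)
    scores = np.asarray(scores, dtype=float)
    # np.polyfit fits one line per column of a 2-D y, so pass persons as columns.
    # Coefficients come back highest power first: slope, then intercept.
    slopes, intercepts = np.polyfit(times, scores.T, deg=1)
    return intercepts, slopes


def regress_on_predictors(outcome, predictors):
    """Level 2 (simplified): least-squares regression of a level-1 parameter
    on person-level predictors. Returns coefficients, constant term first."""
    outcome = np.asarray(outcome, dtype=float)
    design = np.column_stack([np.ones(len(outcome)), predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef


# Synthetic data, invented purely for illustration (not study data).
rng = np.random.default_rng(0)
n_persons = 31
times = np.array([0.0, 2.0, 4.0, 6.0])       # four assessments, two years apart
age = rng.uniform(5, 20, n_persons)           # hypothetical age at study entry
memory = rng.uniform(1, 5, n_persons)         # hypothetical memory score

# Here the predictors shape the starting level but not the rate of change.
true_intercepts = 0.5 + 0.1 * age + 0.4 * memory
true_slopes = np.full(n_persons, 0.15)
scores = true_intercepts[:, None] + true_slopes[:, None] * times[None, :]
scores += rng.normal(0.0, 0.05, scores.shape)  # measurement noise

intercepts, slopes = fit_growth_trajectories(times, scores)
intercept_model = regress_on_predictors(intercepts, np.column_stack([age, memory]))
slope_model = regress_on_predictors(slopes, np.column_stack([age, memory]))
```

In this synthetic case `intercept_model` recovers clear effects of both predictors while the predictor coefficients in `slope_model` stay near zero, which mirrors the pattern reported for auditory short-term memory: related to status at study outset, but not to rate of change.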

Implications for intervention

There are several strong implications for intervention from the findings from our studies.

1) The first is that language intervention should continue in adolescence and young adulthood for individuals with Down syndrome. This is a position long argued in the pages of this journal (Buckley, 1993); our longitudinal data show evidence of continuing gains, when narrative language tasks sampling more complex syntax are used, as well as wide individual differences.

2) The finding of divergent development in different processes and domains warrants a second inference: that goals for comprehension and production should be set separately, as those for comprehension will be at higher linguistic levels than those for production. Goals for receptive vocabulary learning should be dictated by environmental demands, social and cognitive skills, and the individual's interests and comprehension; our data show receptive vocabulary outstripping cognitive levels in adolescence. Goals for receptive syntax should be extended to the individual's next developmental stage and comprehension needs and interests.

3) The vocabulary deficits in production are those of performance, rather than competence: increased automaticity of vocabulary production, and methods to increase activation of vocabulary, should improve access. These methods include practice, increased wait time, and prior priming of word forms and content. Compensation for the reduced effectiveness of auditory short-term memory is desirable, e.g. through visual presentation of both form (reading) and meaning (context).

4) Our data show that verbs, and grammatical words, are particularly likely to be omitted in talk, compared to expressive language-level controls; thus, these forms should be particularly considered in goals for syntax production.

5) Goals for syntax production should be extended not only to the next developmental level and functional need, but also to the levels of comprehension and cognitive functioning.

6) Language intervention should be embedded in programs of academic, vocational, social, and cognitive enrichment: learning of new language vocabulary, genre, and uses is a lifelong task.

Implications for research

Current work

Our work also carries implications for new research goals, which we are currently pursuing.

1) The finding that deficits in auditory short-term memory affect new word learning in production needs to be followed up by studies manipulating practice and contextual support, to show that increases here can differentially improve fast mapping and subsequent use of vocabulary.

2) The finding that verbs are more frequently omitted in narrative language tasks needs to be followed up by studies of fast mapping and subsequent use of verbs, as opposed to nouns.

3) The finding that changes in visual short-term memory play a significant role in the rate of language learning in adolescence requires follow-up. Our work is focusing on the role of the visual context as a contributor to fast mapping skills.

The genetics of expressive language deficit

More generally, our work points to a multi-factor theory of the language deficits in individuals with Down syndrome: one factor associated with general cognitive delay; a second associated with the age-related decline in visual short-term memory; a third associated with auditory memory deficits; a fourth, possibly associated with gender or chronological age, that affects the rate of change in expressive language syntax (see Chapman & Hesketh, 2000). Genome research has currently sequenced Chromosome 21; identification of the functions of the genes is proceeding; individual differences in patterns and trajectories of language function may ultimately be linked to variation in a handful of genetic cascades, together with variation in language learning environments (Chapman, 2000).

Models of working memory

Our research also carries implications for models of working memory. We have conceptualized working memory within Baddeley's (1992) model, in which separate visuospatial scratchpad and phonological short-term memories are co-ordinated by a central executive, and in which the phonological store is refreshed by an articulatory loop. Our finding that auditory short-term memory deficits exist is not new (see Chapman's 1995 review). The evidence that general sequential processing difficulties do not explain them, coupled with the finding that auditory short-term memory predicts deficits in learning of new production vocabulary and syntax production generally, builds a stronger case for the role of the phonological component in expressive language deficit. However, increases in auditory short-term memory, within the Baddeley model, are explained by increases in speaking rate, and, hence, the amount that the articulatory loop can rehearse and refresh; but that explanation does not appear to account for the auditory deficits in our studies, where rehearsal did not occur. Nor does the model account for improvements in auditory short-term memory prior to the emergence of intentional rehearsal, which often appears around age 7 in typically developing children, at the onset of higher cognitive levels of monitoring learning. The fact that auditory short-term memory improves throughout earlier developmental ages, lacking rehearsal, is a problem for the original model.

Recent modifications proposed by Gathercole and Baddeley (Gathercole, 1998, 1999; Baddeley, Gathercole & Papagno, 1998) have acknowledged the role of long-term learning in short-term stores: word-likeness affects nonword repetition, and the level of language learning, in bilingual children, affects auditory short-term memory performance (Thorn & Gathercole, 1999). More generally, neuroimaging studies have reinforced a view of recall in which the original activations of seeing, hearing, or acting upon events are re-activated in recall; thus, short-term memory would never be independent of prior experience.

Proposal for an automatic articulatory loop

We believe an additional modification of the model is desirable, one that posits an "articulatory loop" that arises automatically from the initial mapping of production onto perception, so that subsequent speech perception feeds activation forward to the area of early motor planning. This loop would activate automatically, and provide a mechanism whereby, after the initial sound-speech mappings of infancy, "practice" in language production could arise in the context of comprehension tasks, if limited by other processing demands and the speed of speech (Seung & Chapman, submitted). Additionally, in practiced speaker-listeners, a cerebellar link might be implicated. Imaging studies of speech-motor area activation during listening tasks could evaluate this proposal, and its corollary: that in cases of expressive language deficit, reduced activation, or more slowly developing activation, arrives at motor planning areas. It has the attraction of linking auditory span to long-term learning, providing a non-intentional route for production "practice" and explaining the fact that, even though expressive language lags behind comprehension, variation in language production is largely predicted by variation in comprehension (Chapman et al., 1998).

In conclusion

A deficit in expressive language learning and syntax in children with Down syndrome emerges gradually with chronological age and includes much individual variation. The linear component of individual growth is predicted, in our sample, by chronological age at start of study and two measures of short-term memory: one auditory, the other visual. Rate of change in the visual component also predicts rate of change in comprehension. Implications for language intervention and research arise from these findings.


This research was supported by NIH Grant R01 HD-23353 and funds from the National Down Syndrome Society. The help of the participants is gratefully acknowledged. Collaborators on the work reported here include Dr. Linda Hesketh, Dr. Donna Boudreau, Dr. Hye-Kyeung Seung, Dr. Elizabeth Kay-Raining Bird, Dr. Scott E. Schwartz, Dr. Giuliana Miolo, Sally Miles, and Heidi Sindberg. We thank Dr. Doris Kistler for statistical assistance.


Dr. Robin S. Chapman • Waisman Center, 1500 Highland Avenue, Madison, Wisconsin, 53705, USA.