Phonetic intelligibility testing in adults with Down syndrome

The purpose of the study was to document speech intelligibility deficits for a group of five adult males with Down syndrome, and to use listener-based error profiles to identify phonetic dimensions underlying reduced intelligibility. Phonetic error profiles were constructed for each speaker using the Kent, Weismer, Kent, and Rosenbek word intelligibility test [1]. The test was designed to allow for identification of reasons for the intelligibility deficit, quantitative analyses at varied levels, and sensitivity to potential speech deficits across populations. Listener-generated profiles were calculated based on a multiple-choice task and a transcription task. The most disrupted phonetic features, across listening tasks, involved simplification of clusters in both the word-initial and word-final positions, and contrasts involving tongue posture, control, and timing (e.g., high-low vowel, front-back vowel, and place of articulation for stops and fricatives). Differences between speakers in the ranking of these phonetic features were found; however, the mean error proportion for the six most severely affected features correlated highly with the overall intelligibility score (0.88 based on the multiple-choice task, 0.94 for the transcription task). The phonetic feature analyses are an index that may help clarify the suspected motor speech basis for the speech intelligibility deficits seen in adults with Down syndrome and may lead to improved speech management in these individuals.

Down Syndrome Research and Practice (2008)


People with Down syndrome tend to have relatively poor speech intelligibility, a characteristic frequently used to describe their speech patterns. There is wide agreement that a speech intelligibility deficit is a key component of the behavioral phenotype of Down syndrome (see refs 2,3). Reviews of this speech intelligibility deficit (see refs 4,5,6) indicate that it is related to impaired speech sound articulation, with prevalence rates of 95-100% being common, as well as atypical prosody patterns, speech fluency, and voice production. Relatively few investigators have directly examined speech intelligibility in people with Down syndrome, and those who have studied it were investigating linguistic factors that affect dysfluent speech patterns [7] , the broad communication characteristics of older children and adolescents [8] , or the language profiles of children and adolescents [9] . Although these investigations were not designed to directly investigate speech intelligibility, Willcox, and Rosin and her colleagues, found 65% intelligible speech, indicating that older children and adolescents with Down syndrome are unintelligible about 35% of the time [7,8] . Chapman and her colleagues found that speech intelligibility increased with age, a finding similar to that of typically developing children and adolescents [9 ]. In these accounts, and in other studies, reports of the type and severity of the speech problems differ greatly; however, it is clear that reduced speech intelligibility compromises communication between people with Down syndrome and their families [10,11] .

Reduced speech intelligibility may also negatively affect spoken language. Miller and Leddy argued that reduced speech intelligibility interferes with successful message communication, which may cause the person with Down syndrome to reduce speaking attempts, and thus miss the opportunity to practice spoken language production skills needed for learning language [12] . Fowler argued that lack of productive language practice limits language learning such that people with Down syndrome only use simple sentence patterns [13] , and numerous researchers have documented an expressive language deficit in people with Down syndrome [14-18] . Reports also suggest that reduced speech intelligibility appears to be related to short utterance length and spoken language complexity of speakers with Down syndrome, and that when speech therapy treatments improve speech intelligibility there is a clinically meaningful improvement in communicative effectiveness and spoken language [5 , 19-23] .

Intraoral pressure
Build up of air inside the mouth creating the energy necessary for the production of many speech sounds

Formant or formant frequency
Frequency region, in vowels and resonant consonants, in which a relatively high degree of acoustic energy is concentrated. Provides information on the position of the articulators

Formant transition pattern
The frequency movement of the acoustic energy (or Formants) between consonants and vowels. This reflects changes in position of the articulators

Laryngeal videostroboscopy
A method of viewing and recording vocal fold vibration during speech production

In addition to investigating the relationship between speech intelligibility and spoken language deficits, researchers and clinicians have attempted to explain the nature of the speech intelligibility impairment in people with Down syndrome, including the many possible aetiological and associated factors that may contribute to the deficit. Numerous factors have been associated with reduced speech intelligibility in persons with Down syndrome, including hearing loss, anatomical and physiological differences in the vocal tract, and underlying nervous system differences (see refs 4,5,6, 24,25). The suggestion that speech intelligibility and articulation errors of people with Down syndrome were related to impairments in the speech-motor control system, first proposed in the late 1970s and early 1980s, received minimal support initially [26,27,28] . Miller asserted, however, that the presence of speech motor control deficits accounted for the finding that spoken language production lagged behind comprehension in young children with Down syndrome [29] . To date, there are only a few published studies investigating motor speech impairments in people with Down syndrome using techniques used in other typical perceptual studies. These studies have supported the view that individuals with Down syndrome exhibit speech characteristics consistent with a clinical diagnosis of a motor speech impairment, although the nature of this impairment is unclear. For example, Borghi's data demonstrated voicing errors consistent with motor speech impairment [30] . Kimmelman, Swift, Rosin and Bless found highly variable formant transition patterns and Swift, Rosin, Khdir and Bless reported that people with Down syndrome have difficulty maintaining adequate intraoral pressure for speech [31,32] . 
Additional reports of deficits in speech timing and prosody (see refs 33,34) and data from Leddy's laryngeal videostroboscopy study also support a suspected impairment in motor speech production [35] . More recently, it has been argued that inconsistent speech production patterns observed in children with Down syndrome are related to difficulties planning speech, more specifically a deficit "in the ability to assemble phonological plans for word production" [36:p.314-315; 37] .

Kumin has reported that many children with Down syndrome exhibit difficulties and inconsistencies in speech production that are consistent with the diagnostic label of developmental apraxia of speech[24,25, 38] . Evidence that speech intelligibility deficits are related to difficulties with speech planning and motor speech production in individuals with Down syndrome continues to grow. Overall, the literature suggests that some individuals with Down syndrome may exhibit patterns consistent with dysarthria, some may exhibit apraxia, and some may exhibit symptoms of both.

Most clinicians and researchers agree, "Intelligibility is considered the most practical single index to apply in assessing competence in oral communication" [39: p.183] . Careful examination of the structure and characteristics of speech produced by these individuals is needed in order to elucidate the phonetic and acoustic factors that contribute to intelligibility deficits in persons with Down syndrome. Understanding the reasons behind reduced intelligibility has a direct impact on intervention programs, especially for adolescents and adults. With a better understanding of the nature of the speech intelligibility deficit in people with Down syndrome, master clinicians can better design speech and language treatment programs for this population.

The present study was designed to examine speech intelligibility in Down syndrome using phonetic error analysis methodology. Applying a phonetic analysis approach to study the speech production patterns of speakers with Down syndrome offers an opportunity for a systematic investigation of the potential reasons underlying speech intelligibility deficits, as well as a quantitative analysis of production and perceptual characteristics of the speech signal.

The Kent, Weismer, Kent and Rosenbek single word intelligibility test targets 19 specific phonetic contrasts that have been identified as problematic for speakers with motor speech disorders and are likely to have a significant impact on overall speech intelligibility [1]. This test was selected because it was designed to allow for identification of reasons for the deficit, quantitative analyses at varied levels, and sensitivity to potential speech deficits across populations. In addition, results can be used to guide additional assessment measures as well as treatment goals and protocols, and are interpretable within standard articulatory tests. Prior analyses using the Kent et al. word intelligibility test [1] have demonstrated that different phonetic error profiles may underlie similar global intelligibility deficits, that certain gender effects may be prominent in some profiles, and that the relationship between disease type and the phonetic error profiles may be quite complex [40-45].

The primary purposes of this study were to (1) provide a measure of overall intelligibility (percent correct on a word identification test) for adults with Down syndrome, and (2) use listener-based error profiles from the Kent et al. single word test [1] to identify phonetic dimensions underlying reduced intelligibility.


Procedures for this study were approved by the Institutional Review Board at the University of Wisconsin-Madison. Speakers and listeners were compensated monetarily for their participation.


Speakers in the present study included five adult males with Down syndrome selected from a database maintained by the second author (Leddy). All speakers lived in group home settings within the community and most held part-time jobs. Speaker characteristics, including age in years, adaptive function age, and speech intelligibility are shown in Table 1. The Adaptive Function Age Equivalency Scores from the Vineland Scales [46] were completed as part of a separate study by the second author [35] . Intelligibility scores were based on the single-word intelligibility test described below.

Speaker   Age in Years   Adaptive Function Age (Vineland Scales)   Percent Intelligibility (standard deviation)
DS01      29             9;06                                      57.73 (8.56)
DS02      27             14;06                                     47.19 (11.97)
DS03      26             17;06                                     54.8 (6.92)
DS04      39             18;11                                     75.0 (5.65)
DS05      36             15;11                                     41.05 (15.89)

Table 1 | Speaker age, adaptive function age, and speech intelligibility

Speech sample

The intelligibility test used in the present study consisted of 53 single words. The words were selected from the Kent et al. single word intelligibility test [1], which was designed to examine 19 phonetic contrasts thought to be susceptible to errors by speakers with motor speech disorders. Although this test was not designed specifically for persons with Down syndrome, it is the only published test that allows for a thorough and systematic examination of features underlying an intelligibility deficit. In the test, each target word is paired with three foil words, each of which contrasts with the target on one of the 19 phonetic features. This allows for interpretation of confusions in the listener's responses with respect to the features used in constructing the list of test words. For example, for the target word bad the response choices for the listeners were bad - bed - bat - pad. The first foil, bed, differs from the test word in the tongue-height feature of the vowel; the second foil, bat, differs from the test word in the voicing feature of the final consonant; and the third foil, pad, differs from the test word in the voicing of the initial consonant. The original Kent et al. test included 72 different target words [1]; however, for the present study a subset of 53 words was selected. Selection was based on word familiarity and the reading level of the speakers; at least 5 tokens from each of the 19 contrasts were included in the shortened version of the list. A list of the target words and corresponding foils by phonetic category can be found in Appendix A. The target word is listed first in each word pair.
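The item structure described above can be sketched in a few lines of Python. This is only an illustration, not the scoring software used in the study: the names `IntelligibilityItem` and `score_response` are hypothetical, and only the bad item given in the text is encoded.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class IntelligibilityItem:
    """One multiple-choice item: a target word plus foils keyed to contrasts."""
    target: str
    # Maps each foil word to the phonetic contrast it probes.
    foils: Dict[str, str]


def score_response(item: IntelligibilityItem, response: str) -> Optional[str]:
    """Return None for a correct response, otherwise the contrast that was confused."""
    if response == item.target:
        return None
    # A KeyError here means the response was not one of the printed choices.
    return item.foils[response]


# The example item from the text: target "bad" with foils bed, bat and pad.
BAD = IntelligibilityItem(
    target="bad",
    foils={
        "bed": "high-low vowel",
        "bat": "voiced-voiceless final",
        "pad": "voiced-voiceless initial",
    },
)

print(score_response(BAD, "bad"))  # correct response, no contrast in error
print(score_response(BAD, "bed"))  # vowel-height confusion
```

Keying each foil to a single contrast is what lets a wrong choice be read directly as an error on one phonetic feature.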

Data recording

X-ray microbeam study of speech kinematics
A method of looking at the movement of the articulators during connected speech

An initial session was scheduled with each subject, and the investigators (first and second authors) traveled to each speaker's home to complete an initial screening and make an audio recording. Speakers were screened to ensure that they could read at a single word level, but no further cognitive testing was completed. Speakers also passed a hearing screening at 25 dB HL for frequencies of 0.5, 1, 2, and 4 kHz bilaterally (American Speech-Language-Hearing Association [47]). For the recording portion of the session, each speaker was seated at a table and a lapel microphone (Radio Shack 33-1013) was attached to his shirt collar, approximately 6 inches from his mouth. Speakers were asked to read 53 single words from index cards (Appendix A). The investigators controlled the reading pace. If a word was mispronounced, the speaker was asked to repeat the word; if a second error occurred, the investigator read the word aloud and asked the speaker to repeat it. Word repetition was used only when a word did not match the intended target as judged by the first author, not when a distortion occurred. For example, if the target was road and the speaker said ream he was prompted to repeat the token, but if the speaker said woad the investigator continued without asking for a repetition. All data were recorded on digital audiotape (DAT: Tascam 1200). The intelligibility testing reported in the present study was part of a screening session for participation in an X-ray microbeam study of speech kinematics being conducted by the authors at the University of Wisconsin-Madison.

Acoustic data

The speech samples were digitised and stored in CSpeech (filter cutoff 9.8 kHz, sampling rate = 22.1 kHz) [37]. Prior to recording, a 90 dB calibration tone was recorded for use in calculating sound pressure level from the recorded speech signal. Words were then saved in individual files. These word files were used in the randomized play lists in the listening tasks (see below).

Listening tasks

Listening group I

Listeners were 10 undergraduate students enrolled in the Communicative Disorders program at the University of Wisconsin-Madison who met the following criteria: a) between the ages of 18 and 50 years, b) minimal experience with speakers with dysarthria, and c) able to pass a hearing screening at 25 dB HL for frequencies of 0.5, 1, 2, and 4 kHz bilaterally [47]. Listeners reported no previous experience with speakers who had speech or language disorders and had taken only an introductory course within the department; they were therefore considered naïve listeners. All listeners were paid for their participation. The digitised single words for individual speakers were presented through a loudspeaker to listeners seated individually in a sound-treated room. The signal volume was set to a comfortable level by the examiner prior to each experimental session. Speaker order was randomized. Listeners heard all words spoken by all speakers. The order of words presented for each speaker was also randomized, but was the same across all listeners because pre-generated multiple-choice answer sheets were needed to complete the listening task. Listeners were given response forms that had four words in each numbered row (one target and three foils) and were instructed to select the word in each row that most closely matched what they heard. The 50 response forms (5 speakers x 10 listeners) were scored to determine both the percent correct and the profile of feature errors according to the phonetic features used in the test construction. Participation in the listening experiment lasted approximately 25 minutes.

Listening group II

A second listening task was added based on the responses of listeners in group I. All 10 listeners in group I subjectively reported that sometimes none of the words on the multiple-choice answer sheet matched the token they heard. The first author then reviewed the lists from all five speakers to determine whether an error in word order had been made when formatting the answer sheets. No errors were found. Therefore, a group of skilled listeners was asked to transcribe each token without knowledge of the target or foils. Comparison of the two scoring methods would allow the investigators to look at possible influences of the test format as well as differences in the phonetic error profiles. Five skilled listeners were asked to perform a broad transcription of the same set of single words presented under the same listening conditions as with group I. Listeners met the same inclusion criteria as those in group I; in addition, they had completed a course in phonetics and had more than 4 years of clinical experience. Listeners in group II were considered expert listeners. Percent correct was determined for each speaker based on the transcription results, with each target phoneme being counted as either correct or incorrect. In addition, a profile of feature errors was computed by marking the phonetic category that corresponded to any difference between the target word and the transcription. Using this method, it was possible to have multiple feature errors within a single word. For example, if the target were bad and it was transcribed as pet, errors would be marked for the word-initial voicing, high-low vowel, and word-final voicing contrasts.
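The transcription scoring described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure the investigators implemented: it assumes consonant-vowel-consonant words whose phones align one-to-one with the transcription, and `CONTRAST_TABLE`, `feature_errors`, and `phones_correct` are hypothetical names. The lookup table lists only the substitutions needed for the bad/pet example.

```python
from typing import Dict, List, Tuple

# Hypothetical lookup: (position, target phone, transcribed phone) -> contrast.
CONTRAST_TABLE: Dict[Tuple[str, str, str], str] = {
    ("initial", "b", "p"): "voiced-voiceless initial",
    ("vowel", "æ", "ɛ"): "high-low vowel",
    ("final", "d", "t"): "voiced-voiceless final",
}

# Assumed slot labels for a simple consonant-vowel-consonant word.
POSITIONS = ("initial", "vowel", "final")


def feature_errors(target: List[str], transcribed: List[str]) -> List[str]:
    """List the contrast categories on which a transcription differs from the target."""
    errors = []
    for position, t_phone, r_phone in zip(POSITIONS, target, transcribed):
        if t_phone != r_phone:
            # Substitutions missing from the table are labelled "unclassified".
            key = (position, t_phone, r_phone)
            errors.append(CONTRAST_TABLE.get(key, "unclassified"))
    return errors


def phones_correct(target: List[str], transcribed: List[str]) -> int:
    """Count target phones transcribed correctly (each phone is right or wrong)."""
    return sum(t == r for t, r in zip(target, transcribed))


# Target "bad" transcribed as "pet": three feature errors in one word.
print(feature_errors(["b", "æ", "d"], ["p", "ɛ", "t"]))
print(phones_correct(["b", "æ", "d"], ["p", "ɛ", "t"]))
```

Because every phone is compared separately, a single word can contribute several feature errors, which is the key difference from the multiple-choice format.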

Phonetic error profiles

Error rates for the multiple-choice task were calculated by recording each time a listener marked a response other than the target word and dividing the total errors for each word pair by the number of listeners (n=10). For example, if the target was bad and four of the listeners chose bed, an error rate of 0.4 (4/10) would be marked in the high-low vowel feature category. Error rates per contrast were then summed (e.g., the error rates for the 13 word pairs in the high-low vowel category) and divided by the total number of errors possible on the test (n=122). Using this method, all phonetic error categories were given equal weight, thus allowing the investigators to examine which contrasts seemed to have the greatest impact on the overall decrease in speech intelligibility. For the transcription task, error rates were calculated in a similar fashion; however, multiple target phonemes could be recorded as errors within a single target word, and the total number of errors possible on the test was the total number of phones produced (n=128).
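The arithmetic for the ten-listener error rates can be written out as a short sketch. The constants 10 and 122 come from the text; the function names and the list of 13 pair rates are hypothetical and serve only to show the two steps (per-pair rate, then per-contrast proportion).

```python
from typing import List

N_LISTENERS = 10         # listeners in the multiple-choice task (from the text)
N_POSSIBLE_ERRORS = 122  # total errors possible on the test (from the text)


def pair_error_rate(n_errors: int, n_listeners: int = N_LISTENERS) -> float:
    """Proportion of listeners who chose a given foil for one word pair."""
    return n_errors / n_listeners


def contrast_error_proportion(pair_rates: List[float],
                              n_possible: int = N_POSSIBLE_ERRORS) -> float:
    """Sum the pair error rates within one contrast and scale by the test total."""
    return sum(pair_rates) / n_possible


# Example from the text: four of ten listeners chose "bed" for the target "bad".
print(pair_error_rate(4))

# Hypothetical rates for the 13 word pairs in the high-low vowel category.
high_low_rates = [0.4, 0.0, 0.2, 0.1, 0.0, 0.3, 0.0, 0.5, 0.0, 0.1, 0.2, 0.0, 0.2]
print(contrast_error_proportion(high_low_rates))
```

Dividing every contrast by the same test-wide denominator is what keeps the categories on a common scale, so their bars can be compared directly.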


Intelligibility scores

Intelligibility scores calculated for each speaker are shown in the final column of Table 1. Mean percent intelligibility is based on both the multiple-choice task (10 listeners) and the transcription task (5 listeners), as no significant difference was found between the two listening tasks (t(49) = -0.04, p = 0.97). Intelligibility scores for the current speaker group ranged from 41% to 75%. The range of scores is quite broad, with participants at the low end being more difficult to understand and participants at the high end being easier to understand. Also shown in Table 1 are the standard deviations computed for the individual speakers. Two pairs of speakers had very similar overall scores (DS02 and DS05, and DS01 and DS03). Even though similarities were found in overall intelligibility scores, variability in error rates across the phonetic contrasts and their rankings was found for individual speakers (see below).

Phonetic feature analysis

The feature profile for the speakers as a group is shown in Figure 1. The phonetic categories are represented on the x-axis and the frequency of the errors on the y-axis (examples can be found in Appendix A). The height of each bar shows the error proportion (a measure of incorrect transmission). The most severely affected phonetic features for the group, in rank order of severity, are as follows: initial cluster-singleton, long-short vowel, high-low vowel, initial glottal-null, fricative place, front-back vowel, stop place, final cluster-singleton, fricative-affricate, voiced-voiceless initial, stop-fricative and /r/-/w/. In terms of articulatory or voice function, these features relate to jaw-tongue posture and control for vowel and consonant production (high-low vowel, front-back vowel, fricative place, stop place, fricative-affricate, stop-fricative, cluster-singleton), and phonatory function (glottal-null, voicing contrast, long-short vowel). These results indicate that the phonetic features were not affected uniformly but rather some features were more susceptible to errors than others. Among the more resistant features in the group data were voiced-voiceless final, alveolar-palatal, initial consonant-null and final consonant-null which had no errors associated with them. Because participants with Down syndrome differed in overall intelligibility, some participants had many errors and others had very few. To show individual patterns for the feature profiles, the features were ranked by error proportion for individual participants. The results are shown in Table 2 for the six contrasts (ranked 1-6) that had the highest error rates for each of the five participants. Data show that certain features tended to occur frequently at the top ranks, although individual participants had their own patterns of feature errors.
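The per-speaker ranking step can be illustrated with a minimal sketch. The error proportions below are invented for the example and are not data from the study; `top_ranked_features` is a hypothetical helper, and ties are left in whatever order the sort produces rather than being given shared ranks.

```python
from typing import Dict, List, Tuple


def top_ranked_features(proportions: Dict[str, float],
                        n: int = 6) -> List[Tuple[int, str]]:
    """Rank features from highest to lowest error proportion and keep the top n."""
    ordered = sorted(proportions.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, name) for rank, (name, _) in enumerate(ordered[:n], start=1)]


# Hypothetical error proportions for one speaker (not data from the study).
example = {
    "initial cluster-singleton": 0.09,
    "high-low vowel": 0.07,
    "long-short vowel": 0.06,
    "fricative place": 0.05,
    "stop place": 0.03,
    "final cluster-singleton": 0.04,
    "front-back vowel": 0.02,
    "voiced-voiceless final": 0.0,
}

for rank, name in top_ranked_features(example):
    print(rank, name)
```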

Figure 1 | Mean proportion of errors within phonetic contrasts for the group of 5 participants based on the multiple-choice listening task

Figure 2 | Mean proportion of errors within phonetic contrasts for the group of 5 participants based on the transcription listening task

Feature DS01 DS02 DS03 DS04 DS05
Initial singleton-cluster 1 1 1 2 1
Long-short vowel 2 4 2 - 3
High-low vowel 3 3 - 1 2
Initial glottal-null - - 5 3 -
Fricative place 4 5 4 4 -
Front-back vowel - 6 3 - 6
Stop place 6 - 6 5 5
Final singleton- final cluster 5 2 - 6 4
/r/-/w/ - - 6 - -

Table 2 | Identification of the features associated with the highest six error scores for the 5 speakers based on responses for the multiple-choice task


Listeners in group II were asked to provide a broad transcription for each word they heard (53 tokens x 5 speakers). They did not have any knowledge of the target word or foils but were told that each token was an English word. Phonetic error profiles were constructed for each speaker according to the same procedures used in the multiple-choice task. The error proportions for features identified by group II are shown in Figure 2, and a ranking of the most frequent phonetic contrast errors for individual participants is shown in Table 3. Features with the highest error proportion and ranking were similar for the two listening tasks. The top-ranked errors on the transcription task included the same ones listed above plus voiced-voiceless in word-final position, which did not have any errors associated with it on the multiple-choice test. The absolute error frequencies were slightly higher in the transcription task than in the multiple-choice task. This is likely related to the fact that multiple errors could be recorded for each word; listeners in group II recorded two phonetic errors within a single token for 63% of the total tokens transcribed. As an example, Table 4 provides a list of target tokens, multiple-choice foils, the token selected by group I and the transcription from group II for speaker DS03. The list includes only examples where multiple errors were reported by group II compared to the single error reported by group I. Data in Table 4 suggest that the multiple-choice test format affected the phonetic error profiles for the speakers. Overall, only 46% of the tokens transcribed by listener group II matched the errors marked by the listeners in group I. The remaining errors were either different single phonetic category errors (26%) or multiple errors within the same token (28%).

Feature DS01 DS02 DS03 DS04 DS05
Initial singleton-cluster 1 1 1 1 1
Long-short vowel 5 4 - - 6
High-low vowel 2 3 2 2 2
Initial glottal-null - - - - -
Fricative place 4 5 4 4 -
Front-back vowel - 6 3 3 3
Stop place 6 - 6 5 5
Final singleton- final cluster 3 2 5 6 4
/r/-/w/ - - - - -

Table 3 | Identification of the features associated with the highest six error scores for the 5 speakers based on responses for the transcription task

Target Foils Listener group I Listener group II
geese gas-goose-guess gas /græs/ /gris/
blend bend-lend-end lend /lænd/
sue stew-two-shoe sue /skul/
shoot seat-shot-sheet sheet /sit/
sheet seat-shot-shoot cheat /tʃat/
steak snake-take-sake steak /snaɪl/
sew foe-show-toe foe /wo/
chop chap-top-shop shop /ʃat/
big pig-bag-beg bag or beg /bæk/
coat goat-cat-tote goat /goɪp/
wise rise-weights-wide wise /gris/
knew know-knee-gnaw know /no/
leak wreak-lick-league leak /wɪk/
farm arm-firm-harm farm /fɜn/
had add-pad-hid add /bæd/
hate ate-fate-hit ate /boɪt/
dish dash-ditch-did did /dæd/
pig big-pick-peg peg /pɛk/
fast fat-fist-vast fist or fat /fɪt/

Table 4 | Comparison of responses on multiple-choice response from listener group I and transcription results from group II for speaker DS03

Relation between intelligibility and feature errors

To determine the relationship between overall intelligibility and the highest-ranked features, a correlation was computed between the individual intelligibility scores and the averaged individual error proportions for the six most affected features. The resulting correlation had a value of 0.88 for the multiple-choice task and 0.94 for the transcription task, indicating that the error proportions for these six features were highly correlated with the speakers' overall intelligibility.
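A correlation of this kind can be computed as in the sketch below, which assumes a Pearson product-moment coefficient (the text does not name the statistic). The intelligibility scores are those in Table 1; the mean error proportions are hypothetical placeholders, because the per-speaker values are not listed here, so the printed coefficient is not a reproduction of the reported 0.88 or 0.94.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Percent intelligibility for DS01-DS05, taken from Table 1.
intelligibility = [57.73, 47.19, 54.8, 75.0, 41.05]
# Hypothetical mean error proportions for each speaker's six top-ranked features.
mean_top6_error = [0.05, 0.07, 0.06, 0.03, 0.08]

r = pearson_r(intelligibility, mean_top6_error)
print(round(r, 2))
```

With these placeholder values, more errors accompany lower intelligibility, so the coefficient is negative; the sign in any real analysis depends on how the two variables are coded.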


The aim of the present study was to move beyond traditional measures of speech intelligibility, which provide a metric of severity, and identify phonetic dimensions that underlie reduced intelligibility in adults with Down syndrome. A phonetic feature analysis proved to be a natural and convenient way to achieve this purpose, as features relate both to articulatory functions and to linguistic contrasts that underlie intelligibility. Results of the study demonstrate that speakers with similar single-word intelligibility scores have different speech errors contributing to that score. Although individual variability was found, certain contrasts seemed to contribute much more heavily than other contrasts to the intelligibility deficits seen in this group of speakers. The features that were most often affected are discussed in terms of phenotypic characteristics that have been identified and are believed to underlie speech production difficulties for people with Down syndrome [12 , 49,50] .

Feature errors

The results of the study demonstrate a range of intelligibility scores for adults with Down syndrome (41-75%). Individual differences in the contrasts associated with the highest error rates were found (see Tables 2 and 3); however, there was a small group of errors that were ranked highly for all five speakers. These included cluster-singleton production in the word-initial and word-final positions, vowel errors, and place of production for stops and fricatives. For the cluster-singleton contrast, the target word contained a consonant cluster and listeners identified a singleton. Errors associated with the vowel contrasts (high-low and long-short) were recorded in both directions (i.e., a target high vowel was identified as a low vowel and a target low vowel was identified as a high vowel). Of note, there were no errors associated with the vowel /i/ for any of the speakers. Finally, both fronting and backing errors were recorded for place of production across speakers; however, individual speakers tended to have a consistent pattern of errors. Accurate production of these phonetic contrasts requires precision in lingual posture, control, and timing. This finding was not entirely surprising given observations in the literature indicating that people with Down syndrome have a larger tongue in relation to a smaller oral cavity [51] as well as differences in muscle tone [52]. Attributing all errors associated with these contrasts solely to skeletal and muscular differences, however, may well be inadequate. Careful studies of children who have undergone partial tongue resections to reduce tongue size have reported almost no speech improvement as a result of this procedure [53,54]. A recent study of jaw stiffness reported no differences for a group of children with Down syndrome compared to age-matched peers [55]. The author concluded that muscle tone abnormalities in children with Down syndrome do not affect orofacial muscles sufficiently to influence speech production. Differences in patterns of activation for the jaw muscles that may suggest a possible compensatory behavior were also reported. Human physiology has a great capacity to accommodate structural deviations; therefore, some children with Down syndrome may be able to adapt to structural anomalies while others experience significant difficulty. Miller and Leddy have hypothesised that it is primarily the neurological system that influences speech production in people with Down syndrome [12, 24, 56]. Motor constraints could influence the precision of speech production effects and individuals' ability to adapt to their unique speech structures (i.e., skeletal or muscular differences). The phonetic contrasts with high error proportions in the present study support probable impairments in motor control as a key reason for reduced speech intelligibility; however, the nature of the motor control difficulties is not clear.

The initial cluster-singleton contrast had the highest error proportion for all five speakers. This error corresponds to perception of a single consonant when the target word contained a sequence of two consonants. High error rates were noted in both the word-initial and word-final positions, with slightly higher error rates found for the word-initial position. In the word-initial position, the stop was retained and the omitted sound (second in the cluster) was /r/ or /l/. Omission of the liquid occurred even though speakers were able to produce these sounds as word-initial singletons. It is of interest that the expert listeners frequently recorded distortions for these phones. In the word-final position, the stop was maintained and the omission was a fricative; the fricative was not always the final consonant (e.g., the target fast was identified as fat). It is important to keep in mind that there were only five tokens used to represent these contrasts; therefore, it is not known whether simplification of other clusters would be produced by these speakers. Significant delays in development of consonant clusters have been reported previously for children with Down syndrome [6, 57,58,59]. There are no detailed reports of articulation and phonology skills in adults with Down syndrome in the literature. Ingram suggested that the emergence of consonant clusters represents significant development in children's phonological analysis of the receptive vocabulary in terms of phonotactics [60], and also likely reflects a maturation of the children's motor speech mechanism and continued anatomical development (see ref 61 for a review). Following this reasoning, it can be hypothesised that there may be differences in the nervous system of adults with Down syndrome that affect speech production. These differences could be either structural-anatomic or related to development of the system.

Several additional phonetic contrasts with high error rates point to difficulties with speech motor control as they relate to tongue posture and control; these include vowel production (high-low and front-back vowels) and place of production for consonants (stops and fricatives). Perceptual evidence of difficulties in production for these contrasts suggests that articulatory behavior is likely reduced and also uncoordinated in these speakers. As Adams points out, however, the identification of coordination problems is complicated by the lack of agreement concerning a definition of coordination [62]. The primary perceptual cue for vowel identification is believed to be the frequency range for the first and second formant [63,64,65]. Inappropriate tongue positioning and/or large ranges of jaw movement have been shown to result in a compression of the acoustic vowel space (reflecting centralised vocal articulations) in speakers with dysarthria (see refs 66,67). Similarly, place of production for stops and fricatives, which is indexed by noise energy and neighboring formant transitions [68], has been shown to be linked to lingual posture and control (e.g., refs 69,70,71). It is noteworthy that the alveolar-palatal contrast (/s/-/sh/) had no errors associated with it in the present study. This distinction is a commonly reported error in adolescents with Down syndrome [8]. Reasons for the discrepancy may relate to the limited number of target items or the single production of each target word, or may indicate that individuals with Down syndrome continue to improve their speech production during adolescence. The error profiles identified in the present study, nevertheless, suggest that motor control difficulties involving the articulators in adults with Down syndrome may contribute to reduced intelligibility. Further research efforts focused on an empirical description of these contrast errors, as a way to explore and quantify possible underlying mechanisms responsible for the reduced intelligibility, are of interest. A parallel research effort by the authors, currently underway, details acoustic analyses of the intelligibility test words and underlying production behaviors based on X-ray Microbeam data.

Evidence of a motor control impairment concomitant with language difficulties in these five speakers [35] supports the idea that motor and language domains are not modular [12]. Rather, there is evidence that speech intelligibility impairments may force speakers with Down syndrome to shorten their messages by selecting their most intelligible words, and may force conversation partners to reduce their demands by asking direct questions that elicit only short answers. These adaptations give the impression of very limited language production skills. This suggestion is also consistent with Goffman [72], who proposes that motor and language development are highly interactive and that both are intimately tied to the production of speech. Models of speech disorders, such as that associated with Down syndrome, which explicitly integrate the acquisition of language and motor representations in concert with one another are needed to further our understanding of the intelligibility impairments and to determine the appropriate course of intervention.

Influence of the listening task

Differences in phonetic error profiles for the two listening tasks suggest that the format for responding to the items may have influenced listener responses. It is feasible that the forced choice task may have limited listeners' responses. In the multiple-choice format, there were four response choices for each item, and while two of the choices represented the contrast of interest (target plus error), which varied by a single phonetic dimension, the other foils were not always balanced in the same way. This may have created potential problems for listeners. For example, one target has the following choices on the test: coat - tote - goat - code. The target word (coat) is combined with foils that vary in both initial and final voicing. A clear voiced or voiceless consonant in either place eliminates certain foils, and ultimately may affect which foil is selected. If a listener heard a token with a word final voiced stop, they were more likely to select code regardless of the status of the initial consonant. Therefore, an error associated with word initial voicing may have been missed. In addition, for the initial glottal-null contrast, it is possible that the error a listener heard may not have matched any of the foils (see below). Discrepancies between what listeners heard and the choices they were given were reported subjectively by all 10 listeners who participated in the multiple-choice task.
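
The scoring ambiguity described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the scoring procedure used in the study: only the word set coat, tote, goat, code comes from the text, while the `CHOICES` mapping, the contrast labels attached to each foil, the `score_responses` helper, and the listener responses are all hypothetical.

```python
from collections import Counter

# Response choices for the target "coat". The contrast label attached to
# each foil is an assumption made for illustration only.
CHOICES = {
    "coat": None,                        # target: no error recorded
    "goat": "voiced-voiceless initial",  # initial voicing error
    "code": "voiced-voiceless final",    # final voicing error
    "tote": "other",                     # foil not balanced on voicing
}

def score_responses(responses):
    """Return the error proportion per contrast for one spoken token."""
    counts = Counter(
        CHOICES[r] for r in responses if CHOICES[r] is not None
    )
    n = len(responses)
    return {contrast: c / n for contrast, c in counts.items()}

# Hypothetical case: the token is produced with BOTH consonants voiced
# (roughly "goad"), which is not among the choices. Suppose 8 of 10
# listeners pick "code" and 2 pick "goat".
responses = ["code"] * 8 + ["goat"] * 2
proportions = score_responses(responses)
print(proportions)
```

In this hypothetical case most of the error is credited to the final voicing contrast and the initial voicing error is largely lost, which is the kind of masking the forced-choice format may introduce.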

One surprising difference between the multiple-choice task and the transcription task involved error rates for contrasts related to laryngeal function. The error rate for the initial glottal-null contrast was ranked within the top six based on the multiple-choice task for two speakers (DS03, DS04), but was not highly ranked based on the results of the transcription task. Differences in error proportion suggest that the error may not have been an absence of an initial glottal sound, but rather a substitution of a different consonant in the initial position. As an example, consider the target word had. Listeners using the multiple-choice format were given a choice of had - add - hid - pad; listeners reported hearing add for 4 of 5 speakers (error proportions greater than 8/10 in all cases). For the same tokens, listeners using transcription identified a voiced word initial stop consonant rather than an initial vowel (e.g., bad). Difficulty with pitch, loudness, and voice quality has been reported previously in the literature, and it was believed that these voice features contributed to reduced intelligibility [12,32,35]. While a general impression of hoarse vocal quality was informally recorded for all five speakers by the investigators, the low error rates recorded for contrasts targeting laryngeal function suggest that voice quality did not have a strong impact on listeners' perception of these phonetic contrasts for speakers in the present study. The prevalence of voice production difficulties in this population and its effect on segmental variables warrant further study.

A final difference between the multiple-choice and the transcription listening tasks relates to listeners preferentially identifying vowel errors over consonant errors when two errors were present in a single target word. As an example, for speaker DS01, the target word big was identified as the foil bag but was transcribed as back. It was rare (3.8% of tokens) that listeners using transcription reported a vowel error when one had not been identified in the multiple-choice task. On the other hand, 37% of consonant errors identified based on transcription were not recorded using the multiple-choice task. Reasons for this preferential selection are not clear; however, the results are consistent with a report by Oller and Eilers, who showed that knowledge of possible phonetic features (i.e., multiple-choice foils) influences listener responses even if the corresponding cues are not present in the acoustic signal [73].
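
The cross-task comparison above amounts to asking, token by token, whether an error type found in transcription is absent from the multiple-choice result. The sketch below is one way to express that, assuming each token's errors have been coded as sets of labels per task. The data layout, the `missed_by_multiple_choice` helper, and all tokens other than big (bag versus back) are hypothetical, and the denominator used here (errors identified in transcription) is an assumption about how such a percentage might be computed.

```python
def missed_by_multiple_choice(tokens, error_type):
    """Proportion of tokens showing `error_type` in transcription
    that do not show the same error type in the multiple-choice task."""
    with_error = [t for t in tokens if error_type in t["transcription"]]
    if not with_error:
        return 0.0
    missed = [t for t in with_error if error_type not in t["multiple_choice"]]
    return len(missed) / len(with_error)

tokens = [
    # Target "big": chosen foil "bag" (vowel error only), transcribed
    # as "back" (vowel error plus final consonant voicing error).
    {"target": "big",
     "multiple_choice": {"vowel"},
     "transcription": {"vowel", "consonant"}},
    # Hypothetical token where both tasks record a consonant error.
    {"target": "coat",
     "multiple_choice": {"consonant"},
     "transcription": {"consonant"}},
    # Hypothetical token heard correctly in both tasks.
    {"target": "sheet",
     "multiple_choice": set(),
     "transcription": set()},
]

consonant_missed = missed_by_multiple_choice(tokens, "consonant")
vowel_missed = missed_by_multiple_choice(tokens, "vowel")
print(consonant_missed, vowel_missed)
```

With these made-up tokens, one of the two consonant errors found in transcription is missing from the multiple-choice result, while the single vowel error appears in both tasks.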


The present study provides a unique profile of the nature of the intelligibility impairment in adults with Down syndrome. The error categories, which are based on listener responses, offer insight into which phonetic contrasts pose the most difficulty for the listener and likely contribute to reduced intelligibility. Because these categories are perceptual, subsequent acoustic analyses should be used to explore and quantify possible underlying mechanisms responsible for the reduced intelligibility. It is conceivable that a single contrast error across speakers may not be represented by a uniform set of acoustic parameters and underlying physiologic mechanisms. It should be noted that the results reported were based on single-word productions and cannot be generalised to conversational speech intelligibility. Differences in production related to speech task merit future study. Although similarities in phonetic error profiles were noted across speakers, individual differences support the use of individualized clinical intervention programs to improve communication skills for individuals with Down syndrome.


  1. Kent R, Weismer G, Kent J, Rosenbek J. Toward explanatory intelligibility testing in dysarthria. Journal of Speech and Hearing Disorders. 1989;54:482-499.
  2. Chapman R, Hesketh L. Behavioral phenotype of individuals with Down syndrome. Mental Retardation and Developmental Disability Research Reviews. 2000;6:84-95.
  3. Chapman R. Language and communication in individuals with Down syndrome. In: Abbeduto L, editor. International Review of Research in Mental Retardation: Language and Communication, vol. 27. Academic Press; 2003. p.1-34.
  4. Leddy M. The biological bases of speech in people with Down syndrome. In: Miller J, Leddy M, Leavitt L, editors. Improving the communication of people with Down syndrome. Baltimore, MD: Paul H. Brookes; 1999.
  5. Leddy M, Rosin M, Miller J. Improving the intelligibility of young children with Down syndrome. Paper presented at the annual convention of the American Speech-Language-Hearing Association, Chicago, IL; 2003.
  6. Stoel-Gammon C. Down syndrome phonology: Developmental patterns and intervention strategies. Down Syndrome Research and Practice. 2001;7:93-100.
  7. Willcox A. An investigation into non-fluency in Down's syndrome. British Journal of Disorders of Communication. 1988;23:153-170.
  8. Rosin M, Swift E, Bless D, Vetter D. Communication profiles of adolescents with Down syndrome. Journal of Childhood Communication Disorders. 1988;12:49-64.
  9. Chapman R, Schwartz S, Kay-Raining Bird A. Language skills of children and adolescents with Down syndrome: I. Comprehension. Journal of Speech and Hearing Research. 1991;34:1106-1120.
  10. Kumin L. Intelligibility of speech in children with Down syndrome in natural settings: Parent's perspective. Perceptual and Motor Skills. 1994;78:307-313.
  11. Pueschel S, Hopman M. Speech and language abilities of children with Down syndrome. In: Kaiser A, Gray D, editors. Enhancing children's communication. Baltimore, MD: Paul H. Brookes; 1993. p.335-362.
  12. Miller J, Leddy M. Down syndrome: The impact of speech production on language development. In: Paul R, editor. Communication and language intervention series: Vol. 8. Exploring the speech-language connection. Baltimore: Paul H. Brookes Publishing Co; 1998. p.163-177.
  13. Fowler A. Language abilities in children with Down syndrome: Evidence for a specific syntactic delay. In: Cicchetti E, Beeghly M, editors. Children with Down syndrome: a developmental perspective. New York, NY: Cambridge University Press; 1990. p.302-328.
  14. Chapman R. Language development in adolescents with Down syndrome. In: Pueschel S, Sustrova M, editors. Adolescents with Down syndrome. Baltimore: Brookes Publishing Co; 1997.
  15. Chapman R, Seung H, Schwartz S, Kay-Raining Bird A. Language skills of children and adolescents with Down syndrome: II. Production deficits. Journal of Speech, Language and Hearing Research. 1998;41:861-873.
  16. Dykens E, Hodapp R, Evans D. Profiles and development of adaptive behavior in children with Down syndrome. American Journal on Mental Retardation. 1994;98:580-587.
  17. Kernan K, Sabsay S. Linguistic and cognitive abilities of adults with Down syndrome and mental retardation of unknown etiology. Journal of Communication Disorders. 1996;29:401-422.
  18. Miller JF. Individual differences in vocabulary acquisition in children with Down syndrome. In: Epstein C, Hassold T, Lott I, Nadel L, Patterson D, editors. Etiology and pathogenesis of Down syndrome: Proceedings of the international Down syndrome research conference. New York: Wiley-Liss; 1995. p.93-103.
  19. Berman S, Krivsky J. Speech intelligibility and the child with Down syndrome. Presented at the International Child Phonology Conference, Wichita, KS; 2002.
  20. Berman S, Sauer K. Speech intelligibility and the adult with Down syndrome. Presentation at the annual meeting of the American Speech-Language-Hearing Association, Chicago, IL; 2003.
  21. Leddy M, Gill G. Speech and language skills of adolescents and adults with Down syndrome: Enhancing communication. Paper presented at the National Down Syndrome Congress Annual Convention, Miami, FL; 1996.
  22. Leddy M, Gill G. Enhancing the speech and language skills of adults with Down syndrome. In: Miller J, Leddy M, Leavitt L, editors. Improving communication of people with Down syndrome. Baltimore, MD: Paul H. Brookes Ltd; 1999. p.205-213.
  23. Rosin M, Swift E. Communication intervention: Improving the speech intelligibility of children with Down syndrome. In: Miller J, Leddy M, Leavitt L, editors. Improving the communication of people with Down syndrome. Baltimore, MD: Paul H. Brookes; 1999. p.133-160.
  24. Kumin L. Speech intelligibility in individuals with Down syndrome: A framework for targeting specific factors for assessment and treatment. Down Syndrome Quarterly. 2001;6:1-8.
  25. Kumin L, Adams J. Developmental apraxia of speech and intelligibility in children with Down syndrome. Down Syndrome Quarterly. 2000;5:1-6.
  26. Farmer A, Brayton E. Speech characteristics of fluent and dysfluent Down's syndrome adults. Folia Phoniatrica. 1979;31:284-290.
  27. Van Riper C. The nature of stuttering. Upper Saddle River, NJ: Prentice-Hall; 1982.
  28. Yarter B. Speech and language programs for the Down's population. Seminars in Speech, Language, and Hearing. 1980;1:49-60.
  29. Miller J. Developmental asynchrony of language development in children with Down syndrome. In: Nadel L, editor. Psychology of Down Syndrome. Boston, MA: MIT Press; 1988. p.167-198.
  30. Borghi R. Consonant phoneme and distinctive feature error patterns in speech. In: Van Dyke D, Lang D, Heide F, Van Duyne D, Soucek M, editors. Clinical Perspectives in the Management of Down's Syndrome. Berlin: Springer-Verlag; 1990. p.147-152.
  31. Kimelman M, Swift E, Rosin P, Bless D. Spectrographic analysis of vowels in Down syndrome speech. Paper presented at the Acoustical Society of America, Nashville, TN; 1985.
  32. Swift E, Rosin M, Khdir A, Bless D. Aerodynamic properties of speech in adult males with Down syndrome. Poster presented at the annual convention of the American Speech-Language-Hearing Association, San Antonio, TX; 1992.
  33. Hesselwood B, Bray M, Crookston I. Juncture, rhythm, and planning in the speech of adults with Down's syndrome. Clinical Linguistics and Phonetics. 1995;9:121-137.
  34. Shriberg L, Widder C. Speech and prosody characteristics of adults with mental retardation. Journal of Speech and Hearing Research. 1990;33:627-653.
  35. Leddy M. The relations among select vocal function characteristics among adult males with Down syndrome. Unpublished doctoral dissertation, University of Wisconsin-Madison; 1996.
  36. Dodd B, Thompson L. Speech disorder in children with Down's syndrome. Journal of Intellectual Disability Research. 2001;45:308-316.
  37. Bradford A, Dodd B. Do all speech disordered children have motor deficits? Clinical Linguistics and Phonetics. 1996;10:77-101.
  38. Kumin L. Speech intelligibility and childhood verbal apraxia in children with Down syndrome. Down Syndrome Research and Practice. 2006;10:10-22.
  39. Subtelny J. Assessment of speech with implications for training. In: Bess F, editor. Childhood deafness. New York: Grune and Stratton; 1977. p.183-194.
  40. Bunton K. An acoustic item-analysis of intelligibility. Unpublished doctoral dissertation at the University of Wisconsin-Madison; 1999.
  41. Bunton K, Weismer G. The relationship between perception and acoustics for a high-low vowel contrast produced by speakers with dysarthria. Journal of Speech, Language, and Hearing Research. 2001;44:1215-1228.
  42. Bunton K, Weismer G. Segmental level analysis of laryngeal function in persons with motor speech disorders. Folia Phoniatrica et Logopaedica. 2002;54:223-239.
  43. Kent RD, Kent JF, Weismer G, Sufit R, Rosenbek JC, Martin RE, Brooks BR. Impairment of speech intelligibility in men with amyotrophic lateral sclerosis. Journal of Speech and Hearing Disorders. 1990;55:721-728.
  44. Kent JF, Kent RD, Rosenbek J, Weismer G, Martin R, Sufit R, Brooks BR. Quantitative description of the dysarthria in women with amyotrophic lateral sclerosis. Journal of Speech and Hearing Research. 1992;35:723-733.
  45. Kent R, Kim H-H, Weismer G, Kent JF, Rosenbek J, Brooks BR, Workinger M. Laryngeal dysfunction in neurological disease: Amyotrophic lateral sclerosis, Parkinson disease, and stroke. Journal of Medical Speech-Language Pathology. 1994;2:157-175.
  46. Sparrow S, Balla D, Cicchetti D. Vineland Adaptive Behavior Scales. Bloomington, MN: Pearson Assessments; 1985.
  47. American Speech-Language-Hearing Association Panel on Audiologic Assessment. Guidelines for Audiological Screening. Rockville, MD; 1997.
  48. Milenkovic P. TF32 [computer program]. University of Wisconsin-Madison, Department of Electrical and Computer Engineering; 2002.
  49. Pueschel S. Clinical aspects of Down syndrome from infancy to adulthood. American Journal of Medical Genetics. 1990;7:52-56.
  50. Pueschel S. The person with Down's syndrome: Medical concerns and educational strategies. In: Lott I, McCoy E, editors. Down syndrome: Advances in medical care. New York: Wiley-Liss; 1992. p.53-60.
  51. Ardran G, Harker P, Kemp F. Tongue size in Down syndrome. Journal of Mental Deficiency Research. 1972;16:160-166.
  52. Henderson S. Motor skill development. In: Lane D, Stratford B, editors. Current approaches to Down syndrome. Westport, CT: Praeger; 1985. p.187-218.
  53. Katz S, Kravitz S. Facial plastic surgery for persons with Down syndrome: Research findings and their professional and social implications. American Journal on Mental Retardation. 1989;94:101-110.
  54. Parsons C, Iacono T, Rozner L. Effect of tongue reduction on articulation in children with Down syndrome. American Journal of Mental Deficiency. 1987;91:328-332.
  55. Connaghan K. Jaw stiffness during speech in children with suspected hypo- and hypertonia. Unpublished doctoral dissertation, University of Washington; 2004.
  56. Miller J, Stoel-Gammon C, Leddy M, Lynch M, Miolo G. Factors limiting speech development in children with Down syndrome. Miniseminar presented at the annual convention of the American Speech-Language-Hearing Association, San Antonio, TX; 1992.
  57. Stoel-Gammon C. Phonological analysis of four Down's syndrome children. Applied Psycholinguistics. 1980;1:31-48.
  58. Dodd B, Leahy P. Phonological disorders and mental handicap. In: Beveridge M, Conti-Ramsden G, Leudar I, editors. Language and Communication in Mentally Handicapped People. London: Chapman and Hall; 1989. p.33-56.
  59. Rosenberg S, Abbeduto L. Language and communication in mental retardation: Development, process and intervention. Hillsdale, NJ: Erlbaum; 1993.
  60. Ingram D. Toward a theory of phonological acquisition. In: Miller J, editor. Research on child language disorders: A decade of progress. Austin, TX: Pro-Ed; 1991. p.55-72.
  61. McLeod S, van Doorn J, Reed V. Normal acquisition of consonant clusters. American Journal of Speech-Language Pathology. 2001;10:99-110.
  62. Adams SG. Rate and clarity of speech: An x-ray microbeam study. Unpublished doctoral dissertation, University of Wisconsin-Madison; 1990.
  63. Peterson G, Barney H. Control methods used in a study of the vowels. The Journal of the Acoustical Society of America. 1952;24:175-184.
  64. Ladefoged P. A course in phonetics. New York: Harcourt Brace Jovanovich; 1975.
  65. Kent R, Reed C. The acoustic analysis of speech. San Diego, CA: Singular Publishing; 1992.
  66. Weismer G, Martin R. Acoustic and perceptual approaches to the study of intelligibility. In: Kent R, editor. Intelligibility in Speech Disorders: Theory, Measurement, and Management. Amsterdam: John Benjamins; 1992. p.68-118.
  67. Weismer G. Motor Speech Disorders. In: Hardcastle WJ, Laver J, editors. The Handbook of Phonetic Sciences. Cambridge, MA: Blackwell Publishers Ltd; 1997. p.191-219.
  68. Stevens K. Articulatory-Acoustic-Auditory Relationships. In: Hardcastle WJ, Laver J, editors. The Handbook of Phonetic Sciences. Cambridge, MA: Blackwell Publishers Ltd; 1997. p.191-219.
  69. Kent R, Netsell R, Bauer L. Cineradiographic assessment of articulatory mobility in the dysarthrias. Journal of Speech and Hearing Disorders. 1975;40:467-480.
  70. Kent R, Netsell R. Articulatory abnormalities in athetoid cerebral palsy. Journal of Speech and Hearing Disorders. 1978;43:353-373.
  71. Tjaden K, Turner G. Spectral properties of fricatives in ALS. Journal of Speech, Language, and Hearing Research. 1997;40:1358-1372.
  72. Goffman L. An integrative model of language and motor contributions to phonological development. In: Kamhi A, Pollock K, editors. Phonological Disorders in Children. Baltimore, MD: Paul H. Brookes Publishing Co; 2005. p.51-64.
  73. Oller D, Eilers R. Phonetic expectation and transcription validity. Phonetica. 1975;31:287-304.


This work was funded by NIH R01 DC00319, R01 DC003723 and R03 DC005902. The views expressed in this paper do not necessarily represent the views of the National Science Foundation or the United States.

Received: 10 April 2008; Accepted: 17 April 2008; Published online: August 2009

Appendix A: Target words (listed first) and corresponding error foils presented on the multiple-choice test, listed by phonetic category

Phonetic contrast: Tokens
Front-back vowel: pill-pull, shoot-sheet, geese-goose, food-feed, chop-chap, food-fad, knew-knee
High-low vowel: shoot-shot, geese-gas, geese-guess, hid-had, pig-peg, dish-dash, pill-pal, hate-hit, road-rid, won-win, sheet-shot, big-bag, big-beg, knew-gnaw
Long-short vowel: leak-lick, ship-sheep, fast-fist, hid-heed, school-skill, dog-dig, box-beaks
Voiced-voiceless initial consonant: sip-zip, cash-gash, pig-big, pill-bill, pig-big, coat-goat, coat-tote, road-wrote
Voiced-voiceless final consonant: food-foot, leak-league, side-sight, ate-aid, hid-hit, coat-code, had-hat
Alveolar-palatal fricative: sip-ship, sheet-seat, sew-show, shoot-suit, sell-shell
Place stops: bug-dug, cake-take, grow-crow, bill-dill, box-docks, pig-pick, door-pour, dog-bog, tip-tick, ate-ape
Place fricatives: fast-vast, sew-foe, seed-feed, sell-fell, farm-harm
Fricative-affricate: chop-shop, cash-catch, ship-chip, dish-ditch, farm-charm
Stop-fricative: sip-tip, dish-did, sew-toe, cash-cat, seed-feed, tip-ship, side-tied, hate-hash
Stop-affricate: chop-top, much-muck, cheer-tear, door-chore, tip-chip
Stop-nasal: door-more, knot-dot, bill-mill, had-ham, beat-meat, side-sign
Initial glottal-null: hate-ate, had-add, hail-ale, hand-and
Initial consonant-null: ate-fate, blend-end, row-owe, lend-end, farm-arm, wise-eyes
Final consonant-null: side-sigh, seed-see, meat-me, feed-fee, leak-lee
Initial cluster-initial singleton: grow-row, blend-lend, steak-take, steak-sake, blend-bend, slip-sip, slip-lip
Final cluster-final singleton: seed-seeds, wax-whack, fast-fat, cake-cakes, leak-leaks
/r/-/l/: road-load, leak-reek, row-low, hail-hair, grow-glow
/r/-/w/: won-run, wise-rise, row-woe, wax-racks, read-weed