Articulatory features of phonemes pattern to iconic
meanings: Evidence from cross-linguistic ideophones
Arthur Lewis Thompson, Thomas Van Hoey, Youngah Do
Iconic words are supposed to exhibit imitative relationships between their linguistic forms
and their referents. Many studies have worked to pinpoint sound-to-meaning correspondences
for ideophones from different languages. The correspondence patterns show similarities across
languages, but what makes such language-specific correspondences universal, as iconicity
is claimed to be, remains unclear. This could be due to a lack of consensus on how to describe and
test the perceptuo-motor affordances that make an iconic word feel imitative to speakers. We
created and analysed a database of 1,860 ideophones across 13 languages, and found that 7
articulatory features, physiologically accessible to all spoken language users, pattern according
to semantic features of ideophones. Our findings pave the way for future research to utilize
articulatory properties as a means to test and explain how iconicity is encoded in spoken
language. The perspective taken here fits in with ongoing research on embodiment, motivation,
and iconicity, three major strands of research within Cognitive Linguistics. The results
support that there is a degree of unity between the concepts of imitative communication and
the spoken forms of through cross-domain mappings, which involve physical articulatory
Keywords: iconicity, ideophones, articulation, phonology, phonosemantics
Declaration of interest: none.
Iconicity in spoken language is an imitative mapping or relationship between a linguistic
form and its meaning (Hinton et al. 1994; Emmorey 2014). One fundamental example of
iconicity in spoken language is onomatopoeia, as in the English woof woof for the sound of a
dog bark or vroom vroom for the sound of revving a car engine. An implicit assumption behind
iconicity in spoken language is that phonemes are associated with specific units of meaning,
acting as imitative scaffolding that comes together to form a meaningful structure (see Figure
1 for Japanese). For example, the /ŋ/ in English /diŋ.doŋ/ seems characteristic of the
reverberating echo of a bell tolling, while the alternating /i/ and /o/ seems characteristic of a
Department of Linguistics, the University of Hong Kong (email@example.com)
Department of Linguistics, the University of Hong Kong (firstname.lastname@example.org)
Department of Linguistics, the University of Hong Kong (email@example.com)
perceived movement or perceived fluctuation in pitch as the bell tolls. While various studies
have worked to list phonemic sound-to-meaning correspondences for a given language
(McCune 1983; Maduka 1988; Oswalt 1994; Hamano 1998; 2019; Ofori 2009; Assaneo et al.
2011; Akita et al. 2013; Ayalew 2013; Kwon and Round 2015; Blasi et al. 2016; De Carolis et
al. 2017; Strickland et al. 2017; Aryani 2018; Kawahara et al. 2018), it is unclear why or how
such correspondences exist in the first place.
To understand why these sound-meaning correspondences exist, we need to ask: what
properties make speech sounds imitative? Answering this question would allow linguists and
cognitive scientists to move toward a more unified understanding of what in the spoken
modality should be classified as “iconic” and why. Speech sounds consist of both aural
(acoustic) and kinetic (articulatory) properties. Without wanting to discount the importance of
acoustics, this paper opts to examine articulatory, and therefore gestural or movement-based,
properties of ideophones—words known to be imitative or depictive (Dingemanse 2012; 2019;
Akita and Dingemanse 2019).
Ideophones are marked words which depict sensory imagery (Dingemanse 2012) and
belong to an open lexical class because speakers are known to improvise them on the spot
(Dingemanse 2019). Ideophones include onomatopoeia but further span a range of imitative
meanings beyond just that of SOUND (Dingemanse 2012: 663), e.g., MOTION in kamúkamú
‘countermovement of buttocks while walking’ of Pichi (Yakpo 2019), COGNITIVE STATES in
ŋẽʔŋẽʔ ‘manner of being baffled or dazed’ of Chaoyang (Zhang 2016), or OTHER SENSORY
PERCEPTIONS in chun ‘complete absence of sound’ of Pastaza Quichua (Nuckolls and
Swanson 2019). Recent studies have likened ideophones to gestures made with the mouth,
given their synchrony with iconic hand gestures in natural speech (Nuckolls 2000; Dingemanse
2013; 2015; Mihas 2013; Hatton 2016). In her fieldwork, Hatton (2016: 47) noted that speakers
consistently executed gestures depicting something and simultaneously said ideophones which
depictively corresponded to those gestures. Ideophones as “oral gestures” is a notion that
highlights the importance of articulatory movement in our pursuit of understanding just how
ideophones mean what they mean. Ideophones have been shown to be easily learnable by
speakers from different language backgrounds, which may speak to their gesturally imitative
nature—where meaning is encoded and perceivable despite obvious differences between
languages, such as phonotactics, phonological inventory, or lexical associations (Iwasaki
2007a; 2007b; Dingemanse et al. 2016; Lockwood et al. 2016). If we understand how
movement or gesture is meaningful in the context of ideophones then we should be able to
know (1) why sound-meaning correspondences exist, and (2) what properties make speech
sounds imitative. Ideophones are an ideal testing ground for how articulatory properties, i.e.,
mouth movements, pattern to meaning.
Vocal imitations and onomatopoeia created spontaneously by participants in experimental
settings (Assaneo et al. 2011; Perlman et al. 2015; Lemaitre et al. 2016; Perlman and Lupyan
2018; Taitz et al. 2018), although improvised and therefore not lexical, have been shown to
exhibit consistent sound-meaning correspondences. These correspondences can also be
attributed to patterns in articulation (Assaneo et al. 2011; Taitz et al. 2018), reinforcing our
investigative focus on the articulatory properties of phonemes in ideophones because contrasts
among them are realized with varying articulatory parameters (see Section 3.3).
In a methodological vein similar to Blasi et al. (2016), our study looks at whether
articulatory features of consonants (e.g., occlusion of airflow, sibilant airflow, nasality) and
vowels (e.g., high and back tongue positions, rounding of the lips) are more or less attested in
certain semantic domains of ideophones (e.g., telic events, human vocal sounds, motion).
If an articulatory gesture is more attested in one semantic domain of ideophones
than another, this could explain why some sound-meaning correspondences might be perceived
as imitative and therefore iconic of a given percept. Such correspondences would therefore be
explainable as perceptuo-motor affordances grounded in gestural means, e.g., the total closure
of plosive articulation affords the semantic category of telic events and their percept “coming
to an abrupt stop.” We created a database of ideophones from 13 languages (in total, 1860
ideophones) to carry out our investigation on how articulatory properties of phonemes pattern
with ideophone meaning.
2.1 Phonosemantics: the study of sound-meaning correspondences
The subfield of phonosemantics subscribes to a broad hypothesis that “every phoneme is
meaning-bearing” in a word and that that meaning “is rooted in its articulation” (Diffloth 1972,
1979; Hamano 1998; see Dingemanse 2018 for review). Phonemic sound-meaning
correspondences, henceforth phonosemantic mappings, have been proposed for a number of
languages (Maduka 1988; Waugh 1994; Hamano 1998; Oswalt 1994; Assaneo et al. 2011;
Akita et al. 2013; Ayalew 2013; Kwon and Round 2015; Blasi et al. 2016). For example, the
appearance of /p, b/ in ideophones to do with explosions, expectoration, or releases of pressure
is explained through the articulatory properties of /p, b/ themselves (a blockage of airflow then
followed by a release) which gesturally resemble those meanings. In the present study, even
though we do not assume that absolutely all phonemes are necessarily meaning-bearing in all
contexts, we do subscribe to the notion that the phonosemantic mappings of ideophones should
be “rooted in its articulation,” following previous studies (Diffloth 1994; Oda 2000; Strickland
et al. 2017; Taitz et al. 2018).
Figure 1 illustrates how Hamano (1998: 40) assigned phonosemantic mappings to the
CVCV root structure of Japanese ideophones. The tier structure (upper box of Figure 1)
illustrates the broader categories of meaning in the CVCV context, i.e., if a phoneme is in X
position it depicts a Y kind of percept (Talmy 2000). The lower box of Figure 1 shows what
each phoneme specifically means in the Japanese ideophone poka-poka ‘a dull, hollow sound.’
Unlike Blasi et al. (2016), we do not base our analysis on a cross-linguistic set of words resembling a
Swadesh list but, instead, focus on ideophones, i.e., words which are perceived as imitative in nature. The semantic
domains in our study follow descriptive and theoretical work on ideophone meaning (Diffloth 1972; 1979;
Dingemanse 2012; Hamano 1998; Van Hoey 2018; Nuckolls et al. 2017).
Figure 1: Hamano’s (1998:40) phonosemantic analysis of Japanese ideophones
exemplified by pokapoka ‘a dull, hollow sound’.
Although Hamano’s (1998) analysis of Japanese ideophones draws conclusions which have
since been disputed by some (Haiman 2018: 121-122) and revised by others (Akita et al. 2013;
Nasu 2015), the basic principle remains the same for phonosemantic analyses from other
languages: each phoneme depicts an iconic percept. Hamano’s (1998) tier-based analysis,
though exemplary, is designed with the strict CV phonotactics of Japanese in mind and is not
applicable to other languages.
While intuitive yet cursory phonosemantic analyses are found
throughout the language-specific chapters of Sound Symbolism (Hinton et al. 1994) and
Ideophones (Voeltz and Kilian-Hatz 2001), these often verge on impressionistic and suffer
from a lack of cross-linguistic comparison. This issue can be mitigated by investigating cross-
linguistic ideophone systems for their mappings between phonology and semantics. This is the
route we intend to take in this study.
3 Cross-linguistic ideophone database
Though language-particular databases for ideophones are becoming more widespread, e.g.,
the Chinese Ideophone Database (Van Hoey and Thompson 2020), the Quechua Real Words
project (Nuckolls et al. 2017) or the Multimedia Encyclopedia of Japanese Mimetics (Akita
2016), there is currently no cross-linguistic database dedicated solely to ideophone inventories.
The specific criticism of Hamano’s analysis is that she did not use enough minimal pairs to support the
analysis illustrated in Figure 1 (Haiman 2018: 121-122).
We created a database of 13 languages which were selected with the aim of being as
typologically diverse as possible (see Figure 1) despite the limited number of linguistic
descriptions for ideophone inventories in the world. Our major criterion for selecting languages
is that 40 or more ideophones were reported per source. Number of ideophones per language
and language family are reported in Table 1. Due to their depictive nature, and the various
methods of collection (fieldwork elicitation, dictionaries), the ideophone inventory numbers
reported in Table 1 are not absolute, but instead reflect a general picture about the semantic
“visibility” of ideophones per language. This is in line with a claim recently put forth by
Dingemanse (2019) that ideophones form an open class, speaking to the creative potential of
speakers to coin new ideophones. The languages in our database are as follows: Manyika Shona
(Franck 2014), Uyghur (Wang and Tang 2014), Manchu (Xiao 2015), Chaoyang Southern Min
(Zhang 2016), Ma’ai Zhuang (in prep),
Kam Dong (Gerner 2005), Akan (Ofori 2009), Kisi
(Childs 1988), Kuhane (Mathangwane and Ndana 2014), Pastaza Quichua (Nuckolls et al.
2017), Upper Necaxa Totonac (Beck 2008), Temne (Kanu 2008), and Yakkha (Schakow 2016).
They are alternatively presented in Figure 2 according to geographic distribution.
Figure 2: Geographic representation of the languages in the database
Table 1: Languages, number of ideophones, and language families in the database
Language name [glottocode]
Akan Twi [akan1250]
Kuhane / Mbalangwe [subi1246]
Manyika Shona [shon1251]
Temne / Themne [timn1235]
Chaoyang Southern Min [chao1238]
Ma’ai Zhuang / Langjia Buyang
Kam / Southern Dong [kamm1249]
Pastaza Quichua [nort2973]
Upper Necaxa Totonac [uppe1275]
Ma’ai Zhuang ideophones have been collected during ongoing fieldwork. A full list is available in the OSF
repository, which holds the supplementary materials (https://osf.io/6bhz8/).
3.2 Semantic features
Definitions per ideophone were entered into the database according to how they were
described in the source documentation. Minor stylistic changes in the wording of some
definitions were made to synthesize them across languages, e.g., “sound of a dog’s bark” and
“sound of barking dog” were entered as “sound of barking dog” for consistency. These subtle
differences in the word choice of the source documentation were interpreted as a product of
English syntax rather than of ideophone meaning itself. If an ideophone was reported with
multiple definitions, e.g., “sound of flowing water; peeing”, each was entered separately into
the database, i.e., once for “sound of flowing water”, and once for “peeing”, in line with
strategies set forth in the Cross-Linguistic Data Format paradigm (Forkel et al. 2018).
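The splitting of multi-definition entries described above can be sketched as follows. This is only an illustration, assuming that definitions are separated by semicolons as in the example; the function name, the record layout, and the placeholder form are hypothetical and are not taken from the database itself.

```python
def split_definitions(form, gloss):
    """Return one (form, definition) record per semicolon-separated definition."""
    parts = [part.strip() for part in gloss.split(";")]
    # Drop empty fragments left by a trailing or doubled semicolon.
    return [(form, part) for part in parts if part]


# "FORM" is a placeholder, not an attested ideophone.
records = split_definitions("FORM", "sound of flowing water; peeing")
print(records)
```

Keeping one record per definition means that each definition can later receive its own set of semantic features.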
Definitions for reduplicated forms were entered into the database only if they were
described differently from non-reduplicated equivalents. Each definition was coded with
semantic features created in correspondence with Dingemanse’s (2012: 663) implicational
hierarchy of ideophones (see Akita 2009; McLean 2020; Van Hoey in print for alternative
approaches). Dingemanse’s (2012) hierarchy begins with monomodal depiction of sound as its
most fundamental category and goes on to include four other cross-modal semantic categories:
SOUND < MOVEMENT < VISUAL PATTERNS < OTHER SENSORY PERCEPTIONS < COGNITIVE STATES.
For each of these categories, 9 binary semantic features (Table 2) were created based on
cross-linguistic observations of what ideophones depict across languages
(Hamano 1998; Hinton et al. 1994; Nuckolls et al. 2016; Van Hoey 2018). It is important to
note that these semantic features are not mutually exclusive. An ideophone may be coded for
multiple features, seeing as most ideophones are multisensory (Nuckolls 2019; McLean 2020). For
example, the Chaoyang /hu.hu/ ‘wind blowing’ was coded with [+sound] (because this
ideophone depicts an auditory percept), [-telic] (because this ideophone does not involve a
perceived endpoint of an event), [+wind] (because this ideophone involves a percept created
by the movement of air), and [-motion] (because this ideophone is not depictive of a motion
plus a resulting state or manner).
Table 2: Semantic features used to code ideophones
Feature | Description of positive value [+]
[+/- sound] | depicts auditory information (“the sound of X”)
[+/- loud] | auditory information of inherently high amplitude, e.g., explosion, screaming, shattering
[+/- human] | vocalization made by people, e.g., laughter, crying, talking
[+/- animal] | vocalization made by animals
[+/- motion] | depicts active (“the act of X”) movement, e.g., walking, chopping, splashing, sneaking, flapping, water boiling, bumping, spitting, firecrackers exploding
[+/- wind] | depicts movement of air, bodily or otherwise, e.g., blowing, …
[+/- appearance] | depicts visual information, i.e., how something looks or degrees of visibility
[+/- friction] | depicts rubbing together or rough contact of surfaces (not necessarily active movement), e.g., grinding, rustling, sharpening, hacking up phlegm, tearing cloth
[+/- telic] | depicts an event which reaches completion
While the assignment of semantic features based solely on textual documentation is a far
from perfect methodology when it comes to capturing the subtle nuances, multisensory
percepts, and contextual meaning variations of ideophones, it is not yet clear what sort of
methodology, and what extent of native speaker input, would even render such a goal possible.
Hand gestures have been shown to be an insightful tool when it comes to eliciting semantic
percepts of ideophones which native speakers may find too subtle to verbalize (Dingemanse
2015). There also are at least three ideophone dictionaries which make use of visual
information to explain the meanings of ideophones (Akita 2016; Gomi 1989; Nuckolls et al.
2017). However, since this calibre of detailed documentation is only available for two
languages so far, we are forced to rely on (1) textual definitions for determining semantic
features, and (2) sentence examples (if provided) as a basis for any contextual meaning variation.
Therefore, our assignment of semantic features is based on inference of semantic properties
which are inherent to or implied by the definitions provided in their documentation. Examples
from Kuhane are given in Table 3.
Crucially, if a percept was not explicitly stated then it was not reflected in our assignment
of semantic features. For example, one could imagine that the Kuhane ideophone /gwa/ ‘sound
of entering abruptly’ depicts a kind of visual information, rendering it [+appearance]. However,
because Mathangwane and Ndana (2014) did not specify anything in their definition of /gwa/
about visual information, our semantic coding was thus ‘sound’ [+sound], ‘of entering’
[+motion], and ‘abruptly’ [+telic]. Likewise, /tʃevutʃevu/ ‘looking around continuously’ was
coded as ‘looking’ [+appearance], ‘looking around’ [+motion], and ‘continuously’ [-telic]. A
more challenging example is /tʃootʃoo/ ‘whispering,’ which was coded as [+human] since it is an
action done by humans, [+sound] because it is an auditory percept, and [+wind] because it
involves a depiction of (sibilant) air movement. However, it was not obvious whether to code
‘whispering’ as [+motion] or [-motion]. Is ‘whispering’ an active and discriminable form of
movement, comparable to that of ‘chopping’ or ‘splashing’ or is ‘whispering’ simply an
auditory percept? If a semantic feature was called into question, we chose to err on the side of
caution and thus refrained from coding that feature, i.e., the feature value assigned was negative.
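The coding policy described above, under which any percept that is not explicitly stated or is in doubt receives a negative value, can be sketched as below. This is an illustrative sketch rather than the procedure actually used: the function name is hypothetical, and the label "animal" for the animal-vocalization feature is an assumption.

```python
# The nine semantic features; "animal" is an assumed label for the
# "vocalization made by animals" feature.
SEMANTIC_FEATURES = ("sound", "loud", "human", "animal", "motion",
                     "wind", "appearance", "friction", "telic")


def code_definition(positive):
    """Map every feature to True/False; anything not listed as positive is False."""
    unknown = set(positive) - set(SEMANTIC_FEATURES)
    if unknown:
        raise ValueError(f"unknown features: {sorted(unknown)}")
    return {feature: feature in positive for feature in SEMANTIC_FEATURES}


# Kuhane /gwa/ 'sound of entering abruptly'
gwa = code_definition({"sound", "motion", "telic"})
# Kuhane /tʃootʃoo/ 'whispering': [motion] was in doubt, so it stays negative.
whispering = code_definition({"human", "sound", "wind"})
print(gwa["appearance"], whispering["motion"])
```

Because the default is negative, a feature such as [appearance] for /gwa/ is only ever positive when the source definition states it.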
A native English speaker coded the entire database, and two other native English speakers,
who did not know the purpose of the study, checked whether they (a) fully agreed, (b) maybe
agreed, or (c) disagreed with the coding. The agreement rate between the three raters (the first
author and the two independent raters) was 0.73, and Gwet’s AC1 was 0.80, which indicates
quite high reliability.
We filtered out 14 of the 1,874 ideophones, namely those for which both raters
disagreed with the assigned features. We would like to restate that the purpose of this study is
not to provide a detailed semantic analysis of the remaining 1,860 ideophones. Rather, what
we strive for is to determine whether general properties of ideophone meanings pattern
according to articulatory properties of phonemes.
Table 3: Examples of Kuhane ideophones coded with semantic features
‘sound of entering
‘simmering of a pot’
‘sound of splashing’
‘sound of a
‘sound of a bird’
‘treading on rotten
‘sound of thunder’
Fleiss’s κ was -0.00632, which is quite low. This is presumably due to the complex nature of the rating task,
i.e., judging the multisensoriality of a large number of definitional paraphrases. As argued by Hoek and Scholman
(2017), Gwet’s AC1 (2002) might be a better measure for interrater agreement in linguistics.
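How a kappa statistic can sit near zero while AC1 stays high may be easier to see with a toy calculation. The sketch below is not the study's analysis: it swaps in the simpler two-rater, two-category case (Cohen's κ instead of Fleiss's κ), and the counts are hypothetical, chosen only to show the effect of heavily skewed ratings.

```python
def agreement_stats(both_yes, yes_no, no_yes, both_no):
    """Observed agreement, Cohen's kappa, and Gwet's AC1 (two raters, two categories)."""
    n = both_yes + yes_no + no_yes + both_no
    p_obs = (both_yes + both_no) / n
    p1 = (both_yes + yes_no) / n  # proportion of "yes" from rater 1
    p2 = (both_yes + no_yes) / n  # proportion of "yes" from rater 2
    # Cohen's kappa: chance agreement from the product of the raters' marginals.
    pe_kappa = p1 * p2 + (1 - p1) * (1 - p2)
    kappa = (p_obs - pe_kappa) / (1 - pe_kappa)
    # Gwet's AC1: chance agreement from the mean "yes" proportion.
    pi = (p1 + p2) / 2
    pe_ac1 = 2 * pi * (1 - pi)
    ac1 = (p_obs - pe_ac1) / (1 - pe_ac1)
    return p_obs, kappa, ac1


# Hypothetical counts: the raters agree on 90 of 100 items, almost all "yes".
p_obs, kappa, ac1 = agreement_stats(both_yes=90, yes_no=5, no_yes=5, both_no=0)
print(round(p_obs, 2), round(kappa, 3), round(ac1, 3))
```

With ratings this skewed, the chance-agreement term of κ is almost as large as the observed agreement, so κ falls slightly below zero even though the raters agree on 90% of items; AC1 is much less sensitive to the skew.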
3.3 Articulatory features
All consonants were coded using 7 binary features, listed in Table 4, according to how the lips,
tongue, and airflow are involved in their articulation. Our coding follows a linear order of
phonemes. Just as the semantic features are based on cross-linguistic observations of
ideophones, our articulatory features are also empirically driven by previous ideophone
research in that they have been shown to create lexically contrastive meanings for ideophone
inventories across different languages (Diffloth 1972; 1979; Hamano 1998; 2019; Oswalt 1994;
Strickland et al. 2017; Li 2007; Thompson and Do 2019). It is important to note that the articulatory
features here are different from traditional phonological features, like those of Chomsky and
Halle’s SPE (1968) or Clements’ feature geometry (1985). Our features illustrate contrastive
movements required for a phoneme to be realized but not for a phoneme to be differentiated
from other phonemes per se. For this reason, phonemes like /d/ and /l/ may be assigned an
identical set of feature values. Our features referring to oral contact, tongue resting, and tongue
root were designed to account for manner without reference to place of articulation. The reason
for having [+/- tongue resting] as a feature, as opposed to [+/- tongue movement], was to
acknowledge that tongue body and tongue tip movement may occur in some articulations but
not as an active or direct result of articulation. In these cases, the tongue assumes an inactive
position which may vary slightly according to the sound being made (Gick et al. 2004). See the
accompanying OSF repository (https://osf.io/6bhz8/) for a full list of phonemes coded with their
articulatory features.
Table 4: Articulatory features used to code the consonants of ideophones
Feature | Description of positive value [+] | Examples
[+/- labial] | active movement of the lips | /p, b, …/
[+/- tongue resting] | tongue body and tongue tip are not actively involved in articulation | /p, b, h, ʔ, …/
[+/- tongue root] | usage of back of tongue (dorsum), as with velars | /j, k, g, ŋ …/
[+/- airflow] | air is forced out through a narrow channel in the mouth, as with fricatives | /f, v, s, z, …/
[+/- nasal] | velum is lowered and air escapes through the nasal passage, as with nasals | /m, n, …/
[+/- oral contact] | active contact made either by tongue or lips | /p, b, t, d, …/
[+/- vocal folds] | movement of the vocal folds, as with modal voicing | /b, d, n, r, …/
The binary nature of our 7 features means 14 possible feature values overall. If properties
of iconicity are truly universal, then we predict that the universally accessible properties
captured by our articulatory features should bear the explanatory power for what perceptuo-
motor affordances underpin iconicity and its notions of (analogical) depiction. While some
feature values can subsume others, e.g., [+/- oral contact] subsumes [+/- labial], the decision to
test the subsumable [+/- labial] is again to do with lexical contrasts observed. For example, in
Chaoyang we have [+labial] [+oral contact] /pu.pu/ meaning ‘rapid movement’ and [-labial]
[+oral contact] /tsu.tsu/ ‘whispering.’ Likewise, in Pastaza Quichua we have [+labial] [+oral
contact] /pɑw/ ‘manner of being turned downward’ and [-labial] [+oral contact] /kɑw/ ‘sound
of stepping on dry leaves.’ We also include the subsumable feature [+tongue root], again, given
its ability to create lexical contrasts. For example, in Akan Twi we have [+tongue root] [+oral
contact] /kuu/ ‘call of a large bird’ and [-tongue root] [+oral contact] /tuu/ ‘manner of hitting
with the fist.’ The reason we created these subsuming features was so that the general manner
of the consonant is accounted for regardless of its place of articulation in the oral tract.
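The way these features separate some consonants while leaving others identical can be sketched with a handful of phonemes. The codings below are assumptions for illustration: the positive values follow the example lists in Table 4, all other values are assumed to be negative, /l/ is assumed to match /d/ as the text suggests may happen, and "nasal" is an assumed feature label. The full codings are in the OSF repository.

```python
CONSONANT_FEATURES = ("labial", "tongue resting", "tongue root", "airflow",
                      "nasal", "oral contact", "vocal folds")


def consonant(*positive):
    """Build a full feature assignment from the positively valued features."""
    return {feature: feature in positive for feature in CONSONANT_FEATURES}


# Assumed codings for five consonants (illustration only).
CODING = {
    "p": consonant("labial", "tongue resting", "oral contact"),
    "t": consonant("oral contact"),
    "k": consonant("tongue root", "oral contact"),
    "d": consonant("oral contact", "vocal folds"),
    "l": consonant("oral contact", "vocal folds"),  # assumed identical to /d/
}


def contrast(x, y):
    """List the features whose values differ between two consonants."""
    return [f for f in CONSONANT_FEATURES if CODING[x][f] != CODING[y][f]]


print(contrast("d", "l"))  # no differing features under these assumptions
print(contrast("k", "t"))  # differs in [tongue root] only
print(contrast("p", "k"))
```

Under these assumptions, /k/ and /t/ share [+oral contact] and differ only in [tongue root], which mirrors the Akan Twi contrast between /kuu/ and /tuu/ cited above.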
The vowels attested in ideophones were coded using 5 binary features, listed in Table 5,
according to the position of the tongue relative to the extremities of the oral cavity and whether
lip rounding is involved in articulation.
Table 5: Articulatory features used to code the vowels of ideophones
Feature | Description of positive value [+] | Examples
[+/- high] | tongue positioned higher in the oral cavity; jaw more closed | /i, y, ɪ, ʏ…/
[+/- low] | tongue positioned lower in the oral cavity; jaw more open | /a, ɶ, ɑ, ɒ…/
[+/- front] | tongue positioned toward the front of the oral cavity | /e, ø, ɛ, œ…/
[+/- back] | tongue positioned toward the back of the oral cavity | /ɯ, u, ɤ, o…/
[+/- round] | lips are rounded | /y, ʏ, ø, œ…/
We have 4 specific predictions about the articulatory-semantic feature relations of
consonants based on observations from the phonosemantic literature. These observations are
grounded in perceptuo-motor analogy but have yet to be tested for ideophone inventories across
languages. (1) Fricatives, i.e., [+airflow], have been associated with wind or friction between two
objects (Oswalt 1994; Ofori 2009; Taitz et al. 2018). Improvised vocal imitations have
suggested that (2) consonants involving lip movement, i.e., [+labial], are associated with the
sounds resulting from motion, i.e., [+motion], (3) while dorsal consonants, i.e., [+tongue root],
are associated with movement itself (Taitz et al. 2018) i.e., [+motion] in our feature set. (4)
Stop consonants, characterized by total occlusion of airflow, i.e., [-airflow], have been
observed for ideophones indicating complete, i.e., [+telic], events or events with abrupt endings
(Alpher 1994; Strickland et al. 2017; Taitz et al. 2018). The analysis below aims to inspect
these predictions but also go beyond them.
The four predictions:
(1) a. [+airflow] is associated with [+wind]
b. [+airflow] is associated with [+friction]
(2) [+labial] is associated with [+motion]
(3) [+tongue root] is associated with [+motion]
(4) [-airflow] is associated with [+telic]
Although we have included vowels in our analysis, this was done out of phonological
completeness rather than any predictions regarding phonosemantic mappings. But given the
gestural properties of vowels, there are two predictions to be made following what is posited
for consonants above. We predicted that fricatives, i.e., [+airflow] consonants, would correspond
to the positive semantic features [+wind] and/or [+friction]. High vowels, like fricatives, are
also characterized by a narrowed opening in the oral cavity. Therefore, we predict that [+high]
vowels are also associated with [+wind] and/or [+friction]. Likewise, since lip rounding
involves lip movement, we predict that [+round] vowels are associated with [+motion], as we
have predicted for [+labial] consonants above. Finally, we predict that [+low] vowels are
associated with [+loud] because, according to the Sonority Hierarchy (Clements 1990), low
vowels are theoretically loudest of all vowels. That being said, we caveat these predictions with
the observation that for some minimal pairs the difference of vowels does little to contrast the
ideophone meaning, e.g., English /bæm/ [+front] vs. /bum/ [-front] where both ideophones are
arguably interchangeable in that both depict the slamming (of doors) or bursting/explosion.
Additionally, vowel alternation in reduplicated ideophones, e.g., English /splɪʃ.splæʃ/
‘splashing,’ is understood as characteristic of a fluctuation or rhythmic movement in the event
being depicted (see Hinton et al. 1994; Voeltz and Kilian-Hatz 2001). It would seem then that
consonants (/spl_ʃ/) provide a depictive frame for the event in question, while vowels add a
kind of auxiliary information such as pitch or intensity, e.g., /splɪʃ/ ‘small impact on and into
liquid,’ /splæʃ/ ‘impact on and into liquid,’/spluʃ/ ‘large impact on and into liquid.’ With such
caveats in mind, we cannot be sure whether our aforementioned predictions regarding vowels
will be empirically supported.
5.1 Collostructional methodology
As outlined above, the goal of this study is to check if meaningful relations, such as the four
predictions made above, exist between articulatory and semantic features, and why this is so.
We operationalize this by using the collostructional framework (Stefanowitsch and Gries 2003;
Gries 2019), in order to investigate the correlations between form (articulatory features) and
meaning (semantic features). The fundamental idea of collostructional approaches consists of
tallying co-occurrences of features to be investigated and placing them in a contingency table.
Let us first illustrate this framework with an application from construction grammar.
Suppose we wanted to know which noun is most likely to occur in the English construction
[N waiting to happen] (Stefanowitsch and Gries 2003). One could start by collecting corpus
data to gather all token frequencies of nouns in this position, which would provide some
measure of insight into the usage. However, it is more informative to find out that there is some
degree of special attraction between the N and the constructional slot in [N waiting to happen].
This requires a method to inspect the relative strength between N and the constructional slot.
One then can proceed to make contingency tables for all nouns that are identified in that
construction.
The nouns found for this construction by Stefanowitsch and Gries (2003), together with
their token frequency in brackets are: accident (14), disaster (12), welcome (1), earthquake (1),
invasion (1), recovery (1), revolution (1), crisis (1), dream (1), it (sex) (1), and event (1). As
shown in Table 6, this means that accident occurs 14 times in this construction (cell a), while
21 times there is another word that occurs in it (cell b). The token accident occurs 8,606 times
in other constructions (cell c), while logically there are 10,197,659 constructional contexts that
do not feature accident (cell d).
Table 6: Crosstabulation of accident and the [N waiting to happen] construction
(Stefanowitsch and Gries 2003:219)
 | accident | ¬ accident
[N waiting to happen] | 14 (cell a) | 21 (cell b)
¬ [N waiting to happen] | 8,606 (cell c) | 10,197,659 (cell d)
The next step in the collostructional approach is to use an association measure to calculate
the relative strength between the Ns and the construction. While Gries and Stefanowitsch
initially adopted the Fisher-Yates exact test (Stefanowitsch and Gries 2003; Gries and
Stefanowitsch 2004a; Gries and Stefanowitsch 2004b), which calculates the mutual strength
between N and the constructional slot, there has been a shift towards directional association
measures in more recent work (see Gries 2019). For example, “according” in according to
attracts “to” more than “to” attracts “according”, since to is a simple preposition. Or the
other way around, “instance” attracts “for” more in for instance than “for” attracts “instance”.
Of course, (near) perfect attraction can exist with unique combinations, such as bona fide (Gries
2019:393). A well-established unidirectional association measure that takes contingency into
account (Ellis and Ferreira-Junior 2009; Levshina 2015) is $\Delta P$, which comes in two
directional variants; the details of their calculation are given below
(see 5-6 below). This way it becomes possible to see how much a construction attracts a given
N to its slot, but also conversely how much a given N attracts (or repels) that construction.
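To make the two directions concrete, the counts reported above for accident in [N waiting to happen] (a = 14, b = 21, c = 8,606, d = 10,197,659) can be plugged into the standard ΔP definition, i.e., a difference between two conditional proportions. This is a minimal sketch: the function and variable names are hypothetical, and the cell layout (a and b share the construction, a and c share the word) is taken from the description of Table 6.

```python
def delta_p(a, b, c, d):
    """Return both directional Delta-P values for a 2x2 table.

    a: word in construction      b: other words in construction
    c: word elsewhere            d: neither
    """
    # P(word | construction) - P(word | other constructions)
    construction_to_word = a / (a + b) - c / (c + d)
    # P(construction | word) - P(construction | other words)
    word_to_construction = a / (a + c) - b / (b + d)
    return construction_to_word, word_to_construction


cx_to_word, word_to_cx = delta_p(14, 21, 8606, 10197659)
print(round(cx_to_word, 4), round(word_to_cx, 4))
```

On these counts the construction predicts accident far better than accident predicts the construction, which is the kind of asymmetry a bidirectional measure cannot show.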
Gries (2019) suggests that what then remains is the discussion of the results after bringing
together a number of indicators, such as showing both $\Delta P$ values, token frequency size, dispersion
etc. In our application of this method, we are dealing with type frequencies of ideophone
inventories. Consequently, we will not go as far, but will provide an extra step of linear
regression on which to base our discussion with regards to the four predictions made above.
5.2 Database analysis: consonants
The application of collostructional methods to ideophone-related studies is not new, see
Smith (2015) for a study on phonesthemes in Old Chinese reduplicatives, or Van Hoey (2020)
for applications to ideophones as they occur in Mandarin constructions. This study adopts
a collostructional method for investigating associations between articulatory features and
semantic features. As a reminder, the articulatory features (n = 7 x binary distinction = 14) are
provided in Table 4, and the semantic features (n = 9 x binary distinction = 18) in Table 2. We
counted each logically possible combination of semantic and articulatory features per language,
shown for [wind] and [airflow] in Akan Twi in Table 7 as an example. Calculating this
combination for other languages, together with their respective
values, results in Table 8.
Note that both
values are calculated as follows (5-6). In these formulas, a, b, c, and d stand
for cells in the contingency table, as shown in Table 7.
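The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (one dictionary per ideophone with a set of semantic and a set of articulatory features), the function name `crosstab`, and the toy data are all hypothetical, and the cell order (a, b, c, d) is assumed to match the one used in formulas (5-8).

```python
from collections import Counter


def crosstab(items, semantic, articulatory):
    """Return the cells (a, b, c, d) of a 2 x 2 table for one feature pair.

    Assumed cell layout (hypothetical, chosen to match formulas 5-8):
      a: +articulatory, +semantic    b: +articulatory, -semantic
      c: -articulatory, +semantic    d: -articulatory, -semantic
    """
    counts = Counter(
        (articulatory in item["articulatory"], semantic in item["semantic"])
        for item in items
    )
    return (
        counts[(True, True)],
        counts[(True, False)],
        counts[(False, True)],
        counts[(False, False)],
    )


# Invented toy records, not taken from the database.
toy = [
    {"semantic": {"wind", "sound"}, "articulatory": {"airflow"}},
    {"semantic": {"wind"}, "articulatory": {"labial"}},
    {"semantic": {"motion"}, "articulatory": {"airflow", "labial"}},
    {"semantic": {"telic"}, "articulatory": {"tongue root"}},
]

print(crosstab(toy, "wind", "airflow"))
```

Running the same function over every semantic-articulatory pair and every language would yield one such table per pair and language, which is the input the ΔP formulas need.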
Table 7: Crosstabulation of wind and airflow in Akan Twi
Based on the counts in Table 7, the ΔP of [+wind] to be realized with [+airflow] is
\[ \Delta P_{[+\text{semantic}] \mid [+\text{articulatory}]} = \frac{12}{a+b} - \frac{32}{c+d} = 0.24, \]
and the ΔP of [+airflow] to get involved in the meaning of [+wind] is
\[ \Delta P_{[+\text{articulatory}] \mid [+\text{semantic}]} = \frac{12}{a+c} - \frac{27}{b+d} = 0.18, \]
where the denominators are the corresponding marginal totals of Table 7. These numbers
indicate that in both directions there is an attraction between the semantic feature of [+wind]
and the articulatory feature [+airflow] in Akan Twi, with the semantic → articulatory relation
stronger than the opposite. In order to check if there are any associations between an active
articulator [+ articulatory feature] and the absence of a semantic feature [- semantic feature],
e.g., the pair [+ airflow] and [- wind], we have adapted the formulas for the calculation of the
ΔP values (7-8). In practice, these values will result in the opposite polarity of the
[+articulatory] ~ [+semantic] pairs. In the case of [+airflow] and [-wind],
\[ \Delta P_{[-\text{semantic}] \mid [+\text{articulatory}]} = -0.24, \qquad \Delta P_{[+\text{articulatory}] \mid [-\text{semantic}]} = -0.18. \]
We analyzed only the relation between [+ articulatory] ~ [+/- semantic], excluding the [- articulatory] ~
[+/- semantic] pairs, because a core question here is how active articulatory gestures, realized
as positive articulatory features, are correlated to semantic features.
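\[ \Delta P_{[+\text{semantic}] \mid [+\text{articulatory}]} = \frac{a}{a+b} - \frac{c}{c+d} \tag{5} \]
\[ \Delta P_{[+\text{articulatory}] \mid [+\text{semantic}]} = \frac{a}{a+c} - \frac{b}{b+d} \tag{6} \]
\[ \Delta P_{[-\text{semantic}] \mid [+\text{articulatory}]} = \frac{b}{a+b} - \frac{d}{c+d} \tag{7} \]
\[ \Delta P_{[+\text{articulatory}] \mid [-\text{semantic}]} = \frac{b}{b+d} - \frac{a}{a+c} \tag{8} \]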
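As a minimal sketch of how the four ΔP values in (5-8) follow from one contingency table, the function below computes them from the cells a, b, c, and d. The function name `delta_p`, the dictionary keys, and the example counts are hypothetical and chosen for illustration only; the counts are not those of Table 7.

```python
def delta_p(a, b, c, d):
    """Compute the four ΔP values from a 2 x 2 table.

    Assumed cell layout: a = +art/+sem, b = +art/-sem,
    c = -art/+sem, d = -art/-sem. All marginal totals must be non-zero.
    """
    return {
        "sem|art": a / (a + b) - c / (c + d),      # formula (5)
        "art|sem": a / (a + c) - b / (b + d),      # formula (6)
        "-sem|art": b / (a + b) - d / (c + d),     # formula (7)
        "art|-sem": b / (b + d) - a / (a + c),     # formula (8)
    }


# Invented counts for illustration only.
values = delta_p(10, 30, 20, 140)
for name, value in values.items():
    print(f"{name}: {value:+.3f}")
```

With any table, (7) is the negative of (5) and (8) is the negative of (6), which is the "opposite polarity" mentioned in the text.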
The ΔP values for the combination [+wind] and [+airflow] across languages are provided
in Table 8 as an example. It can be seen that different languages display different relational
strengths for this semantic-articulatory pair. For this specific relation between [+wind] and
[+airflow], the semantic → articulatory ΔP overall shows higher values than the articulatory →
semantic ΔP, suggesting that it is more likely to predict an articulatory feature
[+airflow] from a semantic feature [+wind] than vice versa.
Table 8: ΔP values of [+wind] and [+airflow] for all 13 languages
It is easiest to first glance at the different combinations of ΔP statistics by visualizing them.
The figures for all correlations (see Appendix) show the ΔP correlations with the nine semantic
features for each of the seven articulatory features. The data points are the thirteen different
languages. Warm colors represent the [+ articulatory] ~ [+ semantic] pairs; cool colors the [+
articulatory] ~ [- semantic] pairs. If a datapoint is situated in the upper right quadrant, it means
there is a mutual attraction between semantic and articulatory feature, although not necessarily
of the same strength. In the lower left quadrant, it indicates mutual repellence. Even though
other relations did not occur in our data, a datapoint in the upper left quadrant would indicate
that a semantic feature is more likely to attract an articulatory feature under consideration than
vice versa; and a datapoint in the lower right quadrant would show that an articulatory feature
attracts a semantic feature but this semantic feature does not rely on this articulatory feature at
all. We have also added polygons (brown and turquoise respectively), which display the spread
of the different points, as well as a linear regression line (respectively, red and steel blue) for
each plot, which will be treated below, as the second type of finding. A relatively widespread
polygon indicates that the strengths of the correlations across languages are dispersed; a
narrowly scoped area, on the other hand, indicates that languages pattern more closely together
in terms of the ΔP correlations. Let us illustrate the findings with the pairs involving [+airflow]
as an example in Figure 3.
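The quadrant reading described above can be expressed as a small helper. This is a sketch under one assumption taken from the description of Figure 3: the x-axis holds the articulatory → semantic ΔP and the y-axis the semantic → articulatory ΔP. The function name `quadrant` and its return labels are hypothetical.

```python
def quadrant(x, y):
    """Label one language's datapoint by its position in the plot.

    x: ΔP from articulatory feature to semantic feature (assumed x-axis)
    y: ΔP from semantic feature to articulatory feature (assumed y-axis)
    """
    if x > 0 and y > 0:
        return "mutual attraction"          # upper right
    if x < 0 and y < 0:
        return "mutual repellence"          # lower left
    if x < 0 and y > 0:
        return "semantic attracts articulatory only"   # upper left
    if x > 0 and y < 0:
        return "articulatory attracts semantic only"   # lower right
    return "on an axis"                     # at least one value is exactly zero


print(quadrant(0.18, 0.24))
print(quadrant(-0.18, -0.24))
```

Treating a value of exactly zero as "on an axis" is a design choice made here to keep the four quadrant labels strictly separate.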
Figure 3: ΔP correlations for the articulatory feature [+airflow] and the nine semantic
features. Datapoints in the upper right quadrant for each panel are said to mutually attract
each other; in the lower left quadrant they mutually repel each other. The polygons indicate
the spread of the datapoints. Linear regression lines have been added as well. Warm colors
indicate the values for [+semantic] and [+articulatory] pairs; cool colors for [- semantic] and
[+ articulatory] pairs.
Figure 3 shows the correlations of the articulatory feature [+airflow] with the nine semantic
features under our consideration. Prediction (1a) states that [+ airflow] will be associated with
wind or friction. We see that all data points for the pair [+airflow] and [+wind] (Figure 3, lower
right panel) are in the upper right quadrant, indicating that there is attraction from articulatory
feature to semantic feature (x-axis) and also vice versa from semantic feature to articulatory
feature (y-axis). It can thus be said that the prediction is corroborated. For [+ airflow] and [-
wind] we find the reverse: the active articulator [+ airflow] is mutually repellent with regard
to the absence of [wind]. Turning to the pair [+airflow] and [+friction], prediction (1b) (Figure
3, upper right panel), the story is largely the same: almost all of the language datapoints are
situated in the upper right quadrant. As a consequence of these two pairs, when encountering
fricatives in an ideophone, there is a reasonable probability that wind or friction is depicted.
However, do note that wind and friction rely more on the articulatory features than these
features attract them. In other words, there is mutual attraction, but it is not of the same strength.
In Tables 9-10, we show the semantic-articulatory pairs for which 11 or more of the 13
languages show mutual attraction (Table 9) or mutual repellence (Table 10). The plausible
motivations for these highly correlating pairings will be discussed below.
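The selection rule behind Tables 9-10 can be sketched as a simple count over languages. The function name `robust_relation`, the language labels, and the values below are hypothetical; only the threshold of 11 out of 13 languages comes from the text.

```python
def robust_relation(points, threshold=11):
    """Return the relation shared by at least `threshold` languages, if any.

    points maps a language name to its pair of ΔP values (one per direction).
    """
    attract = sum(1 for x, y in points.values() if x > 0 and y > 0)
    repel = sum(1 for x, y in points.values() if x < 0 and y < 0)
    if attract >= threshold:
        return "mutual attraction"
    if repel >= threshold:
        return "mutual repellence"
    return None


# Invented values for 13 placeholder languages: 12 attract, 1 repels.
toy_points = {f"language_{i}": (0.1, 0.2) for i in range(1, 13)}
toy_points["language_13"] = (-0.05, -0.02)

print(robust_relation(toy_points))
```

A pair for which neither count reaches the threshold returns `None` and would simply not appear in either table.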
Table 9: Mutual attraction (upper right quadrant in the plots, see figures in the appendix)
Table 10: Mutual repellence (lower left quadrant in the plots, see figures in the appendix)
Let us investigate for which pairs the correlations between the two ΔP values are very tight.
We do this by calculating linear
regression models for all pairs. Because we are interested in the predictive ability of the models
rather than the intercept and slope values, we first inspected the F-ratio, omitting ratios smaller
than 1. In this step, no models were omitted. Next, we removed the models whose accompanying
p-values were greater than 0.05, leaving 60 pairs. Finally, we inspected the adjusted R² for the models.
Since we wanted to focus on the tightest fits, we chose an arbitrary cut-off point of 0.90. This
resulted in 11 pairs. Note that this does not mean that other pairs did not have any correlation;
rather, the predictive correlation between the two ΔP values
is the strongest for the remaining pairs, which are listed in Table
11. After the consideration of vowels (Section 5.3), the following discussion section (Section
6) will explain implications of significant correlations found from consonants and vowels
against our predictions.
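The filtering steps can be illustrated with a plain least-squares fit. The sketch below is not the authors' implementation: it assumes one ΔP value is regressed on the other across languages, it covers only the F-ratio and adjusted R² steps, and it leaves out the p-value step because that requires an F distribution, which the Python standard library does not provide. The names `fit_summary` and `passes_filters` and the data are hypothetical.

```python
def fit_summary(xs, ys):
    """Fit y = intercept + slope * x and return (F-ratio, adjusted R²)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)   # one predictor
    # F-ratio with 1 and n - 2 degrees of freedom; infinite for a perfect fit.
    f_ratio = float("inf") if ss_res == 0 else (ss_tot - ss_res) / (ss_res / (n - 2))
    return f_ratio, adj_r2


def passes_filters(xs, ys, min_f=1.0, min_adj_r2=0.90):
    """Apply the F-ratio and adjusted R² cut-offs (p-value step omitted)."""
    f_ratio, adj_r2 = fit_summary(xs, ys)
    return f_ratio >= min_f and adj_r2 > min_adj_r2


# Invented ΔP values for five placeholder languages.
tight_x = [0.0, 0.1, 0.2, 0.3, 0.4]
tight_y = [0.01, 0.12, 0.19, 0.31, 0.40]
loose_y = [0.3, -0.1, 0.2, -0.2, 0.25]

print(passes_filters(tight_x, tight_y))
print(passes_filters(tight_x, loose_y))
```

The adjusted R² penalizes the small number of datapoints per pair (here five, thirteen in the study), which is why it is a stricter criterion than the raw R².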
Table 11: The remaining 11 pairs which have the tightest linear regression model
To sum up, both types of findings show that indeed some articulatory properties of
consonants pattern to semantic features corresponding to aspects of ideophone meaning. This
implies that articulatory schematic properties of phonemes, universally accessible to all
speakers, are important in forming the perceptuo-motor analogies that make ideophone
Below we will discuss possible reasons why some semantic and articulatory feature
pairs display mutual attraction (Table 9), mutual repellence (Table 10), or a tight
correlation between their ΔP values (Table 11).
5.3 Database analysis: vowels
The vowels were analysed following the same method as for the consonants. We had 3
main features, each with binary distinction, resulting in 6 features: [+ front]/[+ back],
[+high]/[+low], [+rounded]/[+unrounded], that were paired with the 9 semantic features. Like
with the consonants, we have two types of findings.
As with the consonants, the first analysis concerns the mutual attraction and repellence of
vowel features and semantic features. As can be seen from Tables 12-13, there was no pair that
occurred for all 13 languages in our sample. To be consistent with the consonant analysis, we
set a threshold to 11 languages: if 11 or more languages show the correlation between the two
features, we analysed those correlations. Under this set of criteria, we found 3 significant
correlations (Table 12). For the pair [+motion] and [+unrounded], while there was an overall
significant correlation between the two, most values are not clustered around the upper right
edge or lower left edge (Figure 4, panel 3). This suggests that a general correlation can be found,
but it is on the weaker end (the maximum absolute value is ca. 0.2). The pair [+sound] and
[+back] (Figure 4 panel 1) has some datapoints that are more skewed toward an upper right
edge, indicating a somewhat stronger correlation between the two, although the range is also
much wider. The pair [+wind] and [+low] (Figure 4, panel 2) showed the strongest negative
values from among 11 languages. For this correlation one can putatively suggest that the sound
of wind typically is perceived as and thus depicted as higher pitched, resulting in an avoidance
of low vowels across ideophone systems.
Table 12: Mutual attraction (upper right quadrant)
Table 13: Mutual repellence (lower left quadrant)
Figure 4: ΔP correlations for the
three vowel features that display mutual repellence and mutual attraction. Note that we have
used a different scale to present these three panels than the one used in all other figures.
Like for consonants, the second analysis focuses on the highest F-ratios, the concomitant
p-value (< 0.05) and the adjusted R² (> 0.90). As can be seen in Table 14, we find 17 pairs for
which the two ΔP values are quite similar: if there is a positive attraction from articulatory feature
to semantic feature, e.g., [+low] and [+telic], there is also an almost equal positive attraction
from semantic feature to articulatory feature. These correlations are meaningful, but the
correlation points from different languages are partly scattered and partly clustered
together, meaning that there are no clear systematic patterns in terms of the directions of the
correlations. This distributional tendency differs from the analysis of consonants, where
almost all correlations were found in similar clusters, indicating that languages show a similar
distribution for each correlation.
Table 14: The remaining 17 pairs which have the tightest linear regression model
Our analysis shows that certain articulatory properties map to semantic features of
ideophones in almost all 13 languages. Broadly speaking, we have shown that articulatory
properties of phonemes, physiologically accessible to all spoken language users, are
meaningful for ideophones across multiple unrelated languages. This provides empirical
support that ideophones serve as units of depictive movement which function much like iconic
hand gestures except that ideophones are made with the mouth and not the hands (Nuckolls
2000; Dingemanse 2013; 2015; Mihas 2013; Hatton 2016). Our results show that mouth
movement can serve as a basis for establishing the depictive nature of ideophones through
analogy between linguistic sounds and what is perceived or observed by speakers. While
studies have shown that perceptuo-motor analogies are attested in novel words improvised by
participants in laboratory settings (Assaneo et al. 2011; Taitz et al. 2018), our study shows that
perceptuo-motor analogies exist in real words, consistently, across multiple languages.
Perceptuo-motor analogies are termed phonosemantic mappings henceforth.
We tested for mutual attraction and repellence to show which articulatory features have
meaningful relations with semantic features, and used linear regression to test how tightly the
two directions of association correlate. This analysis was conducted for both consonants and vowels. The
phonosemantic mappings exhibited by mutual attraction and linear correlation are largely to do
with articulatory properties of consonants, rather than vowels, as predicted. Aside from
[+wind], semantic features pertaining to acoustic information, e.g., [+/-loud], [+/-sound], [+/-
human vocal], [+/- animal vocal], did not pass our analysis and its threshold of significance in
11 or more languages. Our articulatory features, based on movement, do not capture acoustic
depiction very well. This also reflects an already observed disconnect between depictive
movement and its relevance to depicting sound. From her fieldwork on Pastaza Quichua
ideophones, Hatton (2016) found that onomatopoeia, i.e., ideophones depicting sound, almost
never occurred with iconic hand gestures, a stark contrast from the other ideophones she
analysed. It seems likely that movement is irrelevant when the depictive aim is only that of
sound. In a similar vein, the semantic feature [+/- appearance], relating to the depiction of
visual information, did not come through our analysis either. It is not surprising, then, that the
semantic features which did make it through our analysis are monomodal: properties of
movement (articulation) depict properties of movement (motion-related events). Summarily,
movement for movement.
More specifically, our database analysis results show that phonosemantic mappings as
proposed in the literature (§4, predictions 1-3) are supported, while [+/-tongue root] was not
significant for [+/- motion] as claimed by hypothesis (4) and [-airflow] was not significant for
[+telic] in terms of attraction (see Table 11 for linear model). According to our analysis of
mutual attraction, four modes of articulation create robust cross-linguistic patterns with regards
to imitative meaning: lip movement, tongue in resting position, airflow, and involvement of
tongue root. Repellence was shown for the pairs [+velum] and [+friction] as well as [+tongue
root] and [+wind], highlighting the inability or, at least, unlikelihood for certain articulatory
gestures to scaffold certain semantic features. This suggests that the imitative nature of
ideophones arises from analogies afforded by such articulatory properties. That is to say,
imitative words to some extent derive their depictive meaning through their articulation,
implying that articulatory properties of speech are a potential route for explaining why
ideophone meanings are more easily learned and guessed than those of other words
(Lockwood et al. 2016; Dingemanse et al. 2016; Iwasaki et al. 2007a; 2007b). By extension, words
of contested iconic nature could thus be deemed more or less iconic depending on whether their
articulatory properties support such a claim.
Table 15: Predicted phonosemantic mappings and their results
Predicted phonosemantic mapping
see mutual attraction, Table 9
see mutual attraction, Table 9
see mutual attraction, Table 9
[+motion] [+tongue root]
only 10 languages showed mutual attraction, but
see Table 11 for linear model
no mutual attraction attested, but see Table 11 for
If iconicity is imitative due to relations made between sensory percepts and movements
(Dingemanse et al. 2015), then articulatory properties should likewise map to semantic features
for reasons grounded in perceptuo-motor analogy. In Table 16, we propose analogical
explanations that allow these articulatory properties to pattern with their semantic features and
are in turn embedded in ideophones on a sub-phonemic level.
Table 16: Analogical justifications for attraction relations between articulatory and
semantic feature pairs across at least 11 of 13 languages
Justification (≈ analogical to)
continual airflow ≈ air movement
airflow sibilance ≈ sibilance of friction and/or rubbing together of two surfaces
movement of lips ≈ motion depiction
see [+motion] above, tongue resting allows for movement of the lips to depict motion
the occlusion of airflow ≈ end of an event
see [+wind] above, tongue resting allows for airflow to exit the mouth uninhibited
There are a few things worth noting regarding the overlap of semantic features. Firstly, the
articulatory feature [+airflow] corresponds to semantic features [+friction] and [+wind] but not
to motion. This does not imply that [+friction] ideophones are not coded for movement-related
meaning (as friction must imply some kind of movement). Rather, this implies that ideophones
which are not necessarily to do with motion,
and are thus beyond motion on Dingemanse’s
(2012) semantic hierarchy for ideophones, involve [+airflow]. With that in mind, the finding
that [+labial] corresponds to [+motion] would imply that some (not necessarily complete)
occlusion of airflow made by contact with the articulators, is involved in the perceptuo-motor
analogy of [+motion]. This is because [+labial] allows for labio- and labiodental fricatives
which are consonants coded as [+airflow]. However, here we would argue that it is the
movement of the articulators, not the blockage of air, which affords this analogy of movement
and, perhaps, the visible movement of lips. The phonosemantic mapping of [+telic] to [+tongue
root] seems likewise less straightforward. What about tongue root involvement analogically
relates to an event reaching completion? Upon closer inspection, the majority of [+tongue root]
consonants in [+telic] ideophones are /k/ and /ŋ/, both of which involve an occlusion of airflow
in the mouth.
This would make our [+telic] to [+tongue root] phonosemantic mapping
equivalent to our predicted [+telic] to [-airflow] phonosemantic mapping (see Section 4). There
is a phonotactic explanation for why the nasal /ŋ/ maps to [+telic]. It has been proposed that
the coda position of syllables maps to the end of depicted events (see Thompson and Do 2019).
However, not all languages allow stops in the coda position of syllables, such as Akan Twi,
Manyika Shona, Kisi, or Upper Necaxa Totonac, but these languages do allow nasals like /ŋ/
in coda position, thus permitting the occlusion of airflow which in turn affords the [+telic]
mapping. This is the case for Japanese, where only the nasal /n/ is allowed in coda position and
ideophones ending in /n/ are considered [+telic] (Akita 2009), e.g., gachagacha ‘clattering’ (of
dishes) versus gachan ‘clank’ (of a single dish being set down).
Note: There are very few ideophones in our database which are [+motion] but [-sound]. If ideophones are [+motion]
they are almost always [+sound], implying that the sound is resultative of the motion and somehow semantically
entails it. For example, an ideophone for ‘the sound of footsteps’ would be [+sound] and [+motion]. The reverse
however is not true. For example, ‘the sound of a cow’ or ‘the sound of wind blowing’ is [+sound] but [-motion].
Note: For [+telic] ideophones, the [+tongue root] consonants across all languages are: /k/ (138), /ŋ/ (70), /j/ (28), /x/
(9), /c/ (4), /g/ (4).
Finally, since ideophones frequently co-occur with iconic hand gestures, timed with the
peak of the hand gesture, and since we have shown that some ideophones are depictive through
movement, ideophones are perhaps spoken language equivalents of what is known as Echo
Phonology (Woll and Sieratzki 1998; Woll 2001; 2009; 2014), a phenomenon observed in sign
languages whereby mouth movements are timed to hand movements in iconic signs. As with
the articulation of ideophones, mouth movements in the Echo Phonology of sign languages
also differ according to the hand movement with which they occur. In the context of
evolutionary linguistics, Woll (2014) discusses the importance of Echo Phonology as a
contemporary look into how iconic hand gestures can link to or result in speech sounds, and
what this link might mean for the historical emergence of spoken (proto-)language. This paper
also shows that ideophones provide fertile ground for examining this potential link, especially
if future ideophone research involves hand gestures.
Our results show that certain articulatory properties pattern with semantic features while
others do not. Therefore, some perceptuo-motor analogies could be language-specific. These
language-specific results may have come about for a number of reasons. Firstly, phoneme
inventories differ across languages so it is inevitable that some languages make use of certain
articulatory features less than others, e.g., voicing. Crucially, we did not take predictable
phonotactic processes into account when entering the ideophones into our database.
Phonotactic processes could result in the addition or deletion of certain segments in order to
satisfy language-specific phonological rules, thus potentially obscuring and/or skewing the
articulatory features present for imitative purposes only. Another possible reason that some
patterns were not borne out could be the kind of semantic features used in this study.
Additional semantic features may have brought more patterns to light. What we would like to
emphasize, however, is that our main goal here was to see if there were any cross-linguistic
articulatory-semantic patterns despite the presence of language-specific phonotactic patterns.
The significance of six phonosemantic mappings (Table 16) shows that this is possible.
Future directions of research could look into how syllable structure affects the patterning
of articulatory features with semantic features. Given that we only report correlations between
individual articulatory features and individual semantic features, future tests could look at how
features cluster together, e.g., [+labial] [-airflow] or [+telic] [+motion]. Experimental research
could test the results of our study by seeing whether (1) articulatory feature and semantic
feature patterns are easily learnable for novel words or ideophones, and (2) speakers refer to these
articulatory features, or perhaps exaggerate them, when explaining the meaning of
ideophones, as with Dingemanse’s (2015) study on folk definitions of Siwu ideophones.
Overall, our results support phonosemantic mappings grounded in articulatory properties
of phonemes as well as syllable position. Though we did not consider acoustic properties of
phonemes, our findings demonstrate the explanatory power of articulation for imitative
structures in spoken language. Movement is meaningful for constructing imitative units in
spoken language, just as movement is meaningful for shaping visual forms of communication,
such as sign language or hand gestures (Bellugi and Klima 1976; Lieberth and Gamble 1991;
Campbell et al. 1992; Brentari 2010; Lai and Yang 2009; Perniss et al. 2010; Emmorey 2014;
Ortega 2017; Östling et al. 2018; Perlman et al. 2018).
Within Cognitive Linguistics, our results fit within the ongoing investigation of
embodiment (Rohrer 2007; Bergen 2015) and motivation (Radden and Panther 2004). Rather
than studying potential iconicity in the prosaic lexicon (Winter 2019), our study has examined
truly iconic forms of words, i.e., ideophones. By using unidirectional mappings of attraction
and repellence, we have come one step closer to disentangling the cross-linguistic basis of
sound symbolism. However, these findings deserve further experimental testing to do the cognitive
commitment justice (Lakoff 1991; Dąbrowska 2016): we need converging evidence from other
cognitive sciences beyond linguistics, such as psychology, to further delineate the nature of
iconicity. We also recognize that further study of the socio-linguistic contexts (the socio-
semiotic commitment, see Geeraerts 2016) in which ideophones are used is necessary to paint
an even fuller picture. Finally, we note that the fields of iconicity studies and cognitive
linguistics have been colliding to include gesture (Cienki 2016; Occhino et al. 2017), as
mentioned above. The cross-domain mappings between spoken and visual forms of
communication involve articulatory movement and physical motion, and we have shown a
number of ways in which the cross-linguistic support for these mappings may be realized, and
investigated in future research.
Figure 1: Oral contact
Figure 2: Velum (nasal)
Figure 3: Labial
Figure 4: Airflow
Figure 5: Tongue resting
Figure 6: Tongue root
Figure 7: Vocal folds
Figure 8: Front vowels
Figure 9: Back vowels
Figure 10: High vowels
Figure 11: Low vowels
Figure 12: Rounded vowels
Figure 13: Unrounded vowels
Akita, Kimi. 2009. A grammar of sound-symbolic words in Japanese: theoretical approaches to
iconic and lexical properties of mimetics (日本語音象徴語文法：擬音・擬態語の類像的
・語彙的特性への理論的アプローチ). Kobe: Kobe University PhD dissertation.
Akita, Kimi. 2016. A multimedia encyclopedia of Japanese mimetics: A frame-semantic
approach to L2 sound-symbolic words. Cognitive-Functional Approaches to the Study of
Japanese as a Second Language, 46, 139.
Akita, Kimi, & Dingemanse, Mark. 2019. Ideophones (Mimetics, Expressives). Oxford Research
Encyclopedia of Linguistics. Oxford: Oxford University Press.
Akita, Kimi, Mutsumi Imai, Noburo Saji, Katerina Kantartzis, Sotaro Kita. 2013. Mimetic vowel
harmony. In Bjarke Frellesvig & Peter Sells, (eds.), Japanese/Korean Linguistics 20: 115-
129. Stanford, CA: CSLI Publications.
Alpher, Barry. 1994. Yir-Yoront ideophones. In Leanne Hinton, Johanna Nichols, John J. Ohala
(eds.), Sound symbolism, 161–177. Cambridge: Cambridge University Press.
Aryani, Arash. 2018. Affective iconicity in language and poetry: A neurocognitive approach.
Ph.D. dissertation, Freie Universität Berlin.
Assaneo, M. Florencia, Juan Ignacio Nichols & Marcos A. Trevisan. 2011. The anatomy of
onomatopoeia. PLoS ONE 6. http://dx.doi.org/10.1371/journal.pone.0028317
Ayalew, Bezza Tesfaw. 2013. The submorphemic structure of Amharic: Toward a
phonosemantic analysis. Champaign: University of Illinois at Urbana-Champaign Ph.D.
Beck, David. 2008. Ideophones, adverbs, and predicate qualification in Upper Necaxa Totonac.
International Journal of American Linguistics, 74(1), 1-46.
Bellugi, Ursula, & Edward Klima. 1976. Two faces of sign: Iconic and abstract. Annals of the
New York Academy of Sciences 280(1). 514-538. https://doi.org/10.1111/j.1749-
Bergen, Benjamin. 2015. Embodiment. In Ewa Dąbrowska & Dagmar Divjak (eds.), Handbook
of Cognitive Linguistics (HSK Handbucher Zur Sprach-Und Kommunikationswissenschaft
Band 39), 10–30. Berlin: De Gruyter Mouton.
Blasi, Damián E., Søren Wichmann, Harald Hammarström, Peter F. Stadler, Morten H.
Christiansen. 2016. Sound-meaning association biases evidenced across thousands of
languages. Proceedings of the National Academy of Sciences 113(39).
Boersma, Paul & David Weenink. 2019. Praat: doing phonetics by computer [Computer
program]. Version 6.0.53 (https://www.praat.org)
Brentari, Diane. 2010. Introduction. In Brentari, Diane, (ed.), Sign languages: A Cambridge
language survey, 284-311. Cambridge: Cambridge University Press.
Campbell, Ruth, Paula Martin, & Theresa White. 1992. Forced choice recognition of sign in
novice learners of British Sign Language. Applied Linguistics 13(2). 185-201.
Childs, G. Tucker. 1988. The phonology of Kisi ideophones. Journal of African Languages and
Linguistics 10 (2): 165-190.
Cienki, Alan. 2016. Cognitive Linguistics, gesture studies, and multimodal communication.
Cognitive Linguistics 27(4). 603–618. https://doi.org/10.1515/cog-2016-0063.
Chomsky, Noam, & Morris Halle. 1968. The Sound Pattern of English. New York: Harper &
Clements, G. N. 1985. The geometry of phonological features. Phonology 2(1), 225-252.
Clements, George N. 1990. The role of the sonority cycle in core syllabification. Papers in Laboratory
Phonology 1. 283–333.
Dąbrowska, Ewa. 2016. Cognitive Linguistics’ seven deadly sins. Cognitive Linguistics 27(4).
De Carolis, Léa, Egidio Marsico, Christophe Coupé. 2017. Evolutionary roots of sound
symbolism. Association tasks of animal properties with phonetic features. Language &
Communication 54. 21-35.
Diffloth, Gerard. 1972. Notes on expressive meaning. Chicago Linguistic Society, 8: 440–447.
Diffloth, Gerard. 1979. Expressive phonology and prosaic phonology in Mon-Khmer. In T. L.
Thongkum (Ed.), Studies in Mon-Khmer and Thai Phonology and Phonetics in Honor of E.
Henderson (pp. 49–59). Bangkok: Chulalongkorn University Press.
Diffloth, Gérard. 1994. i: big, a: small. In Leanne Hinton, Johanna Nichols & John J. Ohala
(eds.), Sound symbolism, 107–114. Cambridge [England]: Cambridge University Press.
Dingemanse, Mark. 2012. Advances in the cross-linguistic study of ideophones. Language and
Linguistics Compass 6: 654-672. https://doi.org/10.1002/lnc3.361
Dingemanse, Mark. 2013. Ideophones and gesture in everyday speech. Gesture 13(2), 143–165.
Dingemanse, Mark. 2015. Folk definitions in linguistic fieldwork. Language Documentation and
Endangerment in Africa, 215-238.
Dingemanse, Mark. 2018. Redrawing the margins of language: lessons from research on
ideophones. Glossa: A Journal of General Linguistics 3(1):1–30.
Dingemanse, Mark. 2019. “Ideophone” as a comparative concept. In K. Akita, & P. Pardeshi
(Eds.), Ideophones, Mimetics, and Expressives (pp. 13-33). Amsterdam: John Benjamins.
Dingemanse, Mark, Damián E. Blasi, Gary Lupyan, Morten H. Christiansen, Padraic Monaghan.
2015. Arbitrariness, iconicity and systematicity in language. Trends in Cognitive Sciences,
19(10), 603-615. https://doi.org/10.1016/j.tics.2015.07.013
Dingemanse, Mark, Will Schuerman, Eva Reinisch, Sylvia Tufvesson, Holger Mitterer. 2016.
What sound symbolism can and cannot do: Testing the iconicity of ideophones from five languages.
Language 92: 117-133. https://doi.org/10.1353/lan.2016.0034
Ellis, Nick C. & Fernando Ferreira-Junior. 2009. Constructions and their acquisition: Islands and
the distinctiveness of their occupancy. Annual Review of Cognitive Linguistics 7. 188–221.
Emmorey, Karen. 2014. Iconicity as structure mapping. Philosophical Transactions of the Royal
Society B 369: 20130301.
Forkel, Robert, Johann-Mattis List, Simon J. Greenhill, Christoph Rzymski, Sebastian Bank,
Michael Cysouw, Harald Hammarström, Martin Haspelmath, Gereon A. Kaiping & Russell
D. Gray. 2018. Cross-Linguistic Data Formats, advancing data sharing and reuse in
comparative linguistics. Scientific Data 5. 180205. https://doi.org/10.1038/sdata.2018.205
Frank, Genevieve E. 2014. Ideophones in Manyika Shona: A descriptive analysis of ideophones
and their function in Manyika (Bantu). Albany: State University of New York at Albany
Honors thesis. https://scholarsarchive.library.albany.edu/honorscollege_ling/2.
Geeraerts, Dirk. 2016. The sociosemiotic commitment. Cognitive Linguistics 27(4). 527–542.
Gerner, Matthias. 2005. Expressives in Kam (Dong): A study in sign typology (part II). Cahiers
de Linguistique Asie Orientale, 34(1), 25–67.
Gick, Bryan, Ian Wilson, Karsten Koch, & Clare Cook. 2004. Language-specific articulatory
settings: evidence from inter-utterance resting position. Phonetica 61: 220-233.
Gomi, Taro. 1989. An illustrated dictionary of Japanese onomatopoeic expressions. Tokyo:
Gries, Stefan Th. & Anatol Stefanowitsch. 2004a. Covarying collexemes in the into-causative. In
Michel Achard & Suzanne Kemmer (eds.), Language, culture and mind, 225–236. Stanford,
CA: CSLI Publications.
Gries, Stefan Th. & Anatol Stefanowitsch. 2004b. Extending collostructional analysis: A corpus-
based perspective on ‘alternations’. International Journal of Corpus Linguistics 9(1). 97–129.
Gries, Stefan Th. 2019. 15 years of collostructions: Some long overdue additions/corrections
(to/of actually all sorts of corpus-linguistics measures). International Journal of Corpus
Linguistics 24(3). 385–412. https://doi.org/10.1075/ijcl.00011.gri
Gwet, Kilem. 2002. Kappa statistic is not satisfactory for assessing the extent of agreement
between raters. Statistical methods for inter-rater reliability assessment 1(6). 1–6.
Haiman, John. 2018. Ideophones and the evolution of language. Cambridge: Cambridge
Hamano, Shoko. 1998. The sound-symbolic system of Japanese. Tokyo: Center for the Study of
Language and Information.
Hamano, Shoko. 2019. Monosyllabic and disyllabic roots in the diachronic development of
Japanese mimetics. In Kimi Akita & Prashant Pardeshi (eds.), Ideophones, mimetics and
expressives (Iconicity in Language and Literature, ILL 16), 57–75. Amsterdam: John
Hatton, Sarah. 2016. The onomatopoeic ideophone-gesture relationship in Pastaza Quichua.
Brigham Young University MA Thesis. http://scholarsarchive.byu.edu/etd/6123.
Hinton, Leanne, Johanna Nichols & John J. Ohala (eds.). 1994. Sound symbolism. Cambridge:
Cambridge University Press.
Hoek, Jet & Merel C.J. Scholman. 2017. Evaluating discourse annotation: Some recent insights
and new approaches. In Proceedings of the 13th Joint ISO-ACL Workshop on Interoperable
Semantic Annotation (isa-13). https://www.aclweb.org/anthology/W17-7401.
Iwasaki, Noriko, David P. Vinson & Gabriella Vigliocco. 2007a. How does it hurt, kiri-kiri or
siku-siku?: Japanese mimetic words of pain perceived by Japanese speakers and English
speakers. In Masahiko Minami (ed.), Applying theory and research to learning Japanese as a
foreign language, 2–19. Newcastle: Cambridge Scholars.
Iwasaki, Noriko, David P. Vinson & Gabriella Vigliocco. 2007b. What do English speakers
know about gera-gera and yota-yota?: A cross-linguistic investigation of mimetic words for
laughing and walking. Japanese-Language Education around the Globe 17. 53–78.
Kanu, Sullay Mohamed. 2008. Ideophones in Temne. Kansas Working Papers in Linguistics, 30:
Kawahara, Shigeto, Atsushi Noto & Gakuji Kumagai. 2018. Sound symbolic patterns in
Pokémon names. Phonetica 75(3). 219–244. https://doi.org/10.1159/000484938.
Kwon, Nahyun & Erich R. Round. 2015. Phonaesthemes in morphological theory. Morphology
25(1). 1–27. https://doi.org/10.1007/s11525-014-9250-z.
Lai, Yu-da & Li-chin Yang. 2009. Iconicity and arbitrariness in Taiwan Sign Language: A
psycholinguistic account. Mingdao Journal 明道學術論壇 5(2). 159–187.
Lakoff, George. 1991. Cognitive versus generative linguistics: How commitments influence
results. Language & Communication 11(1/2). 53–62.
Lemaitre, Guillaume, Olivier Houix, Frederic Voisin, Nicolas Misdariis & Patrick Susini. 2016.
Vocal imitations of non-vocal sounds. PLoS ONE 11(12). e0168167.
Levshina, Natalia. 2015. How to do linguistics with R: Data exploration and statistical analysis.
Amsterdam; Philadelphia: John Benjamins.
Li, Jing’er (李鏡兒). 2007. Xiàndài Hànyǔ nǐshēngcí yánjiū 現代漢語擬聲詞研究
[Onomatopoeias in Modern Chinese]. Shànghǎi: Xuélín chūbǎnshè.
Lieberth, Ann K. & Mary Ellen Bellile Gamble. 1991. The role of iconicity in sign language
learning by hearing adults. Journal of Communication Disorders 24(2). 89–99.
Lockwood, Gwilym, Peter Hagoort & Mark Dingemanse. 2016. How iconicity helps people learn
new words: Neural correlates and individual differences in sound-symbolic bootstrapping.
Collabra 2(1). 7. https://doi.org/1
Maduka, Omen N. 1988. Size and shape ideophones in Nembe: A phonosemantic analysis.
Studies in African Linguistics 19(1). 93–113.
Mathangwane, Joyce T. & Ndana Ndana. 2014. Chiikuhane/Chisubiya ideophones: A descriptive
study. South African Journal of African Languages 34(2). 151–157.
McCune, Keith Michael. 1983. The internal structure of Indonesian roots. Ann Arbor: University
of Michigan PhD dissertation.
McLean, Bonnie. 2020. Revising an implicational hierarchy for the meanings of ideophones,
with special reference to Japonic. Linguistic Typology (aop). https://doi.org/10.1515/lingty-
Mihas, Elena. 2013. Composite ideophone-gesture utterances in the Ashéninka Perené
‘community of practice’, an Amazonian Arawak society from Central-Eastern Peru. Gesture,
Nasu, Akio. 2015. The phonological lexicon and mimetic phonology. In Haruo Kubozono (ed.),
Handbook of Japanese phonetics and phonology, 253–288. Berlin: Mouton de Gruyter.
Nuckolls, Janis B. 2000. Spoken in the spirit of gesture: Translating sound symbolism in a
Pastaza Quechua narrative. In Joel Sherzer & Kay Sammons (eds.), Translating native Latin
American verbal art, 233–251. Washington, DC: Smithsonian Press.
Nuckolls, Janis B. 2019. The sensori-semantic clustering of ideophonic meaning in Pastaza
Quichua. In Kimi Akita & Prashant Pardeshi (eds.), Ideophones, mimetics and expressives
(Iconicity in Language and Literature, ILL 16), 167–198. Amsterdam: John Benjamins.
Nuckolls, Janis B., Joseph A. Stanley, Elizabeth Nielsen & Roseanna Hopper. 2016. The
systematic stretching and contracting of ideophonic phonology in Pastaza Quichua.
International Journal of American Linguistics 82(1). 95–116.
Nuckolls, Janis B., Tod Swanson, Diana Sun, Alexander Rice & Sydney Ludlow. 2017.
Quechua Real Words: An audiovisual corpus of expressive Quechua ideophones.
http://quechuarealwords.byu.edu/ (20 October 2018).
Nuckolls, Janis B. & Tod D. Swanson. 2019. Quechua Real Words: An audiovisual ANTI-dictionary
of expressive Quechua ideophones. http://quechuarealwords-dev.byu.edu/index.php
(12 March 2019).
Occhino, Corrine, Benjamin Anible, Erin Wilkinson & Jill P. Morford. 2017. Iconicity is in the
eye of the beholder: How language experience affects perceived iconicity. Gesture 16(1).
Oda, Hiromi. 2000. An embodied semantic mechanism for mimetic words in Japanese.
Bloomington: Indiana University.
Ofori, Seth Antwi. 2009. A morphophonological analysis of onomatopoeic ideophones in Akan
(Twi). In Jonathan C. Anderson, Christopher R. Green & Samuel G. Obeng (eds.), IUWPL8:
African Linguistics Across the Discipline, 11–44. Bloomington: IULC Publications.
Ortega, Gerardo. 2017. Iconicity and sign lexical acquisition: A review. Frontiers in Psychology
8. 1280. https://doi.org/10.3389/fpsyg.2017.01280.
Östling, Robert, Carl Börstell & Servane Courtaux. 2018. Visual iconicity across sign languages:
Large-scale automated video analysis of iconic articulators and locations. Frontiers in
Psychology 9. 725. https://doi.org/10.3389/fpsyg.2018.00725
Oswalt, Robert L. 1994. Inanimate imitatives in English. In Leanne Hinton, Johanna Nichols &
John J. Ohala (eds.), Sound symbolism, 293–306. Cambridge: Cambridge University Press.
Perlman, Marcus, Hannah Little, Bill Thompson & Robin L. Thompson. 2018. Iconicity in
signed and spoken vocabulary: A comparison between American Sign Language, British Sign
Language, English, and Spanish. Frontiers in Psychology 9. 1433.
Perlman, Marcus & Gary Lupyan. 2018. People can create iconic vocalizations to communicate
various meanings to naïve listeners. Scientific Reports 8(1). 2634.
Perlman, Marcus, Rick Dale & Gary Lupyan. 2015. Iconicity can ground the creation of vocal
symbols. Royal Society Open Science 2(8). 150152.
Perniss, Pamela, Robin L. Thompson & Gabriella Vigliocco. 2010. Iconicity as a general
property of language: Evidence from signed and spoken languages. Frontiers in Psychology
1. 227. https://doi.org/10.3389/fpsyg.2010.00227
Radden, Günter & Klaus-Uwe Panther (eds.). 2004. Studies in linguistic motivation (Cognitive
Linguistics Research 28). Berlin; New York: Mouton de Gruyter.
Rohrer, Tim. 2007. Embodiment and experientialism. In Dirk Geeraerts & Hubert Cuyckens
(eds.), The Oxford handbook of Cognitive Linguistics, 25–47. Oxford: Oxford University
Schakow, Diana. 2016. A grammar of Yakkha. Berlin: Language Science Press.
Stefanowitsch, Anatol & Stefan Th. Gries. 2003. Collostructions: Investigating the interaction of
words and constructions. International Journal of Corpus Linguistics 8(2). 209–243.
Strickland, Brent, Jeremy Kuhn, Philippe Schlenker & Carlo Geraci. 2017. Intuitive iconicity for
events and objects: Telicity and the count/mass distinction across modalities. Workshop on
Event Representations in Brain and Language Development. Nijmegen: MPI for
Psycholinguistics. Oct 27–28, 2017.
Taitz, Alan, M. Florencia Assaneo, Natalia Elisei, Mónica Trípodi, Laurent Cohen, Jacobo D.
Sitt & Marcos A. Trevisan. 2018. The audiovisual structure of onomatopoeias: An intrusion
of real-world physics in lexical creation. PLoS ONE 13(3). e0193466.
Talmy, Leonard. 2000. Toward a Cognitive Semantics: Volume I: Concept structuring systems
(Language, Speech, and Communication). Cambridge, MA: MIT Press.
Thompson, Arthur Lewis & Youngah Do. 2019. Defining iconicity: An articulation-based
methodology for explaining the phonological structure of ideophones. Glossa: A Journal of
General Linguistics. 4(1). 72. https://doi.org/10.5334/gjgl.872.
Van Hoey, Thomas. 2018. Does the thunder roll? Mandarin Chinese meteorological expressions
and their iconicity. Cognitive Semantics 4(2). https://doi.org/10.1163/23526416-00402003.
Van Hoey, Thomas. 2020. Prototypicality and salience in Chinese ideophones: A cognitive and
corpus linguistics approach. Taipei: National Taiwan University PhD dissertation.
Van Hoey, Thomas. in print. A semantic map for ideophones. In Thomas Fuyin Li (ed.),
Handbook of Cognitive Semantics (ch. 16). Leiden: Brill.
Van Hoey, Thomas & Arthur Lewis Thompson. 2020. The Chinese Ideophone Database
(CHIDEOD). Cahiers de linguistique Asie orientale 49(2). 136–167.
Voeltz, Erhard Friedrich Karl & Christa Kilian-Hatz (eds.). 2001. Ideophones (Typological
Studies in Language 44). Amsterdam: John Benjamins.
Wang, Shasha (王沙沙) & Tang Yunfeng (湯允鳳). Hànyǔ nǐshēngcí yǔ Wéiwúěryǔ mónǐcí
duìbǐ qiǎnxī 汉语拟声词与维吾尔语模拟词对比浅析 [A comparative study on
onomatopoeia of Chinese and Uyghur]. Yǔyán yǔ Fānyì (1). 34–37.
Waugh, Linda R. 1994. Degrees of iconicity in the lexicon. Journal of Pragmatics 22. 55–70.
Winter, Bodo. 2019. Sensory Linguistics: Language, perception and metaphor (Converging
Evidence in Language and Communication Research 20). Amsterdam: John Benjamins.
Woll, Bencie. 2001. The sign that dares to speak its name: Echo phonology in British Sign
Language (BSL). In P. Boyes-Braem & R. L. Sutton-Spence (eds.), The hands are the head
of the mouth, 87–98. Hamburg: Signum Press.
Woll, Bencie. 2009. Do mouths sign? Do hands speak?: Echo phonology as a window on
language genesis. In R. Botha & H. de Swart (eds.), Language evolution: The view from
restricted linguistic systems, 203–224. Utrecht: LOT Occasional Series.
Woll, Bencie. 2014. Moving from hand to mouth: Echo phonology and the origins of language.
Frontiers in Psychology 5. 662. https://doi.org/10.3389/fpsyg.2014.00662.
Woll, Bencie & Jechil S. Sieratzki. 1998. Echo phonology: Signs of a link between gesture and
speech. Behavioral and Brain Sciences 21(4). 531–532.
Xiao, Chun 曉春. 2015. Mǎnyǔ nǐshēngcí chúyì 满语拟声词刍议 [Primary Research of Manchu
Onomatopoetic Words]. Manchu Studies 60(1). 19–23.
Yakpo, Kofi. 2019. A grammar of Pichi. Berlin: Language Science Press.
Zhang, Sheng Yu 張盛裕. 2016. Chaoyang Fangyan Yanjiu 潮陽方言研究 [Research on the
Chaoyang Dialect]. Beijing: Social Sciences Academic Press.