Abstract
While the general acoustic mechanisms that explain the development of tone in language have been understood since at least Maspero (1912. Étude Sur La Phonétique Historique de La Langue Annamite: Les Initiales. Bulletin de l’École Française d’Extrême-Orient 12. 1–126), we are still far from having a predictive theory of tonogenesis. Kurtöp, a Tibeto-Burman language of Bhutan shown to be undergoing tonogenesis, provides a rare opportunity to advance our understanding of how and why languages develop lexical tone. This study examines the role that sonority and place of articulation have in the spread of tone from voicing contrasts on preceding consonants in Kurtöp. First, we find that tone is more likely to be produced following fricatives than when following stops. Second, we see that within the stops, tone phonologises more readily following some places of articulation over others. Taken as a whole, this shows us that tone is moving through Kurtöp, following the most sonorous segments first and moving to the least sonorous segments. These findings thus help us refine our theory of tonogenesis and show that functional pressures have strong influences in this particular pathway of sound change.
1 Introduction
Since Haudricourt’s (1954) ground-breaking proposal of tonal development in Vietnamese, we have made considerable progress toward a theory of tonogenesis. It is generally accepted that tone enters a system through a process of lost codas creating a contour tone on the preceding vowel. Tone splits can then be conditioned through voicing contrasts in onsets, usually resulting in high and low tone. Thurgood (2002) expanded our understanding of tonogenesis by examining the role of intermediate phonation between the existing phases of the tonogenetic model. The last sixty years have seen substantial research on this topic, yet there are still considerable unknowns regarding this sound change.
Hyslop (2009) expanded this body of work by examining tonogenesis in progress in the Tibeto-Burman language Kurtöp. Her study examined syllable-initial stops through an acoustic comparison of voice onset time (VOT; Lisker and Abramson 1964) and f0 on the vowel following stops. She addressed the tonal properties (high and low) following the different consonants (specifically, sonorants and palatal fricatives) and the ongoing merger between voiced and voiceless stops. She found that tone is in the process of replacing the existing contrast in voice for stop-initial syllables. However, preliminary results then suggested that the replacement was not occurring equally frequently in all phonological environments (i.e. different manners and places of articulation appeared to behave differently).
This article aims to address the questions left by Hyslop (2009). First, we aim to replicate Hyslop (2009), using a larger sample size of speakers. The second goal of this article is to compare the extent to which tone is fully phonologised in some aspects of the phonology as compared to others. Specifically, we look at f0 on vowels following fricatives versus stops, and different places of articulation of stops. This article has the following structure. §1 continues relevant background, including a discussion of phonologisation, tone and tonogenesis, and an introduction to Kurtöp. §2 presents the methods for the current study. Results are presented next, focussing on the differences between stops versus fricatives in §3 and place of articulation amongst the stops in §4. A discussion follows in §5, highlighting the findings in a theoretical light. We conclude in §6, arguing that intrinsic properties like sonority and place of articulation may be immutable driving factors in sound change.
1.1 Tone and tonogenesis
Yip (2002: 1) distinguishes tone from intonation when ‘the pitch of the word can change the meaning of the word’… in terms of its ‘core meaning’. Traditional definitions of tone are often associated with individual syllables, though many languages only have tone at the word level. DeLancey (2003), for example, describes this type of system in Lhasa Tibetan.[1] This type of melodic tone system is seen in many tone languages around the world, including in Kurtöp (discussed below).
Linguistics has long been concerned with how sounds, including tones, change over time, dating back at least to the Neogrammarian hypothesis of the late 19th century. As phonetics advanced as a discipline, it became increasingly apparent that phonological contrasts are often realised and perceived by a suite of different acoustic features, making the issue of sound change more complex and harder to quantify. For example, ‘voicing’ in English can be reliably realised by VOT in word-initial position but in word-final position the contrast is often saliently seen in the length of the preceding vowel. Duration of stop closure is also a relevant parameter that can be used to capture the distinction between ‘voiced’ and ‘voiceless’ English stops. Multiple acoustic cues have been particularly noted in Korean lenis stops (e.g. Silva 2006; Kang and Guion 2008, and many follow-up studies) and other languages such as Cao Bằng Tai (Pittayaporn and Kirby 2017), Assamese (Dutta and Kenstowicz 2018) and Dzongkha (Kirby and Hyslop 2019), amongst many others. The issues of these multiple cues and how they are utilised in speech can be referred to as ‘cue-weighting’, which is succinctly summarised by Schertz and Clare (2019). They consider cue-weighting in both production and perception and discuss the theoretical link between the two, and along the way provide several clear illustrations of cue-weighting, or how speakers and listeners produce and perceive the most important cues along many competing dimensions.
The issue of multiple cues is also relevant for studies of sound change. It is established that in contexts where multiple cues are documented to be part of a phonemic contrast, the issue of which cue(s) are most salient in production or perception is non-trivial. This applies especially along the diachronic dimension, where one cue may be found to be replacing another as having primary importance. Kirby (2013) examines situations in which multiple cues are involved in sound change. He takes the example of Seoul Korean, in which VOT, f0, spectral tilt and amplitude of the release burst are all potentially relevant perceptual cues to denote the laryngeal setting of the contrast in onset position (Cho et al. 2002; Wright 2007), yet only f0 appears to be the relevant contrast for young speakers today. We can consider such cases as examples of transphonologisation, in which the reliance has shifted from one cue to another over time. The term was first coined by Hagège and Haudricourt (1978), to describe the replacement of one distinctive opposition with another. Looking at sound change in a complex phonetic domain, we can consider transphonologisation to be the shift from primary weighting of one cue to that of another. We further make the assumption in this article – describing a case of tonogenesis – that the shift from a primary acoustic cue of voicing (VOT and presence of voicing in fricative onsets) to a primary cue of f0 on the following vowel represents an example of transphonologisation, signalling a phonological shift, from that of the consonantal domain (voicing) to that of the suprasegmental domain (tone).
Tonogenesis, or the birth of tone, has already been discussed in many languages, including Athabaskan languages (Kingston 2005), Chinese (Chen 2000), Cèmuhî (Rivierre 1993), Kammu (Svantesson and House 2006), Lahu (Matisoff 1970), and Kurtöp (Hyslop 2009), amongst many others. Although the term ‘tonogenesis’ was first used by Matisoff (1970) to describe the phenomenon in Lahu, the discussion of tone from a diachronic perspective probably began with Maspero (1912) and was brought into the modern linguistic spotlight by the prominent work of Haudricourt (1954)’s analysis of Vietnamese tone. Using Vietnamese, Haudricourt discusses tone development through a series of sound changes. These steps, listed below, have become the primary model of tonogenesis and are found widely in the literature on tonogenesis.
1) Contour melodies develop on syllables ending in laryngeal consonants, [h] and [ʔ], while syllables without laryngeals either maintain a level pitch or, in the case of syllables with a stop final, do not exhibit any tonal contours.
2) The laryngeal finals are lost over time and three tonal environments phonologise in their absence: rising (ʔ), falling (h), and level (non-laryngeal, non-stop finals).
3) Tone splits result from the merger of voiceless and voiced initials and associated laryngeal features on the following vowel.
Parts of this model can be found in many examples of tonogenesis. The first two steps involving lost codas are less prevalent around the world, while the process involving initials (step 3) is the most widely reported and represented within tonogenetic literature (Ratliff 2015). The focus in this article is the development of high versus low tone registers from a voicing contrast in onsets (Haudricourt’s step 3), although in Kurtöp this constitutes the initial introduction of tone into the language, because step 1 is specific to the Vietnamese case described by Haudricourt (1954).
The above model has been updated to account for more intermediate stages. For example, Thurgood (2002) argues that features such as ‘breathy voice’ and ‘creaky voice’ also played a role in the development of Vietnamese tone. However, this does not explain the development of tone in hundreds of other languages. In some languages, such as Arapaho (Algonquian language of central North America; Goddard 1991) and Central Tibetan (DeLancey 1989) an entire syllable was lost in words, leading to a tone on the preceding syllable. In other languages, such as Southeastern Monguor (Mongolic language of China; Dwyer 2008) and Balsas Nahuatl (an Uto-Aztecan language of Mexico; Guion et al. 2009) the stress system interacted with other changes in the language, leading to the development of tone. Other factors that have been reported to be involved include vowel length, vowel height, aspiration of consonants before and after vowels, and several others. However, very few of these processes have been studied in any depth; most of our understanding of tonogenesis in these contexts comes from only a handful of words compared across similar languages or dialects. Further, a brief survey of the literature reveals these small cases actually comprise the majority of instances of tonogenesis (Hyslop 2023). It is therefore paramount to study these cases in depth in order to advance our understanding of how and why languages develop tone.
There has also been considerable work in the past decade or so, documenting what happens with tone once it enters language, also called tone change (Dockum 2019; Ratliff 2015). The current study is interested in tonogenesis as a whole, and while tone is already established in Kurtöp, the term ‘tone change’ is less applicable. The tones in Kurtöp are not changing; rather, tone is synchronically present in a small phonological domain (following sonorant consonants and the palatal fricative) and is currently spreading to other phonological domains (following all consonants).
1.2 Kurtöp
Kurtöp is an East Bodish (<Tibeto-Burman) language indigenous to Bhutan. The East Bodish languages are closely related to Tibetic languages but are not direct descendants of Old Tibetan. The precise relationship between Tibetic and East Bodish languages is a matter of ongoing research, which is further complicated by long term, heavy contact between East Bodish and Tibetic languages (see e.g. Hyslop 2021b). There are approximately 4,000–5,000 speakers of Kurtöp[3] in the Lhüntsi district, spread across perhaps a dozen different villages[4] and six mutually intelligible dialects.[5] One grammar of the language has been written (Hyslop 2017). The Kurtöp-speaking region of Bhutan is shown in Figure 1.

Figure 1: Kurtöp-speaking region in Bhutan.
All East Bodish languages are tonal, and the current state of the art shows that they all have tone systems similar to that described for Kurtöp. That is, they all have tone fully contrastive following sonorant onsets and the palatal fricative and incipient tone following the other obstruents. Kurtöp phonology was previously described by Michailovsky and Mazaudon (1994) and Hyslop (2009, 2017). This section provides a brief summary, based on Hyslop (2017).
The nine syllable structures of Kurtöp are given in Table 1. Note that the simple, short vowel appears to be found rarely and only as one syllable in a multisyllabic word, as in the first syllable of ‘food’.
Table 1: Syllable structure in Kurtöp.
Syllable shape | Kurtöp example | Gloss |
---|---|---|
V | í(.pɐ) | ‘food’ |
VV | é: | ‘who’ |
VC | ím | ‘hide.irr’ a |
CV | bɐ̀ | ‘target’ |
CVV | kó: | ‘hoe’ |
CVC | gòr | ‘rock’ |
CCV | bɟɐ̀ | ‘ash’ |
CCVV | brɐ̀: | ‘scratch.irr’ |
CCVC | pʰrúm | ‘cheese’ |
a IRR: irrealis mood (Hyslop 2017).
Syllables can be made up of a minimum of a rime, which can be a short or long vowel, diphthong, or a short vowel plus coda [-p, -t, -k, (-s), -m, -n, -ŋ, -r, (-l)]. The -s coda is not found in all dialects of Kurtöp and is environmentally conditioned in the varieties for which it is found. The -l coda has been lost except when conditioned by verbal suffix apocope and loan words (see Hyslop 2017). The maximum syllable shape is complex onset plus rime.
Previous work has established that Kurtöp has a three-way voicing contrast in stops between voiceless, voiceless aspirated, and voiced,[6] and a two-way contrast in dental fricatives (voiced and voiceless). Any of Kurtöp’s 31 consonants, listed in Table 2, can be used in onset position, and an additional thirteen onset clusters are permissible: /pr-, pʰr-, pj-, pʰj-, pl-, br-, bj-, bl-, kw-, kʰw-, gw-, mr-, mj-/. The onset clusters are undergoing a process of simplification, including the merger of /bl-/ and /br-/, the simplification of /pj-/ to [pç∼pc∼c], of /mj-/ to /ɲ/, and of /kr, kʰr, gr/ to /ʈ, ʈʰ, ɖ/.
Table 2: Kurtöp phonemic consonant inventory given in IPA.
Manner | labial | dental | retroflex | palatal | velar | glottal |
---|---|---|---|---|---|---|
stops | p, pʰ, b | t, tʰ, d | ʈ, ʈʰ, ɖ | c, cʰ, ɟ | k, kʰ, g | (ʔ) |
affricates | | ts, tsʰ | | | | |
fricatives | | s, z | | ç | | h |
nasals | m | n | | ɲ | ŋ | |
laterals | | l, ɬ | | | | |
rhotics | | r | | | | |
glides | w | | | j | | |
Kurtöp has a basic vowel system with the five cardinal vowels, /i, e, ɐ, o, u/, and four diphthongs, /ɐu, iu, ui, oi/.[7] Michailovsky and Mazaudon (1994) included /ai/ in the diphthong inventory, but Hyslop (2017) reports that this has since undergone a sound change /ai/>/e/, in Kurtöp. It appears that Kurtöp diphthongs, like its complex consonant cluster onsets, are undergoing a process of simplification.
Kurtöp stress is predictably found on the initial syllable of a root and realised primarily through the acoustic correlate of duration.[8] Complex onsets and long vowels are also only found on stressed syllables.[9] Stressed syllables almost always bear tone as well, but tone can be separated from stress in that it must be word initial, while stress is root initial. The only instance in which these two circumstances are not identical is when the negative prefix (the only prefix found in the language) is used, causing the tone to move left to the word-initial syllable.
Tone in Kurtöp is found in monosyllables and in word-initial position. In the case of multisyllabic words, then, only the first syllable of the word has tone and the remaining syllables follow a predictable melody: HL or LH (only disyllables have been quantified).
Michailovsky and Mazaudon (1994) presented the first evidence for tonal contrasts in Kurtöp, showing tone as contrastive following the sonorants. They also noted the predictable tone register of the obstruent series, stating that ‘two tonal registers, high and low, correlated with voicing oppositions in the initial consonant, and remnants of initial clusters’ (546). Hyslop (2009, 2017) has corroborated, and expanded upon, this early description of Kurtöp tone.
Tone is contrastive following all sonorants and the palatal fricative, as shown in Table 3. Elsewhere, tone is predictable, with high tone following the remainder of the voiceless obstruents and low tone following the voiced obstruents; see Table 4.
Table 3: Contrastive tone in Kurtöp.
High tone | Gloss | Low tone | Gloss |
---|---|---|---|
mɐ́ŋ | ‘community; crowd; everyone’ | mɐ̀ŋ | ‘be.excessive’ |
nɐ́m | ‘Perilla frutescens’ | nɐ̀m | ‘sky; weather’ |
ɲú | ‘be.crazy’ | ɲù | ‘borrow’ |
nɐ́p | ‘dry.out’ | ŋɐ̀p | ‘be.thin’ |
rúŋ | ‘make.stand; get up’ | rùŋ | ‘small.storage.basket’ |
lém | ‘flat.spoon’ | lèm | ‘be.delicious’ |
wɐ́ŋ | ‘blessing’ | wɐ̀ŋ | ‘pit’ |
jɐ́p | ‘awning’ | jɐ̀p | ‘wear.on.shoulders’ |
çɐ́m | ‘shoes’ | çɐ̀m | ‘man’s.length.measurement’ |
Table 4: Environmentally conditioned predictable tone in Kurtöp.
High tone | Gloss | Low tone | Gloss |
---|---|---|---|
pɐ́ | ‘meat slice’ | bɐ̀ | ‘target’ |
pʰɐ́t | ‘leech’ | ||
tɐ́ | ‘axe’ | dɐ̀ | expletive |
tʰɐ́ | weaving pattern | ||
ʈɐ́ | ‘change of color’ | ɖɐ̀ | ‘praise’ |
ʈʰɐ́ŋ | ‘climb’ | ||
cɐ́ro | ‘friend’ | ɟɐ̀ | ‘tea’ |
cʰɐ́ | ‘pair’ | ||
kɐ́ | ‘snow’ | gɐ̀ | ‘saddle’ |
kʰɐ́ | ‘language; mouth’ | ||
tsɐ́ | ‘nerves’ | ||
tsʰɐ́ | ‘salt’ | ||
sɐ́ | ‘soil’ | zɐ̀m | ‘bridge’ |
The following discussion is a summary of Hyslop (2009). Kurtöp can be shown to be developing tone in the following order: 1) tone phonologises following sonorant onsets, conditioned by diachronic onset clusters; 2) tone phonologises following palatal fricatives; 3) tone is now predictable following all obstruents (low pitch following voiced; high pitch following voiceless) and stops are in the process of devoicing. That is, tone has replaced a contrast in voicing following sonorant onsets and the palatal fricative, and tone is currently in the process of replacing a voicing contrast following the remainder of the obstruents.
Using comparative evidence, Hyslop (2009) demonstrated how tone first phonologised following sonorants from s- initial sonorant clusters. In Written Tibetan, a cousin language to Kurtöp, s- sonorant clusters are found in words in which Kurtöp has high tone. To these, we can add k- initial sonorant clusters as well, based on ongoing comparative work. For example, Hyslop (2014) reconstructs *kwa ‘tooth’ and *kram ‘otter’ to Proto East Bodish, which lead to wá and rám in Kurtöp, respectively. Comparative data showing the proposed consonantal sources for the high tone following sonorants in Kurtöp is shown in Table 5 below.
Table 5: Comparative data suggesting the source of high tone following sonorants in Kurtöp.a
Gloss | Kurtöp | Upper Mangdep | Bumthap | Khengkha | Written Tibetan | PTB |
---|---|---|---|---|---|---|
‘hair’ | rɐ́ | r̥ɐ́ ∼ rɐ́ | kra | kra | skra | |
‘tooth’ | wɐ́ | wɐ́ | kwa | kwa | swa | |
‘nose’ | nɐ́ | náphaŋ | nabli | sna | ||
‘tongue’ | lé | *s-l(y)a | ||||
‘otter’ | rɐ́ | kram | sram |
a It is important to bear in mind that data for many East Bodish languages (that is, the sister languages of Kurtöp) is still lacking, as evidenced by the blank cells in the table above. In order to help fill in the gaps, we have sought Written Tibetan (as a ‘cousin’ language) and Proto Tibeto-Burman forms as reconstructed in Matisoff (2003). While these other forms do not give strong evidence for an immediate consonantal conditioning trigger, they suggest that we would find one in a closely related language.
Hyslop (2009) speculated that the sonorant would assimilate to the voiceless s-, creating a voiceless sonorant, and that at some point the pitch conditioned by either the voiceless sonorant or the preceding consonant would have phonologised into tone (i.e. a transphonologisation of voicing in onset to tone on vowel). Low tone is then phonologised following the voiced sonorants. Minimal evidence has been found for this intermediate devoicing step in the Upper Mangdep (Phobjip dialect) form for ‘hair’, which is sometimes realised with a voiceless rhotic. This fits with what is known about voiceless sonorants being ‘pitch raisers’ (L-Thongkum 1997: 1082). We hope more comparative data will become available – especially for acoustic analysis – soon.[10], [11] Until then, the precise acoustic details that explain the transfer of a contrast in the consonantal onset domain to one of tone in the rime remain a matter of speculation. Nonetheless, the fact remains that tone has phonologised following the sonorant consonant initials first.
The palatal fricative /ç/ is argued to have recently phonologised tone on the following vowel in Kurtöp. Michailovsky and Mazaudon (1994) remark on the predictable tone register found on obstruents with the following caveat: ‘voicing is often absent in pronunciation, leaving only the low tone to insure the contrast. Thus, ʑ- is usually pronounced ᶫɕ-’ (547). The fact that Hyslop (2009, 2017) never encountered the voiced palatal fricative, only encountering the voiceless with low tone series, suggests that the palatal fricative was likely undergoing a process of tonogenesis when Michailovsky and Mazaudon conducted their fieldwork in the 1970s, and has since phonologised following voiceless palatals.
Hyslop (2009) offered acoustic evidence that tone is phonologising following the stops via an acoustic study involving two speakers. She showed that the difference in f0 at the vowel midpoint following voiced versus voiceless stops was statistically significant for both speakers. At the same time, she also showed that the VOT values of the phonologically voiced series were merging with the VOT values for the voiceless series. Taken together, Hyslop (2009) proposes that tone first phonologised following the voiceless and voiced category of stops and that the voiced stops are now merging with the voiceless stops. Finally, she proposes a follow-up study looking into the role of sonority in tonogenesis, noting that tone had reportedly entered Tibetan, Tshangla and Tai following the sonorants in a similar way to Kurtöp (Hyslop 2009: 843).
As mentioned in §1.1, previous studies in tonogenesis have found that breathy voice can be a crucial, mediating step between a contrast in voiced obstruents and low tone on following vowels. This was made explicit in Thurgood (2002) and many studies have since observed this (see for example Pittayaporn and Kirby 2017; Kirby and Hyslop 2019). However, in fifteen years of working with multiple speakers of the language, we have never once heard breathy voice uttered by a Kurtöp speaker, even those who are fluent in Dzongkha. This is in contrast to Dzongkha, the national language of Bhutan, with which the first author is also familiar. As such, no measurements have been made to quantify its absence.[12] In this way, Kurtöp is similar to Afrikaans (Coetzee et al. 2018), Khmu (Svantesson and House 2006) and Malagasy (Howe 2017), in showing that indeed some languages can move directly from voicing to tone without requiring mediating voice quality.
While Hyslop (2009) was able to convincingly argue that a contrast in VOT of stops was being partially replaced by a contrast in f0 on the following vowel, she was not able to investigate whether all stops behaved equally, or how the fricatives participated in the sound change.
2 Methods
This study has been designed to build from Hyslop (2009) and so follows it closely in design. More specifically, we have the following aims:
1) To determine if the observation that voicing distinctions are being lost in favour of a high and low tone contrast holds true in fricatives.
2) To determine if there is a correlation between level of sonority and development of tone contrast.
3) To determine if the observation that voiced stops are becoming voiceless in favour of high and low tone contrast is statistically distinct across all places of articulation.
Since Kurtöp has word-level stress, we controlled for word stress and tone variation by selecting only monosyllabic words. We attempted to balance the list for: voice type (voiceless, voiceless aspirated, and voiced), manner (stops, affricates, fricatives, nasals, laterals, rhotics, and glides), place (bilabial, dental, retroflex, palatal, velar, glottal), and vowel quality of following vowel (low, non-low front, and non-low back). Even though there were more stop tokens because of the three-way voicing contrast at five places of articulation, we were still able to balance the tokens between dental stops and dental fricatives, in order to compare the effects of sonority within the obstruents. For the other balancing factors (in many cases vowel quality), an attempt was made to produce as equal a number of tokens as possible.
2.1 Speakers
Two speakers were recorded in Thimphu, Bhutan, in 2009.[13] KT is a male in his forties. He is from Tabi and in addition to being a native speaker of Kurtöp, he also speaks Dzongkha, Hindi and some English. CH is a female in her early sixties. She is from Dungkhar, and in addition to being a native speaker of Kurtöp, she also speaks Dzongkha, Tshangla, Chocangaca, and Bumthap.
We recorded three additional speakers in Canberra, Australia in December of 2015. KL is a male speaker in his early thirties. He is from Tabi but left the village when he was six years old, though he returns annually. Since leaving, he has lived in Dzongkha-speaking Thimphu, with the exception of three years in Trashigang, two years in Paro, and two years in Canberra, Australia. In addition to being a native speaker of Kurtöp, he also speaks Dzongkha, Nepali, Tshangla, Hindi, and English. He has a master’s degree that he obtained in English instruction in Australia. DM is a female in her early thirties. She is from Shawa but left the village when she was five and has only returned three times since. She lived in Trashigang for six years, Samtse for four years, Tsirang for three years, Paro for five years, Wangdi Phodrang for seven years, and Canberra, Australia for two years. In addition to being a native speaker of Kurtöp, she also speaks Dzongkha, Nepali, Hindi, and English. SD is a male speaker in his mid-thirties. He is from Dungkhar but left when he was between fifteen and sixteen years old. He has lived in Mongar, Thimphu, Samdrupjongkhar, and Canberra, Australia. In addition to being a native speaker of Kurtöp, he also speaks Dzongkha, Nepali, Tshangla, and English. The speakers were found on a voluntary basis through personal networks.
2.2 Materials
A total of 3,540 monosyllabic consonant initial tokens were recorded and analysed acoustically. The elicitation was done using a wordlist of 216 words (see Appendix 1), elicited in English and an orthographic representation of Kurtöp. The wordlist was given to each speaker and reviewed for clarification and meaning prior to recording. The speakers then produced each target word three times and a fourth time in the carrier phrase shown in (1).
ŋai | ______________ | lap-male |
1.ERG | say-FUT | |
‘I will say | “______________” | ’ |
List intonation was found to be prevalent in the three words spoken in isolation prior to the carrier phrase. There was often a rising contour in the first token and a falling contour in the third token; however, similar to Hyslop (2009), since this was true regardless of the initial consonant’s manner, voice type, or place of articulation, and the analysis was based on level of the pitch and not contour, the inclusion of all tokens, despite list intonation, should affect all data in a predictable and consistent way and will therefore not skew any results.
All utterances were included in the analysis with the exception of a few omitted for mispronunciation. Occasionally the target word was repeated four or five times prior to the carrier phrase, which yielded additional tokens for the word within that speaker. Sometimes, a speaker would not be familiar with a target word, and it was therefore omitted for that speaker. Due to these unforeseen circumstances, the data were not completely balanced for all controls (manner, place, voice type, vowel quality) across all speakers. Notably, there was an imbalance found in the aspirated palatal stops (about 8 tokens for all speakers except speaker KT), and the aspirated retroflex stops (about 12 tokens for all speakers). The total number of tokens for each control is given per speaker in Table 6.
Table 6: Total number of tokens analysed in acoustic study.
Speaker | Manner | Voice type | Labial | Dental | Retroflex | Palatal | Velar | Totals |
---|---|---|---|---|---|---|---|---|
CH | Stops | Voiced | 82 | 40 | 43 | 24 | 47 | 236 |
CH | Stops | Voiceless | 77 | 44 | 42 | 29 | 60 | 252 |
CH | Stops | Aspirated | 57 | 40 | 12 | 8 | 40 | 157 |
CH | Fricatives | Voiced | | 37 | | | | 37 |
CH | Fricatives | Voiceless | | 38 | | | | 38 |
DM | Stops | Voiced | 83 | 40 | 40 | 24 | 49 | 236 |
DM | Stops | Voiceless | 72 | 40 | 36 | 28 | 60 | 236 |
DM | Stops | Aspirated | 64 | 40 | 11 | 8 | 40 | 163 |
DM | Fricatives | Voiced | | 36 | | | | 36 |
DM | Fricatives | Voiceless | | 36 | | | | 36 |
KT | Stops | Voiced | 79 | 36 | 40 | 40 | 52 | 247 |
KT | Stops | Voiceless | 60 | 40 | 40 | 20 | 64 | 224 |
KT | Stops | Aspirated | 51 | 40 | 12 | 40 | 44 | 187 |
KT | Fricatives | Voiced | | 36 | | | | 36 |
KT | Fricatives | Voiceless | | 36 | | | | 36 |
KL | Stops | Voiced | 80 | 36 | 40 | 24 | 48 | 228 |
KL | Stops | Voiceless | 72 | 40 | 40 | 28 | 56 | 236 |
KL | Stops | Aspirated | 60 | 40 | 12 | 8 | 40 | 160 |
KL | Fricatives | Voiced | | 36 | | | | 36 |
KL | Fricatives | Voiceless | | 36 | | | | 36 |
SD | Stops | Voiced | 80 | 40 | 36 | 28 | 48 | 232 |
SD | Stops | Voiceless | 76 | 40 | 31 | 28 | 56 | 231 |
SD | Stops | Aspirated | 52 | 40 | 12 | 8 | 40 | 152 |
SD | Fricatives | Voiced | | 36 | | | | 36 |
SD | Fricatives | Voiceless | | 36 | | | | 36 |
Totals | Stops | Voiced | 404 | 192 | 199 | 140 | 244 | 1,179 |
Totals | Stops | Voiceless | 357 | 204 | 189 | 133 | 296 | 1,179 |
Totals | Stops | Aspirated | 284 | 200 | 59 | 72 | 204 | 819 |
Totals | Fricatives | Voiced | | 181 | | | | 181 |
Totals | Fricatives | Voiceless | | 182 | | | | 182 |
Note. Organized according to manner, place of articulation and voice type for each speaker.
In total, 730 tokens were analysed for KT, 720 for CH, 696 for KL, 707 for DM, and 687 for SD.[14] The minor discrepancies in total tokens recorded per speaker should not have a tangible effect on the results of this study. The larger imbalances found in the retroflexes and palatals are discussed in relation to their results.
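The per-speaker totals can be cross-checked against the row totals reported in Table 6. The following is a minimal sketch of that arithmetic; the counts are copied from the Totals column of Table 6, while the names `TOKENS`, `speaker_totals` and `grand_total` are illustrative and not part of the study's materials.

```python
# Row totals copied from Table 6, per speaker, in the order:
# voiced stops, voiceless stops, aspirated stops,
# voiced fricatives, voiceless fricatives.
TOKENS = {
    "CH": [236, 252, 157, 37, 38],
    "DM": [236, 236, 163, 36, 36],
    "KT": [247, 224, 187, 36, 36],
    "KL": [228, 236, 160, 36, 36],
    "SD": [232, 231, 152, 36, 36],
}


def speaker_totals(tokens):
    """Sum the five category counts for each speaker."""
    return {speaker: sum(counts) for speaker, counts in tokens.items()}


def grand_total(tokens):
    """Sum all counts across all speakers."""
    return sum(sum(counts) for counts in tokens.values())


if __name__ == "__main__":
    for speaker, total in speaker_totals(TOKENS).items():
        print(speaker, total)
    print("all speakers", grand_total(TOKENS))
```

Summed this way, the table rows reproduce the per-speaker figures given in the text and the overall total of 3,540 tokens.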
The 2009 recordings were made on a Marantz PMD 660 recorder with a Shure head-mounted microphone placed approximately 3 cm from the speaker’s mouth. A Zoom H4N recorder was used for the recordings of the other three speakers (KL, SD, and DM), using the internal microphone. We opted against an external microphone with this recorder as it was found to actually reduce the quality of the recording.[15] Both sets of recordings were made in .wav format at 24-bit depth and a 96 kHz sampling rate. Acoustic analysis was completed using the Praat acoustic software (Boersma and Weenink 2014).
2.3 Acoustic measurements
VOT was measured by hand between the release and the first voicing cycle (Lisker and Abramson 1964). Occasionally frication, particularly at the palatal and velar places of articulation, was included in the VOT measurements of the stops. The effects of this are considered in the discussion. Voicing for fricatives was determined primarily by the presence of a voice bar in the spectrogram, glottal pulsations in the spectrogram and often in the wave form, and periodic wave forms (as opposed to the aperiodic voiceless wave form). Duration was measured for the fricatives since Stevens et al. (1992) showed that duration must exceed 60 ms for voicing (particularly relating to voicelessness) to be perceived.[16] Vowel duration was measured by hand according to guidelines in Wright and Nichols (2009).
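As an illustration of the VOT measure described above, the sketch below derives VOT from two hand-placed time stamps: the stop release and the first voicing cycle. It assumes the usual sign convention following Lisker and Abramson (1964), in which voicing that begins before the release yields a negative value. The function name and the example time stamps are hypothetical and are not taken from the study's data.

```python
def vot_ms(release_s: float, voicing_onset_s: float) -> float:
    """Return voice onset time in milliseconds.

    release_s: time of the stop release burst, in seconds.
    voicing_onset_s: time of the first voicing cycle, in seconds.
    Positive values mean voicing lag; negative values mean voicing lead.
    """
    return (voicing_onset_s - release_s) * 1000.0


if __name__ == "__main__":
    # Hypothetical time stamps for illustration only.
    print(round(vot_ms(0.100, 0.125), 1))  # voicing starts after the release
    print(round(vot_ms(0.200, 0.120), 1))  # voicing starts before the release
```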
The fundamental frequency (f0) was measured at nine equidistant points on the vowel using a Praat script.[17] Since the voicing and sonority of the prevocalic consonant can have a contour effect on the pitch levels (Hombert et al. 1979), the approximate mid-point (interval 4) was used in statistical analysis. Fundamental frequency at the onset (interval 1) and the end of the vowel (interval 9) were also statistically assessed to test if the contrast was maintained across the duration of the vowel.
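The sampling scheme can be sketched as follows. The measurements in the study were taken with a Praat script; the Python below is not that script. It assumes that the vowel is divided into nine equal intervals and that f0 is read at the centre of each interval, which is one possible reading of "nine equidistant points". The function names are hypothetical.

```python
def interval_centres(vowel_start_s, vowel_end_s, n_intervals=9):
    """Return the centre time (in seconds) of each of n equal intervals."""
    if vowel_end_s <= vowel_start_s:
        raise ValueError("vowel end must be later than vowel start")
    width = (vowel_end_s - vowel_start_s) / n_intervals
    return [vowel_start_s + (i + 0.5) * width for i in range(n_intervals)]


def pick_interval(values, interval):
    """Return the value for a 1-based interval number, e.g. 1, 4 or 9."""
    if not 1 <= interval <= len(values):
        raise IndexError("interval number out of range")
    return values[interval - 1]


if __name__ == "__main__":
    # Hypothetical vowel lasting from 0.250 s to 0.430 s.
    times = interval_centres(0.250, 0.430)
    for label, number in (("onset", 1), ("approx. mid", 4), ("end", 9)):
        print(label, round(pick_interval(times, number), 3))
```

Under this assumption, intervals 1, 4 and 9 correspond to the onset, approximate mid-point and end measurements used in the statistical analysis.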
3 Results for dental fricatives
Recall that Kurtöp synchronically contrasts voiceless and voiced dental fricatives at the phonological level, following a recent merger of the voiced and voiceless palatal fricatives in favour of a tonal contrast. This acoustic study also explored voicing of fricatives and following f0 as indicators of ongoing tonogenesis. A total of 182 phonologically voiceless dental fricatives and 181 phonologically voiced dental fricatives across all speakers (around 36 tokens for each speaker in each category) were acoustically analysed.
We were interested in determining whether the primary contrast is still made in the acoustic domain of voice or whether it has now transphonologised into the domain of f0 on the following vowel. Ideally, this question would be addressed through a multi-pronged approach that involved production as well as perception. However, we are limited in the current study through our query of production data alone. As a shorthand, we assume that statistical significance in one domain is indicative of the location of primary phonological contrast. That is, if we find voicing to be the statistically significant category then we assume that voicing is still the primary phonemic contrast. On the other hand, if we find that f0 is the statistically significant category, we assume that tonogenesis has completed for that particular word and that the primary phonemic contrast is now found in the domain of tone, rather than voicing. Using statistical significance as a proxy for phonemic contrast may be imperfect, but we argue that nonetheless this methodology will yield interesting observations about the pathway tonogenesis has taken in Kurtöp.
We found that: 1) low tone following the voiced fricatives and high tone following the voiceless fricatives are statistically distinct categories for all five speakers; 2) these tonal categories (high and low) are maintained across the entire duration of the vowel for three of the five speakers; and 3) the voicing distinction in fricatives is collapsing. These results suggest that the voicing contrast in dental fricatives is merging in favour of a contrast in tone on the following vowels, as has already been found for the palatal fricative. Given the small number of subjects (five), we discuss individual speaker differences as part of the presentation of results. However, bearing in mind that generalisations due to factors such as age and gender will be impossible with such small numbers, we will abstain from making such observations.
3.1 Fundamental frequency
Figures 2–6 represent the mean f0 of the vowel following the voiced and voiceless dental fricatives for each of the five speakers recorded.

Mean f0 from 75 tokens of vowels following dental fricatives for speaker CH.

Mean f0 from 72 tokens of vowels following dental fricatives for speaker DM.

Mean f0 from 72 tokens of vowels following dental fricatives for speaker KL.

Mean f0 from 72 tokens of vowels following dental fricatives for speaker KT.

Mean f0 from 72 tokens of vowels following dental fricatives for speaker SD.
Figures 2–6 show that the difference in mean f0 following the voiced and voiceless series is maintained across the duration of the vowel for all speakers, with the exception of speaker SD. Figure 6 shows that for SD the difference between mean f0 appears to taper towards the end of the vowel (from around 70 %). To query whether the mean f0 following the voiceless dental fricative and that of the voiced dental fricative are distinct categories representing tone (high and low respectively), a linear mixed model was run, with f0 at interval 4 as the outcome, speaker as the random effect, and voice type (voiced or voiceless) as the fixed effect. Figure 7 shows a boxplot of the mean f0 at interval four for voiced versus voiceless fricatives, by speaker. Note also that the variation in pitch appears much smaller following the voiced than the voiceless category; this would be expected if the primary contrast of the voiced category had moved into the domain of tone.

Clustered boxplot of mean f0 at interval four by speaker and voice type.
The estimated means are shown in Table 7; note that the f0 following phonologically voiceless fricatives is about 27 Hz higher than that of the phonologically voiced fricatives.
Mean f0 at interval four following voiceless versus voiced fricatives for all speakers.
Voice type | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
V | 183.825 | 44.950 | 95.584 | 272.066 |
VL | 210.468 | 44.930 | 122.267 | 298.670 |
Tables 8 and 9 show that this difference in interval 4 f0 is statistically distinct for the two categories.
Linear mixed model results, fixed coefficients.
Model term | Coefficient | Std. Error | t | Sig. | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|---|---|
Intercept | 210.468 | 44.9300 | 4.684 | <0.001 | 122.267 | 298.670 |
VoiceType = V | −26.644 | 1.84 | −14.480 | 0.000 | −30.256 | −23.031 |
VoiceType = VL | 0 (b) | - | - | - | - | - |

Note. Fixed coefficients; target: Pitch 4. Probability distribution: Normal. Link function: Identity. (b) This coefficient is set to zero because it is redundant.
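As a rough arithmetic check on Table 8, the t ratio and confidence interval for the VoiceType = V coefficient can be recovered from the coefficient and its standard error. The sketch below is an illustration under a stated assumption: it uses a normal approximation for the critical value, whereas the statistical software will have used a t distribution, so the bounds agree with the table only to within rounding. The function name `wald_interval` is ours.

```python
from statistics import NormalDist


def wald_interval(coefficient, std_error, level=0.95):
    """Approximate confidence interval: coefficient +/- z * standard error.

    Assumption: a normal critical value stands in for the t critical value,
    which is reasonable here because the residual degrees of freedom are large.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * std_error
    return coefficient - half_width, coefficient + half_width


# Values for VoiceType = V as reported in Table 8.
coefficient = -26.644
std_error = 1.84

t_ratio = coefficient / std_error
lower, upper = wald_interval(coefficient, std_error)
print(f"t = {t_ratio:.3f}, 95 % CI = [{lower:.3f}, {upper:.3f}]")
```

The interval excludes zero by a wide margin, which is the sense in which the two voice types are statistically distinct in f0 at interval 4.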
Linear mixed model results, fixed effects.
Source | F | df1 | df2 | Sig. |
---|---|---|---|---|
Corrected model | 209.667 | 1 | 758 | 0.000 |
VoiceType | 209.667 | 1 | 758 | 0.000 |

Note. Fixed effects; target: Pitch 4. Probability distribution: Normal. Link function: Identity.
3.2 Voicing
We measured closure duration of each fricative and also coded each for presence or absence of voicing. Stevens et al. (1992) reported that voicing in fricatives can only be perceived if the duration is more than 60 ms. In our study, Kurtöp voiceless fricatives had an average closure duration of 143 ms while the closure duration of voiced fricatives was 131 ms; thus there is ample duration to encode a voicing contrast. Voicing was determined by the presence of a voice bar in the spectrogram, glottal pulsation in the spectrogram and wave form, and aperiodic versus periodic wave forms. This was usually an easy and straightforward process, illustrated by the difference shown in Figures 8 and 9 below.

Voiced and voiceless realisations of CH’s pronunciation of zi ‘cat eye’. Note the salient voice bar associated with the fricative in Figure 8, which is completely absent in Figure 9. Figure 8 also shows a periodic wave form going into the vowel, while Figure 9 shows only aperiodic noise prior to the onset of the vowel.
For the vast majority of tokens, the decision to label a fricative as voiced or voiceless was straightforward. Very occasionally, the voicing would be less easy to determine, as in Figure 10. Here, the beginning of the wave form appears periodic but then is aperiodic for the majority of the closure duration. This voicing could be due to the fact that this token was the iteration from the carrier phrase; thus the voicing at the beginning could be the trailing off of the preceding vowel. Even more rarely, voicing started halfway through the fricative and before the start of the vowel, as in Figure 11. In these ambiguous cases, we labelled a sound as voiced if more than 50 % of the duration was voiced and as voiceless if less than 50 % of the duration was voiced. In reality, these ambiguous cases comprise a very small proportion of the data set; for example, of KT’s tokens, only two were somewhat ambiguous; these are shown in Figures 10 and 11.

KT voiceless iteration of zon ‘two’ and voiced iteration of ze ‘substance’.
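The labelling rule for ambiguous fricatives amounts to a simple threshold on the voiced proportion of the fricative. The sketch below is illustrative only: the function name and the example voiced portions are hypothetical, and treating a token that is exactly 50 % voiced as voiceless is our assumption, since the rule as stated does not cover that case.

```python
def label_fricative(voiced_ms, total_ms):
    """Label a fricative by the proportion of its duration that is voiced.

    Assumption: a token that is exactly 50 % voiced is labelled voiceless.
    """
    if total_ms <= 0:
        raise ValueError("total_ms must be positive")
    if not 0 <= voiced_ms <= total_ms:
        raise ValueError("voiced_ms must lie between 0 and total_ms")
    return "voiced" if voiced_ms / total_ms > 0.5 else "voiceless"


# Hypothetical voiced portions, paired with the mean durations reported above.
print(label_fricative(voiced_ms=80, total_ms=131))  # more than half voiced
print(label_fricative(voiced_ms=20, total_ms=143))  # less than half voiced
```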
By isolating the phonemically voiced dental fricatives, we can look at the percentage of tokens with the presence of voicing (voiced) versus without voicing (voiceless). Overall, the phonemically voiced dental fricatives have a larger portion (62 %) of voiceless tokens (lacking the presence of a voice bar or glottal pulsations) than voiced tokens (38 %). This suggests that the voice distinction in the voiced dental fricatives is merging with the voiceless dental fricatives at a faster rate than for the stops.
Looking at the individual speakers in terms of percentages of total tokens with the presence of voicing (voiced) and without voicing (voiceless) as in Figure 12, it becomes clear that two speakers (DM and KT) have nearly lost the voicing distinction in the dental fricatives.

Percentage of total phonetically voiceless and voiced tokens for each speaker for phonemically voiced dental fricatives.
Figure 12 shows that for DM around 95 % of voiced dental fricatives are realised as voiceless. For speaker KT this is closer to 80 %. Over 60 % of CH’s tokens are realised as voiceless, while speaker KL appears to be just the opposite. The only other speaker, besides KL, to have more tokens realised as voiced than tokens realised as voiceless is speaker SD, with a nearly 50/50 distribution of his phonemically voiced fricatives.
4 Results for place of articulation of stops
We saw that Kurtöp stops are phonologising tone on their following vowel (voiced > low tone; voiceless and aspirated > high tone) and the voicing contrast in the fricatives is in the process of merging, with the phonologically voiced series often being realised with a VOT typical of the phonologically voiceless series.
4.1 Stops as a whole
Prior to getting into the detailed results for each place of articulation, we will look briefly at the stops as a whole. The results in general corroborate the results found in Hyslop (2009), with minor differences. The low tone following voiced stops and the high tone following voiceless stops are maintained at a statistically significant difference across the entire duration of the vowel, and the voiced category of stops is merging with the voiceless category.
4.1.1 F0 on following vowels
We see that the f0 following the voiceless stop and the f0 following the aspirated stop are significantly distinct for all speakers at the near midpoint of the vowel (interval 4). When looking at the onset and end positions of the vowel, the difference in mean f0 following the voiceless and aspirated stops is not statistically significant for one speaker (KT) at the onset (interval 1), and for three speakers (CH, KT, SD) at the end of the vowel (interval 9). On the basis of their VOT means and standard deviations, all three voice types (voiced, voiceless, aspirated) were found to be statistically distinct categories for all five speakers. The standard deviation of the voiced series was much higher than that of the other two voicing categories, which supports Hyslop’s (2009) proposal that there is an ongoing merger between the voiced and voiceless series and thus there is a great deal of variation in realisation of VOT within the phonemically voiced category.
Figures 13–17 show the f0 on the vowel following voiced, voiceless, and voiceless aspirated stops for each speaker.

Mean f0 from 645 tokens of vowels following stops for speaker CH.

Mean f0 from 635 tokens of vowels following stops for speaker DM.

Mean f0 from 624 tokens of vowels following stops for speaker KL.

Mean f0 from 658 tokens of vowels following stops for speaker KT.

Mean f0 from 615 tokens of vowels following stops for speaker SD.
We see a difference in f0 on vowels with a lower f0 following the voiced stops and a higher f0 following the two voiceless series (voiceless and voiceless aspirated). For speakers KL and DM there appears to be a difference in high f0 between the aspirated and voiceless series; however, this difference is less obvious for the remaining three speakers (CH, KT, and SD).
At a glance, the f0 differences appear substantial for all speakers at all three points in the vowel. Note that for all three points (onset, mid point, end point), KT exhibits the lowest amount of difference between means. Interestingly, as noted visually from Figure 13, CH has the highest degree of difference at the onset of the vowel and the third lowest at the end of the vowel, which suggests that CH exhibits the highest degree of variation in mean f0 differences across the duration of the vowel. All of these differences will be explored through the following statistical analysis.
The mean f0 measured at the midpoint of the vowel (interval 4) increased following voiced, aspirated, and voiceless stops, in that order, for all five speakers. The means are given in Table 10, showing that mean f0 following voiced stops was consistently the lowest across all the speakers.
Summary of mean f0 following stops at midpoint of the vowel.
Speaker | Voice type | N | Mean | S.D. |
---|---|---|---|---|
CH | Voiced | 236 | 258.84 | 15.35 |
CH | Aspirated | 157 | 295.66 | 17.51 |
CH | Voiceless | 252 | 306.88 | 21.99 |
DM | Voiced | 236 | 217.72 | 9.05 |
DM | Aspirated | 163 | 251.25 | 27.89 |
DM | Voiceless | 236 | 268.29 | 22.77 |
KL | Voiced | 228 | 132.91 | 6.58 |
KL | Aspirated | 160 | 162.98 | 13.87 |
KL | Voiceless | 240 | 175.90 | 17.39 |
KT | Voiced | 247 | 117.84 | 8.88 |
KT | Aspirated | 187 | 140.77 | 14.09 |
KT | Voiceless | 224 | 145.75 | 14.36 |
SD | Voiced | 232 | 158.46 | 11.84 |
SD | Aspirated | 152 | 188.69 | 13.35 |
SD | Voiceless | 231 | 195.81 | 16.57 |

Note. Number of tokens, mean and standard deviation for f0 at the midpoint (interval 4) for all speakers within each voicing type.
Statistical analysis confirmed the observation above via a linear mixed model; results are shown below in Table 11, examining f0 (interval four). The model is fitted with an interaction term between voice type and place of articulation, keeping subject as a random effect.
Linear mixed model results.
Source | F | df1 | df2 | Sig. |
---|---|---|---|---|
Corrected model | 353.096 | 14 | 3,165 | 0.000 |
VoiceType * PoA | 17.314 | 8 | 3,165 | 0.000 |
VoiceType | 1991.088 | 2 | 3,165 | 0.000 |
PoA | 28.002 | 4 | 3,165 | 0.000 |

Note. Fixed effects; target: Pitch 4. Probability distribution: Normal. Link function: Identity.
We can confirm that the mean f0 following stops exhibits statistically significant categories that are connected to the voice type of the preceding stop. For further tests at different places across the vowel, the reader is referred to Plane (2016).
Figure 18 is a visual representation of the mean f0 in interval four as a function of voice type; note again that we see more variation in f0 following voiceless and aspirated stops than following the voiced category. We will return to this issue later.

Mean f0 in interval four as a feature of phonological voicing type.
4.1.2 VOT
Mean and standard deviation of VOT for voiced, voiceless, and voiceless aspirated tokens were also calculated for all speakers and the results are given in Table 12.
Looking at Table 12, mean scores of all three categories appear quite different for all speakers. For every speaker, the mean increases according to voice type, with voiced being the lowest and aspirated being the highest. Notice that DM’s mean score for voiced series (−103.60) is well below any of the other speakers, yet DM’s voiced series range (−257 ms to 83 ms) is similar to CH’s and SD’s voiced series range [CH (−250 ms to 96 ms) and SD (−250 ms to 110 ms)].
VOT summary for stops for all speakers.
Speaker | Voice type | N | Mean | S.D. | Range |
---|---|---|---|---|---|
CH | Voiced | 236 | −48.01 | 58.15 | −250 to 96 |
CH | Voiceless | 252 | 16.15 | 10.89 | 1–65 |
CH | Aspirated | 157 | 40.96 | 18.05 | 6–91 |
DM | Voiced | 236 | −103.60 | 84.0 | −257 to 83 |
DM | Voiceless | 236 | 33.0 | 25.21 | 3–112 |
DM | Aspirated | 163 | 68.11 | 31.51 | 4–156 |
KL | Voiced | 228 | −17.58 | 54.83 | −154 to 55 |
KL | Voiceless | 240 | 21.55 | 14.06 | 6–77 |
KL | Aspirated | 160 | 57.26 | 18.87 | 24–102 |
KT | Voiced | 247 | −21.64 | 62.86 | −168 to 84 |
KT | Voiceless | 224 | 31.0 | 15.06 | 3–80 |
KT | Aspirated | 187 | 80.0 | 23.76 | 30–154 |
SD | Voiced | 232 | −40.83 | 56.42 | −250 to 110 |
SD | Voiceless | 231 | 21.71 | 16.38 | 1–68 |
SD | Aspirated | 152 | 55.85 | 21.85 | 18–150 |
Total | Voiced | 1,179 | −46.3 | 71.14 | −257 to 110 |
Total | Voiceless | 1,190 | 24.51 | 18.09 | 1–112 |
Total | Aspirated | 819 | 61.26 | 26.88 | 4–156 |

Note. Number of tokens, mean, standard deviation and range values (in ms) are shown for each speaker and each voice type.
Statistical analysis confirmed that all three voice types are statistically significant categories for syllable-initial stops. Homogeneity of variance was violated, as assessed by Levene’s Test of Homogeneity of Variance (p < 0.001) for all speakers. Welch’s ANOVA confirmed the statistical significance for all speakers: for speaker CH [F(2,313.235) = 280.884, p < 0.001], for speaker DM [F(2,361.272) = 411.688, p < 0.001], for speaker KL [F(2,338.836) = 294.567, p < 0.001], for speaker KT [F(2,373.547) = 424.548, p < 0.001], and for speaker SD [F(2,335.578) = 311.489, p < 0.001]. A Games-Howell post hoc test confirmed (p < 0.001) the statistical significance for all speakers for each of the three paired comparisons of the three voicing types.
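Welch’s ANOVA can be computed from group sizes, means and standard deviations alone, so the summary values in Table 12 are enough to approximate the reported statistics. The sketch below implements the standard Welch formula for speaker CH; it is an illustration rather than the procedure actually run in the statistical software, and the function name `welch_anova` is ours. Because the table values are rounded, the output should match the reported F and denominator degrees of freedom only approximately.

```python
def welch_anova(groups):
    """Welch's one-way ANOVA from summary statistics.

    groups: list of (n, mean, sd) tuples, one per category.
    Returns (F, df1, df2).
    """
    k = len(groups)
    # Each group is weighted by n / variance, so noisy groups count for less.
    weights = [n / sd ** 2 for n, _, sd in groups]
    total_weight = sum(weights)
    grand_mean = sum(w * m for w, (_, m, _) in zip(weights, groups)) / total_weight
    between = sum(
        w * (m - grand_mean) ** 2 for w, (_, m, _) in zip(weights, groups)
    ) / (k - 1)
    lam = sum(
        (1 - w / total_weight) ** 2 / (n - 1)
        for w, (n, _, _) in zip(weights, groups)
    )
    f_stat = between / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)
    return f_stat, k - 1, df2


# Speaker CH, VOT in ms, from Table 12: (N, mean, S.D.) per voice type.
ch_vot = [
    (236, -48.01, 58.15),  # voiced
    (252, 16.15, 10.89),   # voiceless
    (157, 40.96, 18.05),   # aspirated
]
f_stat, df1, df2 = welch_anova(ch_vot)
print(f"F({df1}, {df2:.3f}) = {f_stat:.3f}")
```

For CH this comes out close to the reported F(2,313.235) = 280.884; the same call with the rows for the other speakers gives the corresponding approximations.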
Turning now only to the phonologically voiced category, we can see the individual speaker differences in Figure 19.

Phonologically voiced stops with positive and negative VOTs for each speaker.
Figure 19 shows that speaker CH has a 60 % negative, 40 % positive distribution of stops, identical to the overall results. Both DM and SD have a larger percentage of negative tokens (80 % and 70 % respectively). Speaker KT is nearly split 50/50, while speaker KL shows the reverse of CH and of the overall results, with 40 % negative and 60 % positive tokens.
The scatter plot in Figure 20 represents visually distinct categories between the low f0 following voiced stops and the higher f0 following the voiceless and aspirated stops, with their correlation to VOT.

Relationship between VOT and f0 following stops for each speaker.
The blue series represents the voiced stops. Note that they occupy the left side of the graph (the negative VOT), while still blending with the voiceless stops (green series) on the positive VOT portion of the graph. Also note that while the series merges on the positive side of the VOT, the voiced series is lower in f0 than the voiceless series. The voiceless series appears to have a larger f0 range, while the voiced stops seem to have a narrow f0 range with a larger VOT range.
4.2 Place of articulation differences
With the merger between voiceless and voiced stops in favour of tone clearly underway, we can now turn to the differences for each place of articulation. To determine if there could be a trading relationship between the devoicing of a phonologically voiced stop and pitch on the following vowel, we plotted f0 in interval four (on the y-axis) as a function of VOT (on the x-axis) for each voicing category for each speaker in Figures 21–25.

f0 and VOT for the stops varying in voicing category and place of articulation for each speaker.
There are several things to note in the above figures. First, if we look at the distribution of the voiceless (green) and aspirated (pink) VOT values, we see the expected distribution on the right side of the X-axis, showing positive VOTs. Further, despite some overlap, the two categories have a fairly uniform f0 distribution. We see more speaker variation within the phonologically voiced series. Blue dots to the left of the 0 indicate pre-voicing and those blue dots produced to the right of the 0 are produced as voiceless acoustically.
We can make a further observation regarding the phonologically voiced category. When these tokens are realised with pre-voicing, we see a large degree of variation in the realisation of f0 on the midpoint of the following vowel. For example, f0 on the following vowels of pre-voiced initials for CH (upper left) ranges from below 250 Hz to approximately 350 Hz and f0 on the following vowels of pre-voiced initials for speaker KT (middle left) ranges from just below 100 Hz to approximately 175 Hz. On the other hand, when these phonologically voiced tokens are realised as voiceless, the f0 variation on the following vowel disappears. This is true for all speakers, but perhaps most pronounced for KL (middle right). If we look in particular at KL’s retroflex tokens that are phonetically voiceless, we see that the f0 space for each different voicing type (voiceless, aspirated, voiced) is distinct. KL’s phonetically voiceless but phonologically voiced retroflexes have a coherent f0 of just below 150 Hz on their following vowel, whereas the f0 on the vowels following the phonologically voiceless retroflexes is closer to 200 Hz. This is true for all speakers and the very strong evidence of f0 correlations discussed above can be taken to be directly related to the devoicing of the phonologically voiced series. Note, importantly, that we sometimes get low f0 without concomitant devoicing. That is, we can have phonetically voiced segments with low ensuing pitch, but we always have low pitch following a phonetically devoiced segment. This suggests that pitch phonologises before the voicing is lost.
Figures 21–25 also illustrate the drastic differences between the retroflexes and labials. If we take KL’s (middle right) retroflexes, for example, we see that the voiceless and aspirated categories have fairly uniform VOT realisations together with distinct f0 realisations on the following vowel. There are only a few phonetically voiced retroflex tokens; all of these have low f0 realised on their following vowels.
On the other hand, we can look at labials for speakers such as CH, KT, and SD, and notice the preponderance of acoustically voiced realisations and with f0 realisations on following vowels that appear to span the extent of the f0 space for each speaker. There are some voiceless realisations, all with low f0 on the following vowel, but these are a minority of the tokens. The dentals are similarly more frequently voiced and have a range of f0 realisations on following vowel. We can also see in these figures that the palatals and velars appear to be an intermediate category between the retroflexes and labials/dentals. Interestingly, we also see what appears to be a near bimodal distribution in the f0 realisation following the phonemically voiced category of velars for KT and, slightly so, for DM.
Given the expectation that phonologically voiced stops would be devoicing in favour of a tonal contrast, we now turn to the voiced results alone. Figure 26 provides the percentages of tokens that have positive VOT and those that have negative VOT in the voiced series, organised by speaker and place of articulation. Figure 26 also shows several noteworthy developments. First, CH and DM do not have any places of articulation with a higher percentage of positive VOT than negative VOT. Second, KL has more positive than negative VOT tokens in all places of articulation except labial. Third, going from front to back, the number of speakers with more positive than negative VOT tokens is as follows: labial has zero of five speakers, dental has one of five speakers (with KT approaching 50/50), retroflex has three of five speakers (with speaker CH approaching 50/50), palatal has two of five speakers (with SD approaching 50/50), and velar has two of five speakers.

Percentage of total positive and negative VOT tokens for each speaker at each place of articulation.
4.2.1 Labial
The histogram in Figure 27 shows the VOT of the voiced labial stops for each speaker.

Distribution of VOT for voiced labial stops for each speaker.
There is a bimodal distribution in stops for most of the speakers (speakers DM and SD both have some positively realised labial stops but have less pronounced modes compared to the other speakers). The above data show that the voiced labial stops have both negative and positive VOT for all speakers. In order to test whether or not there is a trade-off between phonetic voicing and pitch on the following vowel, we ran a linear mixed model, keeping speaker as a random effect. The estimates are shown in Table 13. The difference between these two categories is not statistically significant (p = 0.336).
Estimates for interval 4 f0 as a function of phonetic voicing; 0 = phonetically voiced; 1 = phonetically voiceless.
VOT binary | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
0 | 175.712 | 43.408 | 90.377 | 261.047 |
1 | 174.499 | 43.417 | 89.146 | 259.851 |
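The binary coding used in these per-place comparisons can be sketched as follows. This is an illustration of the coding step only, not of the mixed model: it ignores the random effect of speaker, the token values are hypothetical rather than drawn from the Kurtöp data, and coding a VOT of exactly 0 ms as voiceless is our assumption.

```python
from statistics import mean


def vot_binary(vot_ms):
    """Code a token as 0 (phonetically voiced, negative VOT) or 1 (voiceless).

    Assumption: a VOT of exactly 0 ms is coded as voiceless.
    """
    return 0 if vot_ms < 0 else 1


def mean_f0_by_voicing(tokens):
    """Group (vot_ms, f0_hz) pairs by VOT binary code and average f0."""
    groups = {0: [], 1: []}
    for vot_ms, f0_hz in tokens:
        groups[vot_binary(vot_ms)].append(f0_hz)
    return {code: mean(values) for code, values in groups.items() if values}


# Hypothetical phonologically voiced tokens: (VOT in ms, f0 at interval 4 in Hz).
tokens = [(-80, 176.0), (-45, 174.0), (12, 175.0), (20, 173.0)]
print(mean_f0_by_voicing(tokens))
```

A small difference between the two group means, as in this toy example, corresponds to the situation in which phonetic devoicing is not accompanied by a reliable f0 difference.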
4.2.2 Dental
Figure 28 provides visual representation of the VOT distribution of the voiced dentals.

Distribution of VOT for voiced dental stops for each speaker.
The means for voiced dental stops are negative for all speakers; however, a few speakers have smaller negative means than those found in the labial stops. Both KT’s voiced dental mean (−37.75) and KL’s (−8.08) are smaller than the smallest voiced mean in the labial series (where the smallest was speaker KT at −44.51). The range column shows that there is overlap between the voiced and voiceless series (as well as the voiceless and aspirated).
It appears that fewer speakers have a clear bimodal distribution than seen in the labials. DM and SD both have few positive tokens. KT appears to have a similar bimodal distribution as in the labials, but perhaps slightly more skewed towards the positive end in the dentals. CH has a less pronounced mode in the positive end than was represented by the labials. One speaker (KL) has a much more pronounced mode in the positives than in the negatives.
In order to ascertain whether or not there is a trade-off between phonetic voicing and pitch on the following vowel, we ran a linear mixed model, keeping speaker as a random effect. The estimates are shown in Table 14. The difference between these two categories is not statistically significant (p = 0.206).
Estimates for interval 4 f0 as a function of phonetic voicing; 0 = phonetically voiced; 1 = phonetically voiceless.
VOT binary | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
0 | 178.945 | 41.972 | 96.153 | 261.736 |
1 | 176.542 | 41.988 | 93.720 | 259.364 |
4.2.3 Retroflex
Isolating the voiced series and looking at the distribution in Figure 29, it appears that for speakers KL, KT, and SD, the voiced retroflexes have almost completely merged with the voiceless series. This is not surprising, as it was already discussed that there was not a statistical difference between these two series for two of these speakers (KL and SD).

Distribution of VOT for voiced retroflex stops for each speaker.
In order to ascertain whether or not there is a trade off between phonetic voicing and pitch on the following vowel, we ran a linear mixed model, keeping speaker as a random effect. The estimates are shown in Table 15. This time the difference is statistically significant (Table 16), suggesting that tone on the following vowel, rather than voice, is the primary contrast amongst retroflex stops.
Estimates for interval 4 f0 as a function of phonetic voicing; 0 = phonetically voiced; 1 = phonetically voiceless.
VOT binary | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
0 | 181.727 | 38.946 | 104.921 | 258.532 |
1 | 176.395 | 38.937 | 99.607 | 253.182 |
Statistical significance of f0 at interval four, as a function of phonetic voicing.
Source | F | df1 | df2 | Sig. |
---|---|---|---|---|
Corrected model | 12.450 | 1 | 197 | <0.001 |
VOTbinary | 12.450 | 1 | 197 | <0.001 |

Note. Fixed effects; target: Pitch 4. Probability distribution: Normal. Link function: Identity.
4.2.4 Palatal
Figure 30 offers an isolated look at the voiced palatal series for all speakers.

Distribution of VOT for voiced palatal stops for each speaker.
Again there is a positive mean in the voiced series. For speakers DM and SD, the voiceless and aspirated series were not distinct categories, though this could be a result of the small sample size of the aspirated series (8 tokens for all speakers except KT, who had 40 tokens). Focussing on the voiced series, KL has the smallest standard deviation (52.82) and DM has the largest (108.75) of any place of articulation thus far. DM also has the largest negative mean of the voiceless series.
Looking at Figure 30, all speakers have some tokens that are voiced with positive VOT. KT and SD have distinct bimodal distributions. CH and DM have less pronounced modes in the positive range, but still appear to have a bimodal distribution. KL appears to favour the positive VOT realisation and the few tokens that are realised with negative VOT appear to be the exception. This suggests that KL has nearly merged the voicing distinction between voiced and voiceless palatal stops.
In order to ascertain whether or not there is a trade-off between phonetic voicing and f0 on the following vowel, we ran a linear mixed model, keeping speaker as a random effect. The estimates are shown in Table 17. The difference is statistically significant, as shown in Table 18.
Estimates for interval 4 f0 as a function of phonetic voicing; 0 = phonetically voiced; 1 = phonetically voiceless.
VOT binary | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
0 | 181.783 | 38.913 | 104.840 | 258.726 |
1 | 175.730 | 38.917 | 98.779 | 252.681 |
Statistical significance of f0 at interval four, as a function of phonetic voicing.
Model term | Coefficient | Std. Error | t | Sig. | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|---|---|
Intercept | 175.730 | 38.9170 | 4.516 | <0.001 | 98.779 | 252.681 |
VOTbinary = 0 | 6.053 | 2.2788 | 2.656 | 0.009 | 1.548 | 10.559 |
VOTbinary = 1 | 0 (b) | - | - | - | - | - |

Note. Fixed coefficients; target: Pitch 4. Probability distribution: Normal. Link function: Identity. (b) This coefficient is set to zero because it is redundant.
4.2.5 Velar
Figure 31 presents the distribution of the voiced series VOT.

Distribution of VOT for voiced velar stops for each speaker.
The distribution of the voiced velar series in Figure 31 shows a considerable trend towards positive VOT realisation of these stops. KT and KL once again have merged more of their tokens with the voiceless series, while SD, DM and CH appear to have more tokens with voiced (negative) VOT but also show a sizeable share of tokens merged with the voiceless series.
In order to ascertain whether or not there is a trade-off between phonetic voicing and pitch on the following vowel, we ran a linear mixed model, keeping speaker as a random effect. The estimates are shown in Table 19. The difference is not statistically significant (p = 0.198).
Estimates for interval 4 f0 as a function of phonetic voicing; 0 = phonetically voiced; 1 = phonetically voiceless.
VOT binary | Mean | Std. Error | 95 % CI lower | 95 % CI upper |
---|---|---|---|---|
0 | 176.257 | 38.849 | 99.731 | 252.783 |
1 | 178.633 | 38.853 | 102.099 | 255.167 |
5 Discussion
5.1 Voiced obstruents
Given that we argue the voiced obstruents are currently undergoing tonogenesis and merging with the voiceless series, in favour of a tonal contrast on following vowels, it is worth considering the voiced results in detail; the issue of cue-trading in phonologisation becomes especially relevant at this stage in tonogenesis. We have shown that both voicing and f0 are utilised in making a contrast amongst Kurtöp obstruents and that within the phonologically voiced category there is particularly high variation in how the ‘voicing’ is realised. We further argue that the devoicing of the obstruents is motivated by the transphonologisation of f0 as tone on the following vowel.
In order to more confidently assert this, we have plotted f0 at the approximate mid point (interval 4) as a function of presence or absence of voicing amongst the phonologically voiced stops. Due to the high amount of variation and relatively small numbers of tokens for some categories for some speakers, we examined f0 (interval four) as a function of phonetic voicing for each speaker individually, rather than aggregating across all speakers. These results are shown in Figure 32. Unfortunately, the small number of tokens and high variability amongst DM in particular means a comparison of fricatives versus stops as a whole is not possible. Nonetheless, we can see for at least speakers SD and KL, there is less variation in pitch with the phonetic voiceless realisations.

f0 as a function of voicing across phonologically voiced fricatives for each speaker. Error bars represent confidence interval for mean; 0 = phonetically voiced; 1 = phonetically voiceless.
We also plotted f0 at interval four as a function of phonetic voicing amongst stops, looking at each place of articulation separately. These results are summarised for all speakers in Figure 33.

f0 as a function of voicing across phonologically voiced stops for each place of articulation across all speakers; 0 = phonetically voiced; 1 = phonetically voiceless.
Examining approximate mid point f0 on vowels following phonetically voiced versus phonetically voiceless stops for each place of articulation confirms the observation made above; tonogenesis appears to be happening for some places of articulation at a faster rate than for others. Specifically, we see that f0 is lower following phonetically voiceless retroflexes, palatals, velars and dentals, while pitch following labials appears relatively equal, regardless of whether or not there is phonetic voicing.
These figures further illustrate the cue-trading relationship between VOT and f0 as part of the tonogenesis process. Schertz and Clare (2020: 5) argue that when change is in progress, some speakers will rely more on the ‘innovative’ cue while others rely more on the traditional cue. In our case, the ‘traditional’ cue is VOT and the innovative one is f0. We can see that speakers do indeed do different things with these cues, but also, crucially important here, that the cues are utilised differently at different places of articulation.
5.2 Obstruents as a whole
These results have shown us several things. First, f0 is statistically higher following phonologically voiceless obstruents than when following phonologically voiced obstruents, fitting in with the received model of tonogenesis in which voiceless onsets condition high tone on following vowels and voiced onsets condition low tone on following vowels. Second, we have seen that phonologically voiced obstruents are very often realised as phonetically voiceless, again following the expected tonogenetic developments.
Considering these results through the lens of transphonologisation, we see that while f0 and voicing (either as VOT or as the presence of voicing during fricative closure) are both still present as part of the acoustic bundle of features marking a phonemic contrast in Kurtöp, there appears to be a shift in importance. The f0 difference is statistically significant on the following vowels overall, suggesting that this is a primary feature of the contrast in production. However, the voicing results showed high variability, suggesting that for some places and manners of articulation voicing is a more prominent feature of the acoustic bundle than for others. In places where voicing is less important, we infer that tone has become primary and thus devoicing can now happen. The transphonologisation from a contrast in voice on obstruents to a contrast in tone on following vowels is still underway for all obstruents but palatal fricatives. However, the different weights attributed to the different cues for the different places and manners strongly suggest that transphonologisation will be completed for some places and manners before others.
Stepping back to look at the bigger picture, we can fit the fricatives into the larger tonal pathway of the language. As has been established, tone first entered Kurtöp following the complex onsets in which the second member was a nasal consonant. With tone contrastive following the nasals, predictable pitch must have developed following obstruents, in particular the voiceless and voiced palatal fricatives.
The results presented in Sections 3 and 4.1 suggest that the fricatives are further along in the process of transphonologisation than the stops as a whole. Specifically, Section 3 showed that there is a statistically significant distinction between mean f0 following the voiced dental fricatives (low) and mean f0 following the voiceless dental fricatives (high). These results suggest that high and low tones on the vowel are distinct categories following the voiceless and phonologically voiced dental fricatives, respectively. The results were statistically significant for three speakers (DM, KL, KT) at all three tested points along the vowel (onset, midpoint, and endpoint), and were statistically significant for two of the speakers (SD and CH) at the onset and midpoint, but not at the endpoint. In particular, the visual representation of SD’s mean f0 was atypical compared to the other speakers. It is unclear why SD’s results are so incongruous with those of the other speakers.
The evidence presented here supports the prediction by Hyslop (2009) that all obstruents (not just stops) are merging the voicing distinction in favour of a tonal contrast. Further, the fricatives are merging more quickly than the stops. This is summarised in Figures 34 and 35, showing that phonologically voiced stops are devoiced approximately 40 percent of the time while phonologically voiced fricatives are devoiced over 60 percent of the time.

Percentage of phonetically voiceless versus voiced realisations of phonologically voiced stops (left) as opposed to fricatives (right).
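The percentages summarised above are proportions of phonologically voiced tokens that were realised as phonetically voiceless, grouped by manner. A minimal sketch of that tally follows. The token list is invented purely for illustration and is not the study's data, and the function name is an assumption introduced here.

```python
from collections import Counter

# Toy tokens, invented for illustration (NOT the study's data). Each entry is
# (manner of the phonologically voiced onset, realised as phonetically voiceless?).
tokens = (
    [("stop", True)] * 4 + [("stop", False)] * 6
    + [("fricative", True)] * 7 + [("fricative", False)] * 3
)


def devoicing_percentages(tokens):
    """Percentage of tokens per manner that were realised as phonetically voiceless."""
    totals = Counter(manner for manner, _ in tokens)
    devoiced = Counter(manner for manner, voiceless in tokens if voiceless)
    return {manner: 100 * devoiced[manner] / totals[manner] for manner in totals}


percentages = devoicing_percentages(tokens)
for manner, pct in percentages.items():
    print(f"{manner}: {pct:.0f} % phonetically voiceless")
```

The same tally could be grouped by place of articulation instead of manner by changing the first element of each token.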
Kurtöp’s five places of articulation provide an ideal location to query the role of place of articulation in the propagation of tone through a language. Overall, we saw that speakers favoured devoicing of the retroflexes over the other four places of articulation. We are unsure precisely why this is the case; however, we can put forward some hypotheses. As Hyslop (2014, 2017) outlines, Kurtöp retroflexes have recently entered the language through the simplification of velar plus rhotic onset clusters.[18] In fact, for some speakers they might be better described as doubly-articulated segments, often with a strong, separate rhotic realisation. A summary of words showing the Tibetan source of retroflexes in Kurtöp is shown in Table 20.[19]
Correspondences of Kurtöp retroflexes with Written Tibetan complex onsets involving rhotics.
Written Tibetan | Central Tibetan pronunciation | Kurtöp | Gloss |
---|---|---|---|
<sgro> | ɖò | ɖò | ‘feather’ |
<sgra> | ɖɐ̀ | ɖɐ̀ | ‘pronunciation’ |
<sgru> | ɖù | ɖù | ‘boat’ |
<grub> | ʈʰùp | ɖùp | ‘house completion’ |
<drel> | ʈʰè | ɖè: | ‘mule’ |
<drilbu> | ʈʰìbu | ɖìbu | ‘bell’ |
<dkrug> | ʈú: | ʈúk ∼ ʈú: | ‘stir’ |
<khri> | ʈʰí | ʈʰí | ‘throne’ |
<khrom> | ʈʰóm | ʈʰóm | ‘market’ |
<krung-krung> | ʈúŋ-túŋ | ʈúŋ-túŋ | ‘crane’ |
<hbrug> | ɖù: | ɖù: | ‘dragon’ |
Given that tone first entered Kurtöp following complex onsets and that complex onset simplification also seems to be happening as a slow diachronic process in Kurtöp (Hyslop 2017), it seems possible that the retroflexes are still phonologically – in some senses – complex onsets (note also that they do not occur in coda position). Hence, if we re-class retroflexes as complex onsets with sonorant second members, we might expect them to behave more like the complex onsets with nasal second members and thus be one of the first places to condition the phonologisation of tone. Additional hypotheses to explain the preference to devoice retroflexes include intrinsic closure duration differences[20] and social factors.[21]
Following the retroflexes, we found that the velars and then the palatals were the next most likely to devoice. There is, perhaps, a physiological explanation for this. Ohala and Riordan (1979) and Ohala (2012) point out that it is harder to maintain voicing in velars than it is in bilabials, a fact which is mirrored in phenomena such as the ‘velar gap’, a tendency for a voiced consonant, when missing from an otherwise balanced stop inventory, to be a voiced velar. Conversely, we find that dentals and especially bilabials are much less likely to devoice than the other places of articulation. More research is needed to explain why back places of articulation would be more likely to devoice than front places, perhaps through the physiological motivations mentioned above.
5.3 Typological considerations
The Kurtöp findings can be compared and contrasted with other languages. Tone is reported to enter Tai (Pittayaporn 2009) and Tibetan (Mazaudon 1977) following the most sonorous segments first. Rhotics alone have conditioned tonogenesis in the Phnom Penh dialect of Khmer (Wayland and Guion 2005). Hyslop (2022) presents a typology of tonogenesis, showing that such importance of sonorants in the development of tone is in fact typologically more common than previously expected.
Afrikaans (Coetzee et al. 2018) also has incipient tonogenesis. Due to the collapse of a VOT contrast in initial position, f0 has increasingly been observed to carry more weight perceptually and is also now being produced by speakers. Note that Coetzee et al. (2018) also mention that the labial place of articulation, in particular, relies more on voicing in perception for older speakers. Shryock’s (1995) acoustic study of the Chadic language Musey finds that overall, stops are more often produced as voiceless than fricatives, and non-laryngeal fricatives are more often voiced than [h] (17). Fricatives are also reported to play a privileged role in tonogenesis described for Athabaskan (Kingston 2007), Balsas Nahuatl (Guion et al. 2009) and Kickapoo (Gathercole 1983). Thus there is some typological evidence to support the notion that fricatives may be more ‘prone’ to tonogenesis than stops, and that amongst the stops, labials may be the last to devoice.
6 Conclusions
This article has presented the results of a production study, targeting voicing and f0 as evidence of ongoing tonogenesis in the Tibeto-Burman language Kurtöp. Production study results from five speakers replicate what was reported in Hyslop (2009); namely, we see that f0 is significantly distinct on vowels following voiced versus voiceless obstruents (high following voiceless; low following voiced). We further showed that the phonologically voiced category of obstruents is devoicing, with several tokens being realised with a VOT associated with the voiceless category. We have thus strengthened Hyslop’s (2009) claim that Kurtöp is undergoing tonogenesis.
This article has further compared fricatives with stops and examined stops at five different places of articulation (labial, dental, retroflex, palatal, velar) independently. We see that the fricatives are devoicing more frequently than the stops, which we take as evidence that tonogenesis has progressed further amongst the fricatives than the stops. This follows from Hyslop’s (2009) claim that tonogenesis has targeted the sonorants, then the fricatives, and is now spreading through the stops. In a study targeting perception of pitch versus voicing in Kurtöp, Peralta (2017) found that speakers were more attuned to f0 in the context of fricatives than when following stops, further supporting our claim here that tonogenesis has spread further following the fricatives.
Within the stops, we see that the retroflexes are devoicing more frequently than any other place of articulation, followed by the velars, the palatals, and then the dentals and labials. The preference for devoicing of velars over voicing of stops follows from what we know about the intrinsic mechanics of voicing at these different places of articulation (e.g. Ohala and Riordan 1979). Peralta (2017) also targeted place of articulation in his perception study, showing that the different places behaved very differently in terms of listener attention to voicing versus f0. Similarly, speakers were more likely to be attuned to f0 following dorsal consonants than at the other places of articulation.
While we argue these results show tonogenesis happening through different segments in a particular order in the language, it is worth remembering that we can only be confident of change after it has completed. While there is synchronically contrastive tone in Kurtöp following sonorant and palatal fricative initials, there is still variation in the realisation of the voicing contrast elsewhere in the phonology. The larger percentage of phonetically voiceless phonologically voiced fricatives suggests that they will phonologise tone first. The statistical significance of the f0 following phonetically voiceless retroflex stops suggests they are the first of the stops to phonologise tone. However, these claims of tonogenesis can be confirmed only once the changes are shown to be completed. Merger reversals can also happen; or sometimes variation can remain as the stable, synchronic realisation of a particular contrast for an undetermined amount of time.
These findings – to the extent to which they predict change in progress – may be explained by functional-typological pressures. For example, we know that typologically, voiced sonorants are common while voiceless sonorants are unusual; and conversely, voiceless obstruents are more common than voiced obstruents. There may be functional pressures that explain this distribution (such as the effort required to produce voiced versus voiceless obstruents, or difficulties in perceiving voiceless sonorants). Of course, any intrinsic differences along these lines would not explain why tone would phonologise in the first place, but they could help explain the mergers in consonants that may subsequently happen. That is, if tone is equally acquired following all consonants, we can predict that the more typologically marked consonants produced on the way will be the ones to merge first, thereby being the first to condition tone on their neighbouring vowels.
We can propose a refinement to the model of tonogenesis, based on these findings. Rather than stipulating that obstruents, as a class, will condition high versus low tone on their following vowels and merge to voiceless, we can state more specifically how and when this will happen. The manner in which tone enters a language does not seem random, and not all aspects of the phonology are targeted equally. Rather, we see that sonority and intrinsic physiological properties of the sounds themselves play a large role in which aspects of the language phonologise tone first. Specifically, we propose here that tone will first target the sonorants, then the fricatives, and then within the remainder of the obstruents the back places of articulation (velar, palatal) will phonologise tone before the front (alveolar, bilabial). Note that this model needs further refinement to confidently account for retroflexes and other complex onsets, and to incorporate social and frequency factors. Note also that while this refinement addresses the manner in which tone spreads through a language, it does not predict which languages will develop tone and which will not. Such a refinement is beyond the scope of this study.
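The proposed ordering can be written down as a simple ranked hierarchy. The sketch below is only one hypothetical encoding: the stage labels, set names, and function name are assumptions introduced here, and the place labels follow the wording of the paragraph above. Retroflexes return no prediction, mirroring the caveat that the model does not yet account for them.

```python
# Stages of the proposed refinement, earliest first (labels are illustrative).
STAGES = ("sonorant", "fricative", "back stop", "front stop")

# Place groupings as worded in the proposal above.
BACK_PLACES = {"velar", "palatal"}
FRONT_PLACES = {"alveolar", "bilabial"}


def predicted_stage(manner, place=None):
    """Return the 0-based stage at which an onset class is predicted to
    phonologise tone, or None where the proposal makes no prediction."""
    if manner in ("sonorant", "fricative"):
        return STAGES.index(manner)
    if manner == "stop":
        if place in BACK_PLACES:
            return STAGES.index("back stop")
        if place in FRONT_PLACES:
            return STAGES.index("front stop")
    return None  # e.g. retroflex stops, left for further refinement


# Sort a few onset classes by their predicted stage.
onsets = [("stop", "bilabial"), ("fricative", None), ("stop", "velar"), ("sonorant", None)]
ordered = sorted(onsets, key=lambda onset: predicted_stage(*onset))
print(ordered)
```

Sorting by stage places sonorants first and bilabial stops last, which is the sequence the proposal describes.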
Acknowledgments
We are grateful to James Kirby, Alan Yu, Wentao Gu, and two anonymous reviewers, for substantial comments that helped improve the manuscript. Samantha Soon provided editorial assistance and Gus Wheeler helped with the map and figures. The Sydney Informatics Hub at the University of Sydney provided considerable support for the statistical analysis; in particular we would like to thank Chris Howden and Alexandra Green for the generous contribution of their time and knowledge. We maintain responsibility for any errors or misinterpretations of the data.
- Ethics statement: This article is part of a larger research project entitled “Reconstructing Eastern Himalayan Histories: Languages, Plants, and People”, funded by the Australian Research Council Discovery Project scheme. The Human Ethics protocol for this larger project is 2012/667 and was approved by the Chair of the Humanities & Social Sciences DERC on 13 May 2014. Dr. Gwendolyn Hyslop headed this project.
- Author contributions: This article is a heavily revised version of the second author’s MA thesis. The first author designed the study, collected most of the data, and presented most of the ideas in the discussion and conclusion. The second author carried out the primary acoustic analysis. Both authors have contributed to the writing and the statistical analysis.
- Conflict of interest statement: There are no conflicts to report. The first author has been a full-time employee of the University of Sydney over the past three years.
Appendix 1: words used in this study
# | English gloss | Orthographic representation | IPA |
---|---|---|---|
1 | target | <ba> | /bɐ/ |
2 | bundle in the hand | <bam> | /bɐm/ |
3 | wool | <bê> | /be:/ |
4 | cut in chopping manner | <bet> | /bet/ |
5 | give | <bi> | /bi/ |
6 | bamboo mat often used for drying | <bi> | /bi/ |
7 | four | <ble> | /ble/ |
8 | one when counting measurements | <bleng> | /bleŋ/ |
9 | boy, son | <bo> | /bo/ |
10 | 3.pl | <bot> | /bot/ |
11 | cliff | <brâ> | /brɐ:/ |
12 | fly | <brang> | /brɐŋ/ |
13 | measuring cup | <bre> | /bre/ |
14 | smell | <bri> | /bri/ |
15 | look out for | <brin> | /brin/ |
16 | countable seed | <bro> | /bro/ |
17 | fishing net | <brong> | /broŋ/ |
18 | hunger | <brû> | /bru:/ |
19 | root or round tuber of a plant | <bu> | /bu/ |
20 | height | <bung> | /buŋ/ |
21 | room or compartment in a house | <cat> | /cɐt/ |
22 | honorific term for eye | <cen> | /cen/ |
23 | crash together | <cep> | /cep/ |
24 | washing stick | <cha> | /cʰɐ/ |
25 | seedling or sapling | <châ> | /cʰɐ:/ |
26 | separate rice (by hitting) | <cik> | /cik/ |
27 | to be small | <cing> | /ciŋ/ |
28 | criticize | <con> | /con/ |
29 | ability | <cor> | /cor/ |
30 | now | <da> | /dɐ/ |
31 | tone, accent | <dang> | /dɐŋ/ |
32 | enter | <dê> | /de:/ |
33 | sleeping place | <dep> | /dep/ |
34 | large pot | <dî> | /di:/ |
35 | front | <dong> | /doŋ/ |
36 | sleep | <dot> | /dot/ |
37 | recover | <drâ> | /ɖɐ:/ |
38 | hit, beat | <drang> | /ɖɐŋ/ |
39 | mule | <dre> | /ɖe/ |
40 | shadow | <drem> | /ɖem/ |
41 | make ready | <drî> | /ɖi:/ |
42 | middle, moderate, mediocre | <dring> | /ɖiŋ/ |
43 | trunk, box | <drom> | /ɖom/ |
44 | six | <drô> | /ɖo:/ |
45 | boat | <dru> | /ɖu/ |
46 | dragon | <druk> | /ɖuk/ |
47 | log | <dum> | /dum/ |
48 | horn, trumpet, shell | <dung> | /duŋ/ |
49 | evil | <dut> | /dut/ |
50 | saddle | <ga> | /gɐ/ |
51 | go | <ge> | /ge/ |
52 | steep | <gen> | /gen/ |
53 | rubber, thick plastic | <gip> | /gip/ |
54 | revolve, turn around | <gir> | /gir/ |
55 | (male) friend | <gon> | /gon/ |
56 | Himalayan griffin | <got> | /got/ |
57 | winter | <gun> | /gun/ |
58 | tent | <gur> | /gur/ |
59 | two (when counting measurements) | <gwa> | /gwɐ/ |
60 | turn | <gwar> | /gwɐr/ |
61 | tie up (cattle) | <gwe> | /gwe/ |
62 | feel happy | <har> | /hɐr/ |
63 | flask or bottle | <hor> | /hor/ |
64 | win | <je> | /ɟe/ |
65 | bet | <jê> | /ɟe:/ |
66 | to be fast | <jok> | /ɟok/ |
67 | cow slop | <jop> | /ɟop/ |
68 | end | <ju> | /ɟu/ |
69 | evolution, development | <jung> | /ɟuŋ/ |
70 | snow | <ka> | /kɐ/ |
71 | voice | <kat> | /kɐt/ |
72 | accident, bad luck | <ken> | /ken/ |
73 | expand upon | <ket> | /ket/ |
74 | mouth | <kha> | /kʰɐ/ |
75 | needle, hook | <khap> | /kʰɐp/ |
76 | tie up | <khê> | /kʰe:/ |
77 | 3.erg | <khî> | /kʰi:/ |
78 | 3.abs | <khit> | /kʰit/ |
79 | turn | <khor> | /kʰor/ |
80 | cover up | <khup> | /kʰup/ |
81 | herd, gather, bring together | <khur> | /kʰur/ |
82 | water | <khwe> | /kʰwe/ |
83 | dog | <khwi> | /kʰwi/ |
84 | potato | <ki> | /ki/ |
85 | refiner (carpenter tool) | <kit> | /kit/ |
86 | door | <ko> | /ko/ |
87 | loiter, wander around | <kor> | /kor/ |
88 | honorific term for body | <ku> | /ku/ |
89 | place on, impose | <kut> | /kut/ |
90 | tooth | <kwa> | /kwɐ/ |
91 | be chipped | <kwâ> | /kwɐ:/ |
92 | charcoal ashes | <kwe> | /kwe/ |
93 | tip | <kweng> | /kweŋ/ |
94 | trivet | <kwi> | /kwi/ |
95 | cuddle, hug | <pang> | /pɐŋ/ |
96 | leech | <pat> | /pɐt/ |
97 | example | <pe> | /pe/ |
98 | throw | <pet> | /pet/ |
99 | pig | <phâ> | /pʰɐ:/ |
100 | be okay | <phat> | /pʰɐt/ |
101 | edge | <phê> | /pʰe:/ |
102 | clay pot | <pheng> | /pʰeŋ/ |
103 | smaller bamboo type | <phî> | /pʰi:/ |
104 | sweep | <phik> | /pʰik/ |
105 | cave | <pho> | /pʰo/ |
106 | first offering to a deity before eating | <phot> | /pʰot/ |
107 | tear | <phret> | /pʰret/ |
108 | lick | <phrin> | /pʰrin/ |
109 | reminder, left over | <phro> | /pʰro/ |
110 | cheese | <phrum> | /pʰrum/ |
111 | quarrel, create trouble | <phung> | /pʰuŋ/ |
112 | stake | <phur> | /pʰur/ |
113 | get stuck with burrs | <ping> | /piŋ/ |
114 | paintbrush OR funnel | <pir> | /pir/ |
115 | snake | <po> | /po/ |
116 | king | <pon> | /pon/ |
117 | monkey, especially Assamese Macaque | <pra> | /prɐ/ |
118 | wrestle, fight | <prat> | /prɐt/ |
119 | backyard | <prê> | /pre:/ |
120 | to fear, to be afraid of | <pret> | /pret/ |
121 | a religious festival | <priu> | /priu/ |
122 | bring down with a tool | <prô> | /pro:/ |
123 | remove, take off | <prot> | /prot/ |
124 | the long spindle or thread holder | <pun> | /pun/ |
125 | wrap on | <put> | /put/ |
126 | to hang something | <pyung> | /pjuŋ/ |
127 | funnel | <pyur> | /pjur/ |
128 | earth, ground | <sa> | /sɐ/ |
129 | stabilizer, leveler, e.g. something used to level a crooked table by being stuck under the legs | <sap> | /sɐp/ |
130 | son (honorific) | <sê> | /se:/ |
131 | louse | <sê> | /se:/ |
132 | meat OR the 27th letter of Dzongkha Alphabet | <sha> | /çɐ́/ |
133 | ride, mainly as in ride a horse | <shan> | /çɐ́n/ |
134 | glass | <she> | /çé/ |
135 | to overflow due to pouring | <she> | /çé/ |
136 | bamboo shoot, especially the rui shoot. This is softer and more pliable | <shi> | /çí/ |
137 | to get wet | <shir> | /çír/ |
138 | dice, used to play parala | <sho> | /çó/ |
139 | pull weeds OR to have free time | <shok> | /çók/ |
140 | sheath | <shup> | /çúp/ |
141 | strong | <shû> | /çú:/ |
142 | pluck OR politics | <si> | /si/ |
143 | thigh | <sir> | /sir/ |
144 | corncob OR shell, such as egg shell or molted snake skin | <sop> | /sop/ |
145 | refers to larger bamboo which is not native to Dungkar, not used in making bows OR hay | <su> | /su/ |
146 | three | <sum> | /sum/ |
147 | horse | <ta> | /tɐ/ |
148 | tiger | <tâ> | /tɐ:/ |
149 | rib | <tep> | /tep/ |
150 | treasure | <ter> | /ter/ |
151 | plane, field | <thang> | /tʰɐŋ/ |
152 | stove, traditional Bhutanese stove which is made out of clay and uses wood to heat | <thap> | /tʰɐp/ |
153 | insert, get something inside or through | <thê> | /tʰe:/ |
154 | saliva, spit | <thep> | /tʰep/ |
155 | bring down OR drown | <thim> | /tʰim/ |
156 | a place for cows to sleep | <thir> | /tʰir/ |
157 | a measurement, roughly equal to the distance of one human hand OR to be high | <tho> | /tʰo/ |
158 | break, crush OR refers to the top part of a waterfall | <thor> | /tʰor/ |
159 | pattern, can refer to pattern on weaving, animal marking, etc. OR multicolored, spotted | <thra> | /ʈʰɐ/ |
160 | arrive | <thrâ> | /ʈʰɐ:/ |
161 | throne | <thri> | /ʈʰi/ |
162 | rotten green cheese paste used in cooking | <thut> | /tʰut/ |
163 | distal pronoun (pronoun for something far from the speaker; ‘that’) | <thû> | /tʰu:/ |
164 | tear, pinch or bite | <tî> | /ti:/ |
165 | tin container, such as one used for storing biscuits, or one used for the Buddhist offering of water | <ting> | /tiŋ/ |
166 | battery, food | <to> | /to/ |
167 | wild apple OR thousand | <tong> | /toŋ/ |
168 | think of or long for | <tra> | /ʈɐ/ |
169 | a large container for adults to bathe in | <trap> | /ʈɐp/ |
170 | the year of monkey in the Bhutanese zodiac | <tre> | /ʈe/ |
171 | cheat | <trem> | /ʈem/ |
172 | wrap around | <tri> | /ʈi/ |
173 | stretch, stretch out OR can, tin | <tring> | /ʈiŋ/ |
174 | village | <trom> | /ʈom/ |
175 | heat | <trot> | /ʈot/ |
176 | transform into something | <trui> | /ʈui/ |
177 | stir | <truk> | /ʈuk/ |
178 | nerves, blood vessels | <tsa> | /tsɐ/ |
179 | rust | <tsâ> | /tsɐ:/ |
180 | tip, summit | <tse> | /tse/ |
181 | salt | <tsha> | /tsʰɐ/ |
182 | refers to udder OR basket used for carrying | <tshang> | /tsʰɐŋ/ |
183 | date OR limit | <tshê> | /tsʰe:/ |
184 | honorific word for ‘name’ OR mark | <tshen> | /tsʰen/ |
185 | joint | <tshi> | /tsʰi/ |
186 | goo, sticky viscous liquid, such as sap | <tshi> | /tsʰi/ |
187 | lake OR dinner, supper, assembly, gathering | <tsho> | /tsʰo/ |
188 | here | <tshô> | /tsʰo:/ |
189 | ready | <tshut> | /tsʰut/ |
190 | madder | <tshut> | /tsʰut/ |
191 | a calculation | <tsî> | /tsi:/ |
192 | imprisonment OR prison | <tson> | /tson/ |
193 | prick (intransitive) | <tsop> | /tsop/ |
194 | lime, calcium oxide | <tsun> | /tsun/ |
195 | lift up, pick up | <tum> | /tum/ |
196 | elect | <tum> | /tum/ |
197 | chase, run after, hunt | <tung> | /tuŋ/ |
198 | fruit OR day of the week | <za> | /zɐ/ |
199 | bronze | <zang> | /zɐŋ/ |
200 | substance | <ze> | /ze/ |
201 | nail, peg, pin | <zer> | /zer/ |
202 | what | <zha> | /çɐ̀/ |
203 | a term of measurement, roughly equal to the length of one man’s body height | <zham> | /çɐ̀m/ |
204 | master, a term used to address a person of a high class, for example a slave addressing his master | <zhe> | /çè/ |
205 | bamboo recorder | <zheng> | /çèŋ/ |
206 | side | <zhi> | /çì/ |
207 | forget | <zhit> | /çìt/ |
208 | bug, insect, pests | <zhong> | /çòŋ/ |
209 | a general term for any alcoholic beverage | <zhor> | /çòr/ |
210 | government | <zhung> | /çùŋ/ |
211 | slice or cut with the knife moving away from the body | <zhû> | /çù:/ |
212 | cats eye stone | <zi> | /zi/ |
213 | number two OR couple | <zon> | /zon/ |
214 | a large wooden box, such as that used to store grains | <zot> | /zot/ |
215 | relic, religious artifact of rare importance | <zung> | /zuŋ/ |
216 | give way, give side | <zur> | /zur/ |
References
Boersma, Paul & David Weenink. 2014. Praat: Doing phonetics by computer (version 5.4.04). Available at: http://www.praat.org/.
Bosch, André. 2016. Language contact in Upper Mangdep: A comparative grammar of verbal constructions. Honours, University of Sydney.
Chamberlain, Bradley. 2004. The Khengkha orthography: Developing a language in the Tibetan scriptal environment. Graduate Institute of Applied Linguistics MA.
Chen, Matthew Y. 2000. Tone Sandhi: Patterns across Chinese dialects. Cambridge Studies in Linguistics 92. Cambridge/New York: Cambridge University Press. https://doi.org/10.1017/CBO9780511486364.
Cho, Taohong, Sun-Ah Jun & Peter Ladefoged. 2002. Acoustic and aerodynamic Correlates of Korean stops and fricatives. Journal of Phonetics 30(2). 193–228. https://doi.org/10.1006/jpho.2001.0153.
Coetzee, Andries W., Patrice Speeter Beddor, Kerby Shedden, Will Styler & Daan Wissing. 2018. Plosive voicing in Afrikaans: Differential cue weighting and tonogenesis. Journal of Phonetics 66. 185–216. https://doi.org/10.1016/j.wocn.2017.09.009.
DeLancey, Scott. 1989. Contour tones from lost syllables in central Tibetan. Linguistics of the Tibeto-Burman Area 12(2). 33–34. https://doi.org/10.32655/LTBA.12.2.04.
DeLancey, Scott. 2003. Lhasa Tibetan. In Graham Thurgood & Randy LaPolla (eds.), The Sino-Tibetan languages, 270–288. London: Routledge. https://doi.org/10.4324/9780203214961-40.
Dockum, Rikker. 2019. The tonal comparative method: Tai tone in historical perspective. Boston: Yale University PhD dissertation.
Driem, George van. 2015. Synoptic grammar of the Bumthang language, a language of the Central Bhutan highlands. Himalayan Linguistics Archive 6. 1–77.
Dutta, Hemanga & Michael Kenstowicz. 2018. The phonology and phonetics of laryngeal stop contrasts in Assamese. In Roberto Petrosino, Pietro Cerrone & Harry van der Hulst (eds.), From sounds to structures: Beyond the veil of Maya. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9781501506734-002.
Dwyer, Arienne M. 2008. Tonogenesis in Southeastern Monguor. In K. David Harrison & Arienne M. Dwyer (eds.), Lessons from Endangered languages, 111–128. Typological Studies in Language 78. Amsterdam: John Benjamins. https://doi.org/10.1075/tsl.78.05dwy.
Gathercole, Geoffrey. 1983. Tonogenesis and the Kickapoo tonal system. International Journal of American Linguistics 49. 72–76. https://doi.org/10.1086/465766.
Genetti, Carol. 2009. An introduction to Dzala, and East Bodish Language of Bhutan. Presented at the 13th Himalayan Languages Symposium. In Eugene, OR.
Goddard, Ives. 1991. Algonquian linguistic change and reconstruction. In Patterns of change, change of patterns: Linguistic change and reconstruction methodology, 55–70. Berlin/New York: Mouton de Gruyter. https://doi.org/10.1515/9783110871890.55.
Guion, Susan G., Jonathan D. Amith, Christopher S. Doty & Irina A. Shport. 2009. Word-level prosody in Balsas Nahuatl: The origin, development, and acoustic correlates of tone in a stress accent language. Journal of Phonetics 38(2). 137–166. https://doi.org/10.1016/j.endend.2009.09.002.
Hagège, Claude & André Haudricourt. 1978. Phonologie panchronique: Comment les sons changent dans les langues. Paris: Presses Universitaires France.
Hamann, Silke. 2003. The phonetics and phonology of retroflexes. Utrecht: LOT.
Haudricourt, André. 1954. De L’origine Des Tons En Vietnamien. Journal Asiatique 242. 69–82.
Hombert, Jean-Marie, John J. Ohala & William G. Ewan. 1979. Phonetic explanations for the development of tones. Language 55(1). 37–58. https://doi.org/10.2307/412518.
Howe, Penelope Jane. 2017. Tonogenesis in central dialects of Malagasy: Acoustic and perceptual evidence with implications for synchronic mechanisms of sound change. Houston: Rice University.
Hyslop, Gwendolyn. 2009. Kurtöp tone: A tonogenetic case study. Lingua 119. 827–845. https://doi.org/10.1016/j.lingua.2007.11.012.
Hyslop, Gwendolyn. 2014. A preliminary reconstruction of East Bodish. In Nathan Hill & Thomas Owen-Smith (eds.), Trans-Himalayan linguistics, historical and descriptive linguistics of the Himalayan area, 155–180. Trends in Linguistics. Studies in Monographs 266. Berlin/Boston: De Gruyter Mouton. https://doi.org/10.1515/9783110310832.155.
Hyslop, Gwendolyn. 2017. A grammar of Kurtöp. Leiden: Brill. https://doi.org/10.1163/9789004328747.
Hyslop, Gwendolyn. 2021a. Between stress and tone: Acoustic evidence of word prominence in Kurtöp. Language Documentation & Conservation 15. 550–574.
Hyslop, Gwendolyn. 2021b. Language as a window into the past: Proto East Bodish language and culture. In Diana Lange, Jarmila Ptáčková, Marion Wettstein & Mareike Wulff (eds.), Crossing boundaries: Tibetan studies unlimited, 289–310. Prague: Academia Publishing House.
Hyslop, Gwendolyn. 2023. Toward a typology of tonogenesis: Revising the model. Australian Journal of Linguistics 42(3–4). 275–299. https://doi.org/10.1080/07268602.2022.2157675.
Kang, Kang-Ho & Susan G. Guion. 2008. Clear speech production of Korean stops: Changing phonetic targets and enhancement strategies. The Journal of the Acoustical Society of America 124(6). 3909–3917. https://doi.org/10.1121/1.2988292.
Kingston, John. 2007. Phonological pertinacity/phonetic variability. Talk presented at the University of Oregon colloquium series. Eugene, OR: University of Oregon.
Kirby, James. 2013. The role of probabilistic enhancement in phonologization. In Alan C. L. Yu (ed.), Origins of sound change, 228–246. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199573745.003.0011.
Kirby, James & Gwendolyn Hyslop. 2019. Phonetic structures of Dzongkha obstruents. In Sasha Calhoun, Paola Escudero, Marija Tabain & Paul Warren (eds.), Proceedings of the 19th International Congress of Phonetic Sciences, Melbourne, Australia 2019, 3607–3611.Suche in Google Scholar
Kingston, John. 2005. The phonetics of Athabaskan tonogenesis. In Sharon Hargus & Karen Rice (eds.), Athabaskan Prosody, 137–184. Amsterdam: John Benjamins.10.1075/cilt.269.09kinSuche in Google Scholar
Lisker, Leigh & Arthur S. Abramson. 1964. A cross-language study of voicing in initial stops: Acoustical measurements. Word 20(3). 384–422. https://doi.org/10.1080/00437956.1964.11659830.
L-Thongkum, Theraphan. 1997. Implications of the retention of proto-voiced plosives and fricatives in the Dai Tho language of Yunnan Province for a theory of tonal development and Tai language classification. In Comparative Kadai: The Tai branch, 191–219. Dallas: Summer Institute of Linguistics.
Maspero, Henri. 1912. Étude Sur La Phonétique Historique de La Langue Annamite: Les Initiales. Bulletin de l’École Française d’Extrême-Orient 12. 1–126. https://doi.org/10.3406/befeo.1912.2713.
Matisoff, James A. 1970. Glottal dissimilation and the Lahu high-rising tone: A tonogenetic case-study. Journal of the American Oriental Society 90(1). 13–44. https://doi.org/10.2307/598429.
Matisoff, James. 2003. Handbook of Proto-Tibeto-Burman: System and philosophy of reconstruction. Berkeley: University of California Press.
Mazaudon, Martine. 1977. Tibeto-Burman tonogenetics. Linguistics of the Tibeto-Burman Area 3(2). 1–123. https://doi.org/10.32655/LTBA.3.2.01.
Michailovsky, Boyd & Martine Mazaudon. 1994. Preliminary notes on the languages of the Bumthang groups. In Proceedings of the 6th Seminar of the International Association for Tibetan Studies, 545–557. Fagernes: The Institute for Comparative Research in Human Culture.
Ohala, John J. 2012. The relation between phonetics and phonology. In William J. Hardcastle, John Laver & Fiona Gibbon (eds.), Blackwell handbooks in linguistics: Handbook of phonetic sciences (2), 653–677. Oxford: Blackwell. https://doi.org/10.1002/9781444317251.ch17.
Ohala, John J. & Carol J. Riordan. 1979. Passive vocal tract enlargement during voiced stops. In Speech communication papers, 89–92. Cambridge, MA: Massachusetts Institute of Technology. https://doi.org/10.1121/1.2017164.
Peralta, William. 2017. Sequential phonologization: A perceptual study of tonogenesis in Kurtöp. Sydney: University of Sydney Honours Thesis.
Pittayaporn, Pittayawat. 2009. The phonology of Proto-Tai. Ithaca: Cornell University PhD dissertation.
Pittayaporn, Pittayawat & James Kirby. 2017. Laryngeal contrasts in the Tai dialect of Cao Bằng. Journal of the International Phonetic Association 47(1). 65–85. https://doi.org/10.1017/S0025100316000293.
Plane, Sarah. 2016. Role of manner and place of articulation in tonogenesis: A case study with Kurtöp. Sydney: University of Sydney Master’s thesis.
Ratliff, Martha. 2015. Tonoexodus, tonogenesis, and tone change. In Patrick Honeybone & Joseph Salmons (eds.), The Oxford handbook of historical phonology, 245–261. New York: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199232819.013.021.
Rivierre, Jean-Claude. 1993. Tonogenesis in New Caledonia. In Tonality in Austronesian languages, 155–173. Oceanic Linguistics Special Publication 24. Honolulu: University of Hawaii Press.
Schertz, Jessamyn & Emily J. Clare. 2020. Phonetic cue weighting in perception and production. WIREs Cognitive Science 11(2). e1521. https://doi.org/10.1002/wcs.1521.
Shryock, Aaron Michael. 1995. Investigating laryngeal contrasts: An acoustic study of the consonants of Musey. UCLA Working Papers in Phonetics 89. 1–117.
Silva, David. 2006. Acoustic evidence for the emergence of a tonal contrast in contemporary Korean. Phonology 23. 287–308. https://doi.org/10.1017/s0952675706000911.
Stevens, Kenneth N., Sheila E. Blumstein, Laura Glicksman, Martha Burton & Kathleen Kurowski. 1992. Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters. The Journal of the Acoustical Society of America 91. 2979–3000. https://doi.org/10.1121/1.402933.
Svantesson, Jan-Olof & David House. 2006. Tone production, tone perception and Kammu tonogenesis. Phonology 23(2). 309–333. https://doi.org/10.1017/s0952675706000923.
Thurgood, Graham. 2002. Vietnamese and tonogenesis: Revising the model and the analysis. Diachronica 19(2). 333–363. https://doi.org/10.1075/dia.19.2.04thu.
Wayland, Ratree & Susan Guion. 2005. Sound changes following the loss of /r/ in Khmer: A new tonogenetic mechanism? Mon-Khmer Studies 35. 55–82.
Wright, Jonathan D. 2007. Laryngeal contrast in Seoul Korean. Philadelphia: University of Pennsylvania PhD dissertation.
Wright, Richard & David Nichols. 2009. Measuring vowel duration in Praat. UW Phonetics Lab.
Yip, Moira. 2002. Tone. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139164559.
© 2024 the author(s), published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.