Article Open Access

Schwa realisation in verbal inflection in two dialogue registers of German spontaneous speech

Published/Copyright: August 15, 2024

Abstract

Word-final schwa in German inflectional suffixes shows varying realisations in spontaneous speech – from full realisations with varying duration to no realisation. While previous research has identified numerous social, distributional, and grammatical factors influencing the variation of phonetic variables in general, it remains unclear how fine-grained functional differences in different registers specifically affect schwa realisation. In this corpus-based study, we compare schwa realisation in two dialogue registers of German spontaneous speech – free conversation and task-based dialogues – which differ only in their communicative goal and therefore have different functional requirements. We find that schwa is rarely realised, though slightly but significantly more often in free conversation than in task-based dialogue. Other factors also promoting schwa realisation across both situations are less frequent verbs and sequences, and IP-final position.

1 Introduction

In spontaneous speech, German 1st-person singular verbs can be realised with or without final schwa. This exploratory study investigates whether the realisation of schwa is partly dependent on register differences. We investigate this by comparing task-based dialogues and free conversations, produced by the same pairs of interlocutors. As we will show, the realisation of schwa depends on many factors, such as the phonetic context or the frequency of the verb in question – but crucially it also seems to be motivated by register differences. If that is so, it presents an interesting case for register modelling.

An illustration of the variation of interest is given in Example (1), taken from the BeDiaCo corpus, which will be introduced in Section 2. Here we see that the variable habe (have.prs.1sg ‘[I] have’) can be realised with schwa (1a) and without schwa (1b).

(1)
a.
was ich eigentlich gemacht habe–[haːbə]
what I actually did.ptcp.prf have.aux.prs.1sg
‘what I actually did’
(BeDiaCo n_frei_m15f13_ch2)
b.
nichts irgendwie nebenbei gemacht hab–[hap]
nothing somehow on the side did.ptcp.prf have.aux.prs.1sg
‘didn’t do anything on the side.’
(BeDiaCo i_frei_f10m8_ch1)

Both variants, habe in Example (1a) and hab in Example (1b), would be represented with <habe> in standard written German, with the 1st-person singular form typically being represented with an <e>.

Since we are interested in phonetic variation, we explicitly do not assume that one of these forms is the ‘correct’ or underlying form – as we will argue in this article, different forms are distributed differently depending on the situation. We will therefore not use the (prescriptive) terms deletion (Kohler and Rodgers 2001), elision (Davidson 2006; Kohler 1990; Kohler and Rodgers 2001), or reduction (Johnson 2004; Kohler 1990; Kohler and Rodgers 2001; Lindblom 1963), which would point to an underlying phonological form with a schwa. Instead, we model verbal inflection at the morphophonetic interface as a morphological variable in Labov’s sense (Labov 1963 et passim). We will name the variable based on the written form as given in a standard grammar of German with phonetic variants of schwa realised or not realised. Accordingly, the variable ich male ‘I paint.prs.1sg’ can be expressed as ich male [ɪç maː.lə] or ich mal [ɪç maːl].

The two realisations of 1st-person singular are semantically fully equivalent (apart from avoiding homonyms with the subjunctive forms). While other phonetic variation phenomena such as the widely known English examples of ing being realised as [n] or [ŋ] or of postvocalic /r/ being realised or not (Labov 1963, 1971) are caused (at least to a certain extent) by social factors (cf. Section 1.2.4), it seems unlikely that social differences play into the distribution of the two variants with and without schwa in Example (1) above. We are not aware of any studies pointing to a social-class difference in the realisation of schwa as a verbal inflectional suffix; likewise the variants do not seem to express social meaning. As far as we know, this variation is fairly stable and does not hint at a change in progress. We investigate whether and how such “seemingly unmotivated” variation within the phonetic realisation of schwa occurs in two closely related registers. We are thus interested in the difference between realisations of 1st-person singular verbs with and without schwa. By providing an exploratory corpus-based comparison of free conversations and task-based dialogues we aim to shed light not only on how inflection is realised in both registers, but also on how the functional differences between these two situations affect verbal inflection. By focusing on the use of a small phonetic difference that mostly goes unnoticed we want to explore an interesting case for register modelling.

In this paper we will first describe the most important aspects of verbal inflection and its pronunciation in German (cf. Section 1.1). In Section 1.2, we will discuss possible factors influencing schwa realisation, with a particular focus on the factor register. Section 2 will provide a detailed description of our methodological approach and the corpus used, before presenting the results of the statistical analysis in Section 3. In Section 4, we will discuss our findings.

1.1 Word-final schwa in verbal inflection in German

Verbal inflection in German varies systematically according to person, number, mood, tense, and voice. There are synthetic forms (ich koch-e ‘I cook-prs.1sg’) and periphrastic forms (ich habe gekocht ‘I have.aux.prs.1sg cooked.ptcp.prf’). Inflectional suffixes that form a syllable on their own are unstressed except in rare cases of contrast accentuation.

Furthermore, haben ‘to.have.inf’ (as well as sein ‘to.be.inf’) can occur as both an auxiliary (2a) and a lexical verb (2b), as illustrated for haben in Example (2):

(2)
a.
Ich habe gemalt.
I have.aux.prs.1sg painted.ptcp.prf
‘I painted.’
b.
Ich habe ein Auto.
I have.prs.1sg a.sg.n[acc] car.n.sg[acc]
‘I have/own a car.’

This paper is concerned with word-final schwa in verbal inflection because it is (a) optional, (b) frequent, (c) influenced by many factors, and (d) does not seem to carry functional weight. As we are interested in subtle register differences, schwa realisation is a suitable phenomenon.

Realisation of verb-final schwa is not obligatory. The Duden grammar, which is one of the most popular grammars of German, states that schwa in the word-final position of the 1st-person singular present is regularly ‘elided’ in spoken language (Duden 2006: 451, 1208) or is optional (Duden 2022: 666). For instance, mein-e ‘mean-prs.1sg’ can be realised with schwa [ma͜ɪ.nə] or without schwa [ma͜ɪn], and hab-e ‘have-prs.1sg’ can be realised as [haː.bə] or [ha(ː)p], leading to the final obstruent devoicing typical for stems ending with a voiced obstruent due to syllable-final devoicing in German (Duden 2006: 44).
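The relation between the two variants can be illustrated with a minimal Python sketch. It derives the schwa-less transcription from the transcription with schwa purely as a notational convenience, not as a claim about an underlying form. The function name `realise_without_schwa` and the small devoicing table are hypothetical and cover only a few obstruents.

```python
# Hypothetical mapping from voiced obstruents to voiceless counterparts
# (illustrative and incomplete; not taken from the article).
DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}


def realise_without_schwa(ipa_with_schwa: str) -> str:
    """Return the variant without word-final schwa.

    Syllable boundaries (".") are removed, a final schwa is dropped,
    and a stem-final voiced obstruent is devoiced, mirroring
    syllable-final devoicing in German.
    """
    form = ipa_with_schwa.replace(".", "")
    if not form.endswith("ə"):
        return form  # no final schwa, nothing to change
    stem = form[:-1]
    if stem and stem[-1] in DEVOICE:
        stem = stem[:-1] + DEVOICE[stem[-1]]
    return stem


if __name__ == "__main__":
    for variant in ("haː.bə", "maː.lə"):
        print(variant, "->", realise_without_schwa(variant))
```

Stems ending in a sonorant, as in male, lose only the schwa, whereas stems ending in a voiced obstruent, as in habe, additionally surface with a voiceless final consonant.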

The expected inflectional suffixes associated with a word-final schwa are {−e} 1st-person singular indicative present and {−te} 1st-person singular indicative preterite for the regular verbs. For the lemma haben there are the forms in (3a) to (3c) that can be expected to end in schwa though only the first two (habe and hatte) occur frequently:

(3)
a.
ich habe ‘I have.prs.1sg.ind’
more rarely ich/er/sie/es habe ‘I/he/she/it had.prs.1/3sg.sbjv’
(see Footnote 2)
b.
ich/er/sie/es hatte ‘I/he/she/it had.pst.1/3sg.ind’
c.
ich/er/sie/es hätte ‘I/he/she/it would.have.pst.1/3sg.sbjv’

It is precisely the difference of 1st-person singular verbs realised with and without schwa that is of interest to us. What are the linguistic and non-linguistic factors that influence this variation? Previous studies identify several potentially relevant factors but a systematic investigation of verb-final schwa realisations is missing up to now. In the following section, we give an overview of the factors that have been identified.

1.2 Factors affecting the phonetic realisation of verbal inflection

1.2.1 Segmental and suprasegmental factors

First we will address factors related to the phonetic context and prosody. Several studies on German found that the following context affects the distribution of realisations of inflectional suffixes. For example, Kohler and Rodgers (2001: 115) found that 68 % of possible word-final schwas followed by a vowel are not realised, as illustrated in Example (4). Similarly, Zimmerer et al. (2011, 2014) report that the realisation of the word-final inflectional suffix {−t} in German is strongly influenced by the following context: t is more likely to be realised when followed by either a pause or a vowel while it is less likely to be realised in complex consonant clusters that can be simplified through realisations without t. These results indicate that the following segmental context is a factor influencing the realisation of inflectional suffixes in German, including schwa, see (4).

(4)
a.
habe ich [ha(ː).bə ʔɪç] ‘have I’
b.
hab ich [ha(ː).bɪç]/[hab̥ɪç] ‘have I’

The prosodic structure of the stem could also affect the realisation of inflectional schwas: Eisenberg (2013) suggests paradigmatic compensation, introduced by Raffelsiefen (1995: 33–42), as a relevant factor which describes the correlation between the number of syllables in the 1st/3rd-person plural and in the 1st-person singular. As shown in Example (5a), if the 1st/3rd-person plural is realised as one syllable – which can be achieved through a realisation without schwa – the 1st-person singular, too, is stipulated to be more likely realised monosyllabically – again achieved through a realisation without word-final schwa. In this view, monosyllabic realisations are likely for stems ending with vowels, diphthongs, or liquids.

(5)
a.
Monosyllabic 1st/3rd-pers. pl.
dreh-en ‘turn’ realised as [dʁeːn] rather than [dʁeː.ən]
Monosyllabic 1st-pers. sg.
dreh-e realised as [dʁeː] rather than [dʁeː.ə]
b.
Disyllabic 1st/3rd-pers. pl.
bad-en ‘bathe’ realised as [ˈbaː.dən] or [ˈbaː.dn̩]
Disyllabic 1st-pers. sg.
bad-e realised as [baː.də] rather than [baːt]

In contrast, if the 1st/3rd-person plural is disyllabic as in Example (5b) (through realisation with either schwa or syllabic [n̩]), the 1st-person singular is assumed to be more likely to be realised with two syllables, achieved through a realisation of word-final schwa. Disyllabic realisations are, in this view, likely for stems ending in nasals or obstruents, thus additionally avoiding final obstruent devoicing in [baːt].

Another relevant factor is the word stress pattern of German: inflectional suffixes in German are usually unstressed (Eisenberg 1991) and therefore generally more prone to ‘reductions’. Stem-initial syllables tend to carry word stress and are seen as part of the pointer or address for the retrieval of a word from the mental lexicon (Dell 1986) and therefore hold “a high signal value for the listener” (Kohler 1990: 84, 88). As such, word-initial syllables are more likely to be realised fully (Kohler 1990: 88). In contrast, word-final syllables including suffixes tend to be unstressed – creating the trochaic stress patterns preferred in German. The position in unstressed syllables leads to less articulatory precision, which in turn makes inflectional suffixes more likely to be realised with only some or even none of the inflectional sounds. Kohler and Rodgers (2001: 114–115) report that 23 % of all possible word-final schwas in spontaneous speech are not realised when following a stressed syllable. Seventy-six percent of these word-final schwas are verbal inflectional endings, mostly in high-frequency verbs, as illustrated in Example (4). For many of these an additional orthographically lexicalised form without schwa might exist, e.g. in colloquial emails and chats (cf. Siebenhaar 2020; Storrer 2018).

An alternative and related factor that might play a role for the realisation of final schwas is that speakers might strive for an alternating rhythm of strong and weak syllables and therefore use the optionality of schwa realisations to regulate the rhythmicity of an utterance (see e.g. Selkirk 1984). That means in the given context that final schwas might be realised in order to avoid stress clash, and deleted when otherwise there would be too many unstressed syllables between beats, e.g. in cases where the following word starts with an unstressed syllable. Evidence for these rhythmic effects was found in written corpora and writing studies by Kentner (2018). Similarly, the study conducted by Fleischer et al. (2018) on the optional schwa endings of specific adverbs in Goethe’s letters demonstrates that the lexical word stress of the following syllable yields more realisations with schwa. In Dutch spontaneous speech, Ernestus and Smith (2018) find that polysyllabic instances of eigenlijk ‘actually’ are followed more often by stressed syllables than monosyllabic instances. This suggests that rhythmic considerations are also found in spontaneously produced speech.

Prosodic boundaries are organised hierarchically, with a lower intermediate phrase (ip) and a higher intonation phrase (IP) layer above the level of the phonological word (see for German Grice et al. 2005). Proximity to a prosodic boundary affects the acoustic duration of segments, especially preceding the boundary, in most languages (see Paschen et al. 2022 for a recent study, surveying 25 languages) and also in German (e.g. Belz et al. 2022). Apart from phrase-final lengthening there is also evidence for strengthening effects, e.g. sounds in phrase-final position tend to be produced with larger movements than in phrase-medial position, and contrast enhancement is found in this position (for an overview see e.g. Byrd and Krivokapić 2021; Cho 2011). For example, Piroth and Janker (2004) found incomplete neutralisation of the voicing contrast in German in utterance-final position for some speakers but – as expected for German – no voicing distinction for final obstruents in other phrasal positions. Contrast enhancement or strengthening effects in pre-boundary position are subtle and inconsistent compared to the phrase-initial position, and also compared to the temporal lengthening effects (cf. Belz et al. 2022). Niebuhr et al. (2013) found for spontaneous speech in German that schwa realisations in -en suffixes are more frequent in phrase-final position than in phrase-medial position.

On the phonetic level, articulation rate influences the realisation of word-final schwa: Faster speech rates are often correlated with less frequent realisations of word-final schwas and other inflectional affixes as well as more ‘reduced’ pronunciations of function words (Bell et al. 2003; Davidson 2006; Ernestus et al. 2015). Davidson (2006) considers fewer schwa realisations in faster speech to be the result of gestural overlap, as gestural precision is hindered by time pressure. However, there is ample evidence that this is not a mechanistic relationship, i.e. faster speech rates do not always automatically lead to more frequent ‘reductions’, see e.g. Van Son and Pols (1990, 1992), Kienast and Sendlmeier (2000), and Koreman (2006). Ernestus (2014) and Ernestus et al. (2015) found that the frequency of ‘reductions’ is not completely determined by time pressure due to faster speech rates but might also underlie the control of the speaker and could be mediated by register.

1.2.2 Morphosyntactic factors

In order to maintain morphological unambiguousness, a full realisation of inflection may sometimes be required. For instance, to preserve the distinction between arbeitet ‘work.prs.3sg.ind’ and arbeitete ‘work.pst.3sg.ind’ a variant with full realisation of the word-final schwa in arbeitete could be necessary (Kohler and Rodgers 2001: 112). However, the following segmental context seems to have a stronger influence on the realisation of suffixes than morphological properties: Zimmerer et al. (2011, 2014) show that word-final t in German is more likely to be realised the more morphologically meaningful it is (e.g. t in {–t} [3sg] is more meaningful than in {–st} [2sg] as it is the sole carrier of morphological information) although segmental context, especially cluster simplification, seems to override morphological effects. Analogously, we expect morphological unambiguousness to be one factor influencing the realisation of word-final schwa, but the segmental context might prove to be a stronger influence.

Additional syntactic factors such as word category might have an impact on schwa realisation as well; specifically, whether a verb is lexical or auxiliary. However, auxiliaries in German are more frequent than lexical verbs. Thus, word category could interact with frequency effects (cf. Section 1.2.3), and separating word category and frequency effects might be a difficult task.

1.2.3 Frequency effects

The more frequent a word, the higher the probability for that word to have variants with ‘reduced’ phonetic realisation: Pluymaekers et al. (2005, 2006) report that higher word frequencies lead to shorter acoustic realisations of affixes or single segments of affixes in Dutch.

For some highly frequent and thus frequently ‘reduced’ words, Kohler and Rodgers (2001: 119) debate the existence of an additional lexicalised form without schwa (ich habe vs. ich hab ‘I have’, ich wäre vs. ich wär ‘I would be’), which, according to our understanding, would be two variants of one variable. As was found for schwa optionality in French by Brand and Ernestus (2018), native French listeners store these variants and additionally use the relative frequencies to access these variants in the mental lexicon (for more examples and theoretical implications see also Bürki 2018; Bürki et al. 2011; Ernestus 2014; Pinnow et al. 2017).

Johnson (2004: 20–21) argues that “‘citation forms’ for function words are [an] inadequate representation of their pronunciations and that function words should generally be modeled differently (perhaps with multiple citation forms)”. The author suggests word category as an additional factor promoting full realisations in English, with content words being more likely to be fully realised than function words. This tendency of highly frequent words to be less likely used in their full variants could be a consequence of more practice leading to more compressed motor routines (Bybee 2001: Ch. 1).

Another explanation is suggested by the probabilistic reduction hypothesis (Jurafsky et al. 2001), which argues that production variability is a result of the predictability and frequency of lexical items: Especially in high-frequency function words, predictability through adjacent words furthers the use of variants with ‘reduced’ realisations. This finding is reproduced by Hanique et al. (2010: 934) for word-initial syllables in Dutch and by Kohler and Rodgers (2001: 119) for German. Hume (2004: 189–192) shows that this effect is reinforced when either the message is predictable or when adjacent words frequently occurring together project the target word. Hence higher predictability seems to lessen the need for articulatory precision, as the listener is less dependent on a full realisation alone. Bell et al. (2009: 107–108) argue that this contextual predictability might correlate with lexical activation as well as the connection of words on both the prosodic and the articulatory level: The more closely words or segments are associated with each other (thus forming frequent sequences), the more likely it is for them to be retrieved from the mental lexicon as one single unit. These units in turn tend to be pronounced as one prosodic word, resulting in a shorter phonetic realisation.

However, while predictability is generally assumed to affect function words more strongly than content words (Kohler and Rodgers 2001: 119), Bell et al. (2009) argue that this only holds true for high-frequency function words – suggesting that word frequency might have a stronger influence on the realisation of inflectional endings than word category (lexical vs. auxiliary verb), although both factors might overlap.

1.2.4 Register differences

So far, we have discussed influencing factors that are in the speech signal itself and factors that have to do with properties of the verb, such as syntactic function or frequency. There is another set of factors that have been shown to influence linguistic variation phenomena: register. It has long been noted that speakers vary their language according to situation and function. This has been researched under many terms – register, diaphasic variation, contextual styles, etc. – and with different methods (Biber and Conrad 2019; Halliday 1978; Labov 1963; see Lüdeling et al. 2022 for an overview). Phonetic variation has been the earliest focus of register variation studies. Labov (1963, 2006) distinguishes several registers (contextual styles in his terminology: casual speech, careful speech, read speech, word list, and minimal pairs) that can be elicited in the same interview with the same speaker, yet differ in the amount of attention paid to speech. Situational factors such as the social-role relationship between the interlocutors or the purpose of the conversation influence linguistic behaviour on all levels. Most of the influences are quantitative rather than qualitative, i.e. a specific variant is used with higher frequency in a given bundle of factors than in another. Variation is, of course, influenced by many other factors. In addition to situational factors, Labov and many others have also found that properties of the interlocutors and their social-role relationship play a role in variation (Biber and Conrad 2019; Labov 1963, 1971, 2006; Szmrecsanyi 2019; Wieling 2012, among many others). Bell’s audience design describes how different social-role relationships may influence speech: Bell (1984) and Bell et al. (2009) show that speakers adapt their language style depending on their addressee. The same newscaster may vary their pronunciation when broadcasting on different radio stations with different listeners.

Variation may indicate change (Croft 2010; Labov 1971; Ohala 1989). The variation regarding schwa that we are considering is, however, stable. It can be observed since Middle High German and is also frequent in the non-standardised writing of Early New High German. Thus, alternation between realisations with or without word-final schwa in verbal inflection can be considered a stable situation, hardly a change in progress or a social phenomenon (Szulc 2014).

While there are many studies on phonetic variation that also take register into account, there are only few register studies in German comparing schwa realisation. Kohler and Rodgers (2001: 114–115) compare read and spontaneous speech, and find that word-final schwas are absent in 2.6 % of all cases where schwa follows a stressed syllable in read speech, as opposed to 23 % in spontaneous speech. However, Kohler and Rodgers (2001) do not distinguish between inflectional suffixes and other word-final schwas, although they do note that a high proportion of frequent function words are realised in the variant without schwa.

In the following we describe the registers under investigation in this study with the help of the situational characteristics compiled by Biber and Conrad (2019: 40), building on, inter alia, the differentiation of register factors defined by Halliday (1978). In terms of ‘mode of discourse’ (channel and its application, e.g. spoken vs. written), the register is situated in transient mode, using spontaneous speech in an unscripted face-to-face setting, publicly, with shared time and place of the participants. In terms of ‘tenor of discourse’ (aiming at the relationship between speakers), there are two participants in each spoken dialogue, both of whom can choose the role of speaker and addressee depending on conversational roles during the conversation, without a fixed assignment. They may have different social backgrounds, but all are university students in their early twenties (cf. Section 2.1 for more details on speaker metadata). No on-lookers are present, except for the experimenter, who is visible to only one of the participants. The participants do not know each other. In terms of ‘field of discourse’ (social activity, content), the registers differ in their potential communicative purpose and topic. In the free conversation, participants are given an introductory topic (food in the university canteen) to converse about, but are free to deviate from that. The communicative purpose could be described as something between “getting to know each other” and “small talk”. In the task-based conversation, the participants have to cooperate to find differences in two versions of a picture in front of them by only employing the spoken channel (Diapix task, cf. Baker and Hazan 2011; Van Engen et al. 2010, described in Section 2.3). The topic comprises pictures of everyday life, e.g. a farm or a street. The communicative purpose is to work together and find all the differences in a fixed amount of time. 
Thus, in our set-up (for more on that see Section 2.2), the interlocutors and other situational factors are kept stable but the participants (all from the same dialectal region and in the same age range) are given two different tasks, therefore marking two registers. There is no indication that the variation under consideration here is in any way socially charged.

In summary, previous research has shown that, among other factors, grammatical, segmental and prosodic factors, frequency, and social status could potentially influence the realisation of inflectional schwa. However, none of these factors makes a strong prediction on how precisely schwa realisation differs in the two registers under consideration.

1.3 Research question

We conduct an exploratory corpus-based comparison of two registers of German spontaneous speech (free and task-based dialogues, described in more detail in Section 2.2), asking whether schwa in inflectional verbal suffixes is realised in these registers. Our hypothesis is that differences in schwa realisation are the result of situational-functional differences between these two registers.

Although the phonetic realisation of verbal inflection in German has been studied extensively (cf. Section 1.2), it is still unclear how register differences in spontaneous unscripted speech come into the picture. Due to the exploratory nature of this study, no clear-cut hypotheses concerning the register difference can be proposed. The time constraints typical for the task-based dialogue together with the pressure for coordinative precision and the need for a potentially higher planning capacity might lead to fewer realisations of schwa in task-based dialogue than in free conversation. For Danish, Watson et al. (2020: 66) found a lower speech rate in task-based dialogue compared to free conversation, hinting at higher processing costs but no effect of time pressure on articulation rate. This suggests that schwa might be realised more often in task-based dialogue than in free conversation. All in all, the effect of both higher planning demands and a potentially lower speech rate than in free conversation makes it hard to hypothesise about the direction of the effect of schwa realisation for each register.

2 Methods

We employ a subcorpus of the Berlin Dialogue Corpus (BeDiaComain v.2, Belz et al. 2021), including face-to-face dialogues in two situations. All annotations marked during this study are published in BeDiaCo v.3 (Belz and Mooshammer 2023), available at the media repository of the Humboldt-Universität zu Berlin for teaching and scientific research. More details about the corpus and its annotation guidelines can be found in Belz et al. (2023).

2.1 Participants

We used eight recordings with a total of 16 participants (six female, ten male; cf. Table 1). All participants were native speakers of German and between 19 and 31 years old. The participants originated from the northern half of Germany. The general impression after annotating the data was that most participants displayed little to no dialectal pronunciation. The recordings took place between 2018 and 2019 and lasted approximately 1 h each. The participants were paid €10 per hour. Table 1 shows an overview of the corpus.

Table 1:

Corpus properties of BeDiaComain v.3, excluding silent pauses, incomprehensible passages and non-linguistic tokens such as laughing and background noises.

Corpus | Speakers | Dyads | Tokens (dipl) | Tokens (norm) | Articulation time
BeDiaComain | 16 (6 female, 10 male) | 8 (2 diapixes, 1 free dialogue) | 41,036 | 41,260 | 3 h

2.2 Tasks

The participants were asked to solve two spot-the-difference-tasks, referred to as Diapix tasks, and to have a free conversation about a topic of their choice in a face-to-face situation. The Diapix task was developed by Van Engen et al. (2010) as a dialogue elicitation procedure in which the interlocutors collaborate to find differences between two highly similar pictures without seeing each other’s versions. The words quoted on the pictures provided by Baker and Hazan (2011) were translated from English into German. Two different Diapix images were used for each task (Street 1 and Farm 1).

2.3 Procedure

Prior to the recordings, the participants filled out the metadata forms. For the recordings, the participants were equipped with two headsets from beyerdynamic (Headset Opus 54); the acoustic signal was recorded with a sampling rate of 44,100 Hz. Both microphones were connected to a preamplifier, and one channel was assigned to each speaker. Each session began and ended with a word list that each participant had to read aloud. The tasks were carried out in a fixed order: a first reading of a word list containing all German monophthongs in a stressed syllable, the first Diapix task, then the task-free conversation, the second Diapix task, and a second reading of the word list. The two Diapix tasks took about 5 min each, while the conversation time was about 15 min. Together with task descriptions by the experimenter, acoustic fine-tuning, and adjustments, the experiment lasted no more than 1 h.

2.4 Annotation

BeDiaCo uses a multi-layer standoff architecture with multiple segmentations. The segmentation of sound was carried out automatically using the WebMAUS service (Schiel 1999). The diplomatic transliteration (tier dipl) was normalised semi-automatically (tier norm) for the purpose of this study, based on orthographic Standard German target hypotheses. For the normalisation, all pauses, interruptions, and non-understandable as well as anonymised word tokens were excluded. Normalisation thus splits spontaneously uttered tokens on dipl such as isses ‘is it’ or hamwa ‘we have’ into two word tokens on norm which are ist es and haben wir, respectively (cf. Figure 1).
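The one-to-many relation between dipl and norm tokens can be sketched as follows. This is a minimal illustration, not the corpus tooling: the lookup table and the function `normalise` are hypothetical, and only the two token pairs come from the text above. Each normalised token keeps the index of its diplomatic source token, as a standoff architecture would require.

```python
# Hypothetical lookup table; only these two pairs are taken from the text.
NORMALISATION = {
    "isses": ["ist", "es"],
    "hamwa": ["haben", "wir"],
}


def normalise(dipl_tokens):
    """Map diplomatic tokens to normalised tokens.

    Returns a list of (dipl_index, norm_token) pairs so that every
    normalised token can be traced back to its diplomatic source.
    Tokens without an entry in the table are passed through unchanged.
    """
    norm = []
    for index, token in enumerate(dipl_tokens):
        for norm_token in NORMALISATION.get(token, [token]):
            norm.append((index, norm_token))
    return norm


if __name__ == "__main__":
    print(normalise(["hamwa", "gemacht"]))
```

In this sketch a single dipl token such as hamwa yields two norm tokens that share one source index.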

Figure 1: 
Example of the annotation layers in BeDiaComain. The layer names are on the right. In this case, we expect schwa (“e”) on the layer flexexp, which is not realised (“f” on flexreal). The syllable of the inflected verb in which schwa is not realised is cliticised (“c” on stress), and the following context is therefore re-categorised as unstressed.

Further, part-of-speech tags (POS) based on STTS 2.0 (Westpfahl 2014; Westpfahl et al. 2017) and lemmas were created fully automatically with the help of TreeTagger (Schmid 1994) and inserted as additional annotation layers lem and pos. Morphological information, such as person, number, tense, and mood, was added automatically with the help of DEMorphy (Altinok 2018) and CISTEM (Weissweiler and Fraser 2018) to all 6,045 verbs. A Python program was used to assemble these layers in a single step (Lange 2021). Subsequently, norm, pos, and gram were corrected manually and extended by layers that list the inflectional suffixes as given in the standard German grammar (flexexp) as well as the actual realisation of these (flexreal), and detailed phonetic information on the realisation (flexmau, based on the WebMAUS segmentation; cf. Figure 1). All annotations were published in Version 3 of BeDiaCo.

The realisation of schwa in verbal inflectional suffixes was assigned to two categorical tags during the manual annotation process: full and no realisation of schwa. The inflectional suffix <-e> was annotated as a full realisation if the acoustic signal showed clear periodic stretches for the schwa. If the following word token began with a vowel, the following criteria were used to set boundaries to the next segments: the movement of the first and second formants, and/or glottalisation or glottal closure. The tag no realisation was used if no schwa was visible in the sonagram or in the audio signal in the form of periodicity and/or a clear change of F1/F2 structure relative to the following segments. The same criteria applied to the schwa of the past tense suffix <-te>; here, an additional tag, partial realisation, was used if only a part of the schwa ending was realised (<-te> [tə] reduced to <-t> [t]). Additionally, the POS tags for haben ‘to have’ (starting with either VV for lexical verbs or VA for auxiliaries) were corrected manually, since the tagger does not distinguish between the use as a lexical verb or auxiliary verb, with auxiliary being assumed in most cases.

The intonation phrase tier was annotated manually. The boundary of an intonation phrase was marked obligatorily if the phrase ended with a breath pause or a boundary tone. Additionally, the phrasal boundary could be signalled by segmental lengthening, laryngealisation, tonal movement, or a silent pause longer than 50 ms (Belz 2021: 81–90, in detail Belz et al. 2023). Finally, the factor stress of the following syllable was manually annotated for all verbs on an additional tier based on the auditory impression.

2.5 Analysis

A time-aligned database with acoustic information was created by using the R package emuR v.2.3.0 (Winkelmann et al. 2021). In Table 2, we describe the factors that potentially influence schwa realisation (cf. Section 1).

Table 2:

Potential independent predictors influencing schwa realisation.

| Predictor | Levels |
|---|---|
| Segmental/suprasegmental: | |
| Following context | Pause / Obstruent / Sonorant / Vowel |
| Paradigmatic compensation | Monosyllabic stem / Disyllabic stem |
| Stress of the following syllable | Stressed / Unstressed / Pause |
| IP position | Final / Non-final |
| Global articulation rate | Numeric |
| Local articulation rate | Numeric |
| Morphosyntactic: | |
| Tense | Present / Preterite |
| Part of speech | Lexical verb / Auxiliary verb |
| Adjacent ich | Yes (preceded or followed by ich ‘I’) / No (without adjacent ich ‘I’) |
| Frequency-related: | |
| Lemma frequency | Numeric |
| Register: | |
| Situation | Diapix / Free conversation |

Following context refers to the context to the right of the potential schwa location and includes pauses, obstruents, sonorants, and vowels. For determining the factor paradigmatic compensation, we extracted all verbal lemmas (= 81) and assigned the corresponding level manually. Stress of the following syllable includes the phonetically realised actual word accent (pause vs. stressed vs. unstressed/cliticised) and was annotated manually by listening to all words following the verbs in question. IP position was already annotated manually in the corpus and refers to a cohesively realised intonation unit with at least one phrase and boundary accent. To obtain the global articulation rate, we divided the number of all syllables per task (Diapix 1, Diapix 2, free dialogue) and speaker by the articulation time (in seconds, excluding silent pauses). The articulation rate was calculated by using the R package sylly 0.1-6 (Michalke 2020) with sylly.de 0.1-2 (Michalke 2017). After all tokens of the pronunciation-based transcription on the dipl tier were collected, syllable numbers were counted automatically, excluding extra- and paralinguistic events. Cliticised words and truncations were counted based on the dipl tier, e.g. finds (finde es ‘find it’) counts as one syllable, as illustrated in Figure 1. Similarly, the local articulation rate was calculated per IP. Tense was annotated semi-automatically (cf. Section 2.4). Components of periphrastic constructions were marked as auxiliary verbs or past participles, but only the tense of the auxiliary verb was annotated (cf. Example [2a]). Part of speech was inserted semi-automatically (cf. Section 2.4). We grouped modal verbs and auxiliary verbs under auxiliaries. Adjacent ich codes whether the verb is preceded or followed by the pronoun ich ‘I’ and will be motivated further in Section 3, Results. Lemma frequency is the absolute frequency of each lemma per situation; values were centred and logarithmised to the natural base. Situation codes whether the interlocutors converse freely or solve Diapix tasks.
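
The two numeric transformations described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names and all input numbers are hypothetical, and the frequencies are assumed to be logged first and centred afterwards (centring raw counts first would produce negative values, for which the logarithm is undefined).

```python
import math


def articulation_rate(n_syllables, total_seconds, pause_seconds):
    """Syllables per second of articulation time (silent pauses excluded)."""
    articulation_time = total_seconds - pause_seconds
    if articulation_time <= 0:
        raise ValueError("articulation time must be positive")
    return n_syllables / articulation_time


def log_centre(frequencies):
    """Take natural logs, then subtract the mean of the logged values."""
    logged = [math.log(f) for f in frequencies]
    mean = sum(logged) / len(logged)
    return [value - mean for value in logged]


# Hypothetical numbers, for illustration only.
rate = articulation_rate(n_syllables=540, total_seconds=120.0, pause_seconds=12.0)
centred = log_centre([1, 10, 100])
print(rate, centred)
```

The same rate function would serve for the global rate (per task and speaker) and for the local rate (per IP); only the span over which syllables and time are summed differs.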

Factors influencing schwa realisation were identified by a binary logistic regression analysis using the R package lme4 (Bates et al. 2015), and p-values of the regression model were obtained by lmerTest (Kuznetsova et al. 2017). Within our model, schwa realisation (as a binary variable with the reference level with schwa vs. without schwa) was included as the response variable; the included independent variables are shown in Table 2. Speakers and lemmas were included as random intercepts to account for idiosyncratic and word differences. Factor contrasts were treatment coded.

We employed an informed top-down approach, starting with a model containing all main predictors without interactions and successively removing factors that did not significantly improve the model in order to reduce complexity. After specifying this preliminary model, meaningful interactions were tested and added to the model if they significantly improved it and at the same time did not cause variance inflation. Significance was tested by log-likelihood tests, whereby a model reduced by one predictor was compared to the previous model through an analysis of variance (ANOVA). Variance-inflation factors were calculated using the R package car (Fox and Weisberg 2019) to account for collinearity between the different predictors in the model. All values are below 2.5, which suggests no collinearity between the predictors (Johnston et al. 2018). The explained conditional variance $R^2_c$ (the variance explained by both fixed and random factors; see Nakagawa and Schielzeth 2013) is calculated using the R package MuMIn (Barton 2018). The effects are visualised by extracting the predicted probabilities from the model using ggeffect() from the ggeffects package (Lüdecke 2018).
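
The model comparison step can be illustrated with a minimal sketch of a likelihood-ratio test between two nested models. The log-likelihood values below are hypothetical, and the function covers only the case where the models differ by a single parameter (one degree of freedom); dropping a factor with more than two levels removes several parameters and would need the general chi-square distribution.

```python
import math


def likelihood_ratio_test_df1(loglik_reduced, loglik_full):
    """Return (statistic, p-value) for nested models differing by one parameter."""
    statistic = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test_df1(loglik_reduced=-402.0, loglik_full=-397.0)
print(stat, p)
```

A predictor would be kept if the p-value falls below the chosen significance threshold, since removing it then makes the fit significantly worse.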

In total, 1,229 of 6,045 verbs were identified as 1st-person singular indicative present or preterite. After subtracting irregular verbs and suppletive forms, the dataset includes 872 inflected verbs potentially ending in schwa. Seven cases had to be excluded from the analysis because the type of schwa realisation was unclear, for example due to noise. We further excluded ten cases that were unclear with respect to stress or following context. We re-categorised 25 clitic forms in stress as unstressed. Eleven instances with pause as following context are similar to the right IP boundary and are therefore excluded from the analysis of the IP-non-final subset. The final dataset to be analysed contains 855 verb forms. The dataset and R code are available at https://osf.io/b4sxw (accessed 7 July 2024).
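
The exclusion steps can be retraced as simple bookkeeping. The sketch below uses only counts given in the text and in Tables 4 and 5; the variable names are hypothetical, and reading the eleven pause cases as non-final instances is an assumption that happens to reproduce the 756 observations reported for Model 2.

```python
potential_schwa_forms = 872      # after removing irregular and suppletive forms
unclear_realisation = 7          # schwa realisation unclear, e.g. due to noise
unclear_stress_or_context = 10   # unclear stress or following context

final_dataset = (
    potential_schwa_forms - unclear_realisation - unclear_stress_or_context
)

ip_final = 8 + 80                # IP-final forms: Diapix + free (Table 4)
non_final_before_pause = 11      # assumed to be non-final forms followed by a pause
non_final_subset = final_dataset - ip_final - non_final_before_pause

print(final_dataset, non_final_subset)
```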

3 Results

Overall, schwa is realised in 23.7 % and not realised in 76.3 % of all 855 instances. These numbers are not equally distributed across registers (cf. Table 3). The free dialogues show a significantly higher number of schwa realisations than the task-based Diapix dialogues (χ² = 15.8, df = 1, p < 0.001).

Table 3:

Schwa realisation for all verbs potentially ending in schwa in the 1st-person singular indicative per register (proportions are given per column).

| | Diapix | Free | Total |
|---|---|---|---|
| Potential word-final schwa | 182 | 673 | 855 |
| Thereof with schwa | 23 (12.6 %) | 180 (26.7 %) | 203 (23.7 %) |
| Thereof without schwa | 159 (87.4 %) | 493 (73.3 %) | 652 (76.3 %) |
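
The register comparison can be retraced from the counts in Table 3 with a Pearson chi-square test on the 2 × 2 table. The sketch below is an illustration, not the authors' code; the text does not state whether a continuity correction was applied, so the uncorrected statistic is assumed. It comes out at about 15.75, which is close to the reported 15.8.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: with schwa, without schwa. Columns: Diapix, free conversation.
chi2 = pearson_chi_square_2x2(23, 180, 159, 493)
print(round(chi2, 2))
```

With one degree of freedom, any statistic above roughly 10.83 corresponds to p < 0.001, so the reported significance level follows from this value.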

Table 4 shows the number of schwa realisations per situation for each categorical predictor. We see that the task-based Diapix dialogues behave differently with respect to tense: only 3 of 182 verb forms in the Diapix dialogues are in the preterite. Other factors also show very few instances in some categories; for example, no auxiliary in the Diapix dialogues is realised with schwa. It seems that present tense, lexical verbs, and verbs next to ich are indicative of the speech used to solve a Diapix task. This is in line with our prior assumption that there will rarely be any preterite cases in this situation. As the three instances in the preterite category of tense do not warrant a statistical analysis, we exclude tense from the models.

Table 4:

Instances of schwa realisations per situation (Diapix, free dialogue) and predictor levels.

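| Predictor | Level | Diapix: with [ə] | Diapix: without [ə] | Free: with [ə] | Free: without [ə] |
|---|---|---|---|---|---|
| IP position | Final | 3 | 5 | 56 | 24 |
| | Non-final | 20 | 154 | 124 | 469 |
| Stress | Stressed | 8 | 30 | 52 | 99 |
| | Unstressed | 13 | 123 | 88 | 373 |
| | Pause | 2 | 6 | 40 | 21 |
| Para. comp. | Disyllabic | 18 | 131 | 157 | 453 |
| | Monosyllabic | 5 | 28 | 23 | 40 |
| Tense | Present | 21 | 158 | 114 | 460 |
| | Preterite | 2 | 1 | 66 | 33 |
| Follow. context | Pause | 1 | 6 | 37 | 21 |
| | Obstruent | 11 | 34 | 70 | 146 |
| | Sonorant | 1 | 15 | 17 | 65 |
| | Vowel | 10 | 104 | 56 | 261 |
| Adjacent ich | No | 3 | 6 | 85 | 45 |
| | Yes | 20 | 153 | 95 | 448 |
| Part-of-speech | Auxiliary | 0 | 10 | 39 | 174 |
| | Lexical verb | 23 | 149 | 141 | 319 |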
  1. Correction added November 11, 2024 after online publication August 15, 2024: the table contained incorrect numbers. These have now been corrected.

In order to test whether the significant difference in schwa realisation per register (higher occurrences in free than in Diapix conversations) persists when including segmental, suprasegmental, morphosyntactic, and frequency factors known to affect schwa realisation, we built two generalised linear mixed-effects models with schwa realisation as the binary response variable. The first model contains all IP instances, while the second model only contains IP-medial instances. This is motivated by the fact that the IP position is confounded with following context, as pauses following a verb form with potential schwa often coincide with IP-final position (54 out of 88). The same applies to stress. The first model (theoretical conditional variance explained: $R^2_c$ = 0.39) includes all verb forms. Paradigmatic compensation, global articulation rate, local articulation rate, and part-of-speech do not improve the model significantly. No meaningful interactions between factors were identified.

Model 1: realisation ∼ IP position + adjacent ich + situation + lemma frequency + (1|lemma) + (1|speaker)
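
Written out, the formula for Model 1 corresponds to a logistic mixed-effects model with two random intercepts. The notation below is a sketch: the symbols (β for fixed effects, u and v for the lemma and speaker intercepts, the indices i, j, k for observation, lemma, and speaker) are assumptions introduced here, and the modelled outcome is taken to be the realisation without schwa because with schwa is the reference level.

```latex
\mathrm{logit}\,\Pr(y_{ijk} = \text{without schwa})
  = \beta_0
  + \beta_1\,\mathrm{IPposition}_{ijk}
  + \beta_2\,\mathrm{AdjacentIch}_{ijk}
  + \beta_3\,\mathrm{Situation}_{ijk}
  + \beta_4\,\mathrm{LemmaFrequency}_{ijk}
  + u_j + v_k,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2), \quad
v_k \sim \mathcal{N}(0, \sigma_v^2)
```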

The second model (theoretical conditional variance explained: $R^2_c$ = 0.43) only includes verb forms which are not in the final IP position. Again, paradigmatic compensation, global articulation rate, local articulation rate, and part-of-speech do not improve the model significantly. Tense is again excluded due to data scarcity. The interaction between stress and following context was included, as it improved the model significantly.

Model 2: realisation ∼ adjacent ich + situation + lemma frequency + stress of the following syllable * following context + (1|lemma) + (1|speaker)

In order to determine how much variance is explained by the factor situation, we compared $R^2_c$ of both models to $R^2_c$ calculated for models without situation. Situation thus explains an additional 2.8 % of the variance for Model 1 and 3 % for Model 2, respectively.

Table 5 summarises the estimates of both models. The two models yield comparable estimates for the predictors they share. Because with schwa is the reference level (coded as 0 in the model), a negative slope indicates a shift towards schwa realisation.

Table 5:

Estimates (with standard errors in parentheses) of generalised LME models. Model 1 contains both final and non-final IP position, Model 2 only non-final IP positions. The reference level is with schwa, i.e. a positive slope indicates less frequent schwa realisations.

| | Model 1 | Model 2 |
|---|---|---|
| (Intercept) | −0.71 (0.46) | 0.48 (0.54) |
| Situation Free | −0.89 (0.29)** | −0.90 (0.32)** |
| IP position Non-final | 1.53 (0.31)*** | |
| Adjacent ich Yes | 1.84 (0.26)*** | 1.83 (0.31)*** |
| Log. lemma frequency | 0.49 (0.14)*** | 0.52 (0.17)** |
| Follow. context Sonorant | | 0.38 (0.71) |
| Follow. context Vowel | | −0.38 (0.41) |
| Stress Unstressed | | −0.11 (0.36) |
| Follow. context Sonorant: Stress Unstressed | | 0.44 (0.83) |
| Follow. context Vowel: Stress Unstressed | | 1.91 (0.51)*** |
| Num. obs. | 855 | 756 |
| Num. groups: lem | 81 | 73 |
| Num. groups: speaker | 16 | 16 |
  1. ***p < 0.001; **p < 0.01.

In Model 1, non-final IP position leads to significantly higher amounts of unrealised schwa, meaning that schwa is significantly more likely to occur at IP boundaries than within intonation phrases (see also Figure 2a). Turning to Model 2, schwa is realised significantly more often in free than in task-based situations (see Figure 2b). If the verb form is preceded or followed by the pronoun ‘I’, schwa is significantly less often realised than if the verb form does not occur together with ‘I’ (Figure 2c). It should be noted that 305 out of 676 cases involve a following ‘I’, as in hab ich ‘have I’. Frequent lemmas lead to significantly fewer realisations of schwa, meaning that infrequent verbs tend to be realised with schwa (Figure 2d). Finally, if the next syllable starts with an unstressed vowel, schwa frequency is reduced significantly (Figure 2e).
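
To make the size of these effects more tangible, the log-odds estimates of Table 5 can be converted into odds ratios. The sketch below copies the Model 2 estimates from the table; the dictionary keys are hypothetical names, and the reference cell for the interaction (stressed syllable after an obstruent) is an assumption inferred from the levels that do not appear as rows in Table 5.

```python
import math

# Fixed-effect estimates (log-odds of "without schwa") from Table 5, Model 2.
model2 = {
    "situation_free": -0.90,
    "adjacent_ich_yes": 1.83,
    "context_vowel": -0.38,
    "stress_unstressed": -0.11,
    "vowel_x_unstressed": 1.91,
}


def odds_ratio(log_odds):
    """Convert a log-odds estimate into an odds ratio."""
    return math.exp(log_odds)


# Odds of a schwa-less form in free conversation relative to Diapix.
or_free = odds_ratio(model2["situation_free"])

# Combined shift for an unstressed vowel relative to the assumed reference
# cell: both main effects plus the interaction term.
vowel_unstressed = (
    model2["context_vowel"]
    + model2["stress_unstressed"]
    + model2["vowel_x_unstressed"]
)

print(round(or_free, 2), round(vowel_unstressed, 2), round(odds_ratio(vowel_unstressed), 2))
```

An odds ratio below 1 for situation means that schwa-less forms are less likely in free conversation, that is, schwa is realised more often there. The positive combined value for unstressed vowels reflects that the interaction term outweighs the two small negative main effects.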

Figure 2: 
Predicted marginal effects proportions of verb forms realised without schwa (95 % confidence intervals) for (a) IP position in Model 1. Plots (b)–(e) show the effects of Model 2 for situation, adjacent ich, lemma frequency (back-transformed), and following context in interaction with stress.

Neither global nor local articulation rates improve the model of schwa realisation as presented above. Although interlocutors tend to speak faster (globally) in the free dialogues (5.5 syllables/second) than in the Diapix tasks (4.9 syllables/second), this is irrelevant to the realisation of schwa. Evidence is given in Figure 3, showing no effect of the local articulation rate per verb form on schwa realisation in both situations. This means that interlocutors do not speak faster within the same intonation phrase in which they realise verb forms without schwa compared to realising verb forms with schwa.

Figure 3: 
Local articulation rate per verb form with schwa (grey) and without schwa (white) in Diapix tasks and free dialogues.

4 Discussion

In summary, our results show that the variant without schwa is the most frequent realisation in verbal inflectional endings for the 1st-person singular (76.3 % of all cases with potential schwa, n = 652). Final schwas are promoted by occurrence before IP boundaries, in the preterite, and in less frequent verbs. Furthermore, verbs that are not preceded or followed by ‘ich’ are also produced more frequently with schwa. Paradigmatic compensation, articulation rate, and verb type (auxiliary or lexical verb) did not affect the realisation of schwa. Concerning our research question, i.e. whether register is also a factor affecting the frequency of inflectional schwa realisations, our results indicate that schwa realisations are significantly more likely in free conversations as opposed to task-based dialogues. In this section, we will first discuss the effect of register and then the other factors, also in relation to register.

In this study, two closely related registers of unscripted spontaneous speech are compared in regard to schwa realisation, task-based versus free conversation. The ‘mode’ and the ‘tenor of discourse’ (Biber and Conrad 2019; Halliday 1978) are kept identical for the two registers. Since both interlocutors are students about the same age, the speakers in this study have a similar socio-economic background and therefore effects of social relationship (see, e.g. Eckert and Labov 2017; Labov 1971) are minimised in our corpus.

The variants investigated here, the verbal inflection for 1st-person singular with and without schwa, are semantically equivalent and – at least to our knowledge – they do not carry any social meaning. Therefore, we can exclude these factors as possible triggers for the observed difference in schwa realisations between free and task-based conversations. These two similar registers differ in their content and potential communicative purpose (‘field of discourse’) (Biber and Conrad 2019; Halliday 1978).

In the introduction, we speculated that due to higher processing costs speakers speak more slowly in the Diapix task compared to the free conversation. Indeed, we found the global and the local articulation rate to be faster in the free dialogues than in the Diapix dialogues, reflecting the higher cognitive demands needed during solving the task of finding differing pictures, and corroborating Watson et al. (2020). As stated by, e.g. Lindblom (1990) and Kohler (1990), fast speech leads to shorter word and segment duration and to more frequent reduction phenomena such as assimilation, segment elisions, and vowel reduction in read speech (cf. Browman and Goldstein 1990; Davidson 2006; Hoole and Mooshammer 2002), as well as in spontaneous speech (cf. Bell et al. 2003; Ernestus 2014; Fosler-Lussier and Morgan 1999; Hanique et al. 2010; Raymond et al. 2006). However, increased time pressure does not automatically lead to more frequent ‘reductions’, see e.g. Van Son and Pols (1990, 1992), Kienast and Sendlmeier (2000), and Koreman (2006). Ernestus (2014) and Ernestus et al. (2015) argue that phonetic reduction phenomena are under the control of the speaker since – as they found – ‘reductions’ can also depend on the socio-economic status of the speaker. Our results give further evidence for this non-mechanistic view: With free conversations exhibiting a faster articulation rate while featuring more realisations with schwa compared to the Diapix task, our results indicate that a faster articulation rate does not necessarily lead to fewer realisations with schwa, contrary to the findings of, for example, Hanique et al. (2010) and Ernestus et al. (2015), as well as the assumptions of Kohler (1990).

Other factors also affected the frequency of schwa realisations, such as IP position, word frequency, stress of the following syllable, the following phonological context, and adjacent ich, but these factors cannot be discussed independently of the investigated registers. For example, as predicted from the literature (Belz et al. 2022; Niebuhr et al. 2013; Piroth and Janker 2004), we also found more frequent schwa realisations in IP final position, indicating phrase-final strengthening (Byrd and Krivokapić 2021; Cho 2011). However, the position of the finite verb within prosodic phrases is also distributed in a register-specific way: In task-based dialogues, only eight instances of the verb forms investigated in this study were in phrase-final position, compared to 80 in free conversations (see Table 4). A similarly uneven distribution was observed for adjacent ich, part-of-speech and tense: Only very few instances of verbs without ich, auxiliary verbs and verbs in preterite occur in the Diapix tasks. Even though these skewed distributions hampered the statistical analyses, the distribution of these factors is strongly influenced by the register, suggesting they act as linguistic features constituting a register (Lüdeling et al. 2022). We find a clear preference for non-final IP position, present tense, lexical verbs, and occurrences with an adjacent ich in the Diapix task: Participants seem to produce similar sentence structures, with the finite (lexical) verb in the present tense in second position within the sentence, and seem to vary little. Since the Diapix task is firmly situated in the present, the usage of mostly present tense and few auxiliaries (with auxiliaries usually being used in the periphrastic past tense; cf. Section 1.1) is perfectly plausible. The time pressure in the Diapix task might lead to less complex and less diverse sentence structures in order to ensure the interlocutor’s immediate understanding. 
Hence, these features follow the specific demands of the communicative situation. In contrast, free conversation exhibits more features that tend to promote realisations with schwa. With no task-specific time pressure and a generally more leisurely conversation flow, speakers are more flexible and vary more in free conversations.

The factor following context is interesting with respect to the pronoun ich ‘I’ frequently following the verb, in which case the verb form tends to be realised without schwa. As the personal pronoun ich in spontaneous speech is mostly unaccented and starts with a vowel (only 10 % of all ich tokens are produced with a glottal stop before the vowel in Wesener 1999), the realisation without schwa could either be due to a hiatus avoidance strategy, or a resyllabification of the last segment of the verb stem. As we find hab ich ‘have I’ very frequently in the data, it may be hypothesised that these words form a collocation or construction. In the most frequent sequences (i.e. in the context of an adjacent ich ‘I’), verb-final schwa is less likely to be realised than in less frequent sequences without ich – indeed, we find very few realisations with schwa in these frequent sequences, supporting Hanique et al. (2010) and Kohler (1990). As suggested by Hall (1999), the verb stem followed by ich forms a single prosodic word, evidenced by the resyllabification in [hab̥ɪç]. Our results indicate that verb-final schwas are more frequently realised in less predictable contexts, suggesting that frequent sequences form strong links that hinder schwa realisations. This effect could be due to either cliticisation or higher predictability in more frequently occurring contexts. Cliticisation, however, would only explain cases with following ich that are realised without schwa (94.8 % of 308) and not cases with preceding ich (76.5 % of 412). It is possible that cliticisation effects and predictability effects interact, with more clitics in more predictable contexts leading to fewer realisations with schwa. Cliticisation might also explain the significant interaction between stress and following vowels with less frequent realisations before an unstressed vowel, most of which are the vowel in ich (292 out of 345, 95.9 % without schwa). 
In contrast, in less frequent sequences (cases without an adjacent ich) speakers seem to be more flexible and vary more in their schwa realisations. This tendency, however, holds true for free conversations only. In the Diapix task, we find very few cases without adjacent ich (n = 9, six cases without schwa), which hardly allows conclusions to be drawn. Generally, one of the shortcomings of our study is that some cases occur very rarely (e.g. three cases of the preterite in the Diapix task), rendering our analysis vulnerable to individual idiosyncrasies.

Further, paradigmatic compensation does not significantly affect the distribution of schwa realisation in our study. Paradigmatic compensation predicts more frequent schwa realisations for stems ending in nasals or obstruents, e.g. hab-, glaub- and find-, compared to stems with final vowels, diphthongs or liquids, e.g. geh-, bau- (Eisenberg 2013; Raffelsiefen 1995). Since schwas are rarely realised in the most frequent verb haben, we can assume that the paradigmatic compensation does not play a role in unscripted spontaneous speech registers. The examples above also give evidence that avoidance of final devoicing is irrelevant for the speakers investigated in our study. Similarly, the effect of rhythmicity, i.e. a preference for alternating stress patterns (see Kentner 2018), does not affect schwa realisation per se. However, inflected verbs have a significantly higher probability of schwa realisation when the following syllable starts with a vowel and is stressed rather than unstressed. Apart from this case, speakers in our study are not sensitive to stress clash and do not strive for a repair by deleting or inserting a schwa. Up to now, evidence for this factor is restricted to written texts and read speech for German (Fleischer et al. 2018; Kentner 2018).

5 Conclusions

In general, word-final schwa is rarely realised in verbal inflectional suffixes, a phenomenon which we did not expect to be so pronounced. This indicates that realisations without schwa are more common in German spontaneous speech than realisations with schwa – implying the need to reconsider standards and norms in spontaneous speech, especially in second language acquisition and automatic speech processing: Word-final schwa in verbal inflectional suffixes in present tense should be taught as at least optional if not omissible. While suprasegmental, morphosyntactic, and frequency factors influence schwa realisation, register, too, proves to be a crucial factor that needs to be considered in modelling. As argued in Lüdeling et al. (2022), even very fine-grained, functional differences in register affect the distribution of register features, which in turn influence the realisation of word-final schwa. Thus, when analysing seemingly free variation of (morpho)phonetic phenomena, i.e. variation that is not motivated by semantic, pragmatic or social-meaning differences, the factor situation, through which at least grammatical factors (e.g. tense) are mediated, must also be taken into account. It is essential to distinguish the influence of conventionalised text genres on schwa realisation from the situation or register, particularly when more regular speech and a constant rhythm are required, such as in poetry. Furthermore, the relationship between articulation rate and ‘reduction’ phenomena also seems to vary with register. It will be a fruitful future endeavour to investigate this relationship for different registers varying with respect to formality, social distance, or addressee.


Corresponding author: Robert Lange, Department of German Studies and Linguistics, Humboldt-Universität zu Berlin, Berlin, Germany, E-mail:

Acknowledgements

The present paper originated in the context of the CRC 1412 Register. We would like to express our gratitude to Felix Golcher and Daniela Palleschi for their support and guidance in the statistical analysis, and to Miriam Müller and Torben Schilling for their help in annotating the data. We also thank the editor and two anonymous reviewers for their substantial comments, as well as Qiang Xia and Tom Offrede for their constructive feedback.

  1. Research funding: Funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – SFB 1412, 416591334.

References

Altinok, Duygu. 2018. DEMorphy, German language morphological analyzer. arXiv:1803.00902 [cs]. http://arxiv.org/abs/1803.00902 (accessed 7 July 2024).Search in Google Scholar

Baker, Rachel & Valerie Hazan. 2011. DiapixUK: Task materials for the elicitation of multiple spontaneous speech dialogs. Behavior Research Methods 43. 761–770. https://doi.org/10.3758/s13428-011-0075-y.Search in Google Scholar

Barton, Kamil. 2018. MuMIn: Multi-model inference. https://CRAN.Rproject.org/package=MuMIn (accessed 7 July 2024).Search in Google Scholar

Bates, Douglas, Martin Mächler, Ben Bolker & Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1). 1–48. https://doi.org/10.18637/jss.v067.i01.Search in Google Scholar

Bell, Allan. 1984. Language style as audience design. Language in Society 13(2). 145–204. https://doi.org/10.1017/s004740450001037x.Search in Google Scholar

Bell, Alan, Daniel Jurafsky, Eric Fosler-Lussier, Cynthia Girand, Michelle Gregory & Daniel Gildea. 2003. Effects of disfluencies, predictability, and utterance position on word form variation in English conversation. Journal of the Acoustical Society of America 113(2). 1001–1024. https://doi.org/10.1121/1.1534836.Search in Google Scholar

Bell, Alan, Jason M. Brenier, Michelle Gregory, Cynthia Girand & Dan Jurafsky. 2009. Predictability effects on durations of content and function words in conversational English. Journal of Memory and Language 60(1). 92–111. https://doi.org/10.1016/j.jml.2008.06.003.Search in Google Scholar

Belz, Malte. 2021. Die Phonetik von äh und ähm: Akustische Variation von Füllpartikeln im Deutschen. Berlin: Springer.10.1007/978-3-662-62812-6Search in Google Scholar

Belz, Malte & Christine Mooshammer. 2023. Berlin Dialogue Corpus (BeDiaCo): Version 3. Medien-Repositorium, Humboldt-Universität zu Berlin. https://rs.cms.hu-berlin.de/phon (accessed 7 July 2024).Search in Google Scholar

Belz, Malte, Oksana Rasskazova, Jelena Krivokapić & Christine Mooshammer. 2022. Interaction between phrasal structure and vowel tenseness in German: An acoustic and articulatory study. Language and Speech 66(1). 3–34. https://doi.org/10.1177/00238309211064857.Search in Google Scholar

Belz, Malte, Alina Zöllner, Lea-Sophie Adam & Christine Mooshammer. 2021. Berlin Dialogue Corpus (BeDiaCo): Version 2. Medien-Repositorium, Humboldt-Universität zu Berlin. https://rs.cms.hu-berlin.de/phon (accessed 7 July 2024).Search in Google Scholar

Belz, Malte, Alina Zöllner, Megumi Terada, Robert Lange, Lea-Sophie Adam & Bianca Sell. 2023. Dokumentation und Annotationsrichtlinien für das Korpus BeDiaCo v3. Geneva: Zenodo. https://doi.org/10.5281/zenodo.8142681.Search in Google Scholar

Biber, Douglas & Susan Conrad. 2019. Register, genre, and style, 2nd edn. Cambridge: Cambridge University Press.10.1017/9781108686136Search in Google Scholar

Brand, Sophie & Mirjam Ernestus. 2018. Listeners’ processing of a given reduced word pronunciation variant directly reflects their exposure to this variant: Evidence from native listeners and learners of French. Quarterly Journal of Experimental Psychology 71(5). 1240–1259. https://doi.org/10.1080/17470218.2017.1313282.Search in Google Scholar

Browman, Catherine P. & Louis Goldstein. 1990. Tiers in articulatory phonology, with some implications for casual speech. In John Kingston & Mary Beckman (eds.), Papers in laboratory phonology I: Between the grammar and physics of speech, 341–376. Cambridge: Cambridge University Press.10.1017/CBO9780511627736.019Search in Google Scholar

Bürki, Audrey. 2018. Variation in the speech signal as a window into the cognitive architecture of language production. Psychonomic Bulletin & Review 25. 1973–2004. https://doi.org/10.3758/s13423-017-1423-4.Search in Google Scholar

Bürki, Audrey, Mirjam Ernestus, Cédric Gendrot, Cécile Fougeron & Ulrich Hans Frauenfelder. 2011. What affects the presence versus absence of schwa and its duration: A corpus analysis of French connected speech. Journal of the Acoustical Society of America 130(6). 3980–3991. https://doi.org/10.1121/1.3658386.Search in Google Scholar

Bybee, Joan. 2001. Phonology and language use. Cambridge: Cambridge University Press.10.1017/CBO9780511612886Search in Google Scholar

Byrd, Dani & Jelena Krivokapić. 2021. Cracking prosody in articulatory phonology. Annual Review of Linguistics 7(1). 31–53. https://doi.org/10.1146/annurev-linguistics-030920-050033.Search in Google Scholar

Cho, Taehong. 2011. Laboratory phonology. In Nancy C. Kula, Engbert D. Botma & Kuniya Nasukawa (eds.), The continuum companion to phonology, 343–368. London: Continuum International.Search in Google Scholar

Croft, William. 2010. The origins of grammaticalization in the verbalization of experience. Linguistics 48(1). 1–48. https://doi.org/10.1515/ling.2010.001.Search in Google Scholar

Cuba, Johannes von. 1487. Gart der Gesundheit [Garden of health]. Ulm.Search in Google Scholar

Davidson, Lisa. 2006. Schwa elision in fast speech: Segmental deletion or gestural overlap? Phonetica 63(2–3). 79–112. https://doi.org/10.1159/000095304.

Dell, Gary S. 1986. A spreading-activation theory of retrieval in sentence production. Psychological Review 93(3). 283–321. https://doi.org/10.1037//0033-295x.93.3.283.

Duden. 2006. Die Grammatik, 7th edn. (Der Duden in zwölf Bänden 4). Mannheim: Bibliographisches Institut & F. A. Brockhaus AG.

Duden. 2022. Die Grammatik: Struktur und Verwendung der deutschen Sprache. Sätze – Wortgruppen – Wörter, 10th edn. (Der Duden in zwölf Bänden 4). Berlin: Dudenverlag.

Eckert, Penelope & William Labov. 2017. Phonetics, phonology and social meaning. Journal of Sociolinguistics 21(4). 467–496. https://doi.org/10.1111/josl.12244.

Eisenberg, Peter. 1991. Syllabische Struktur und Wortakzent: Prinzipien der Prosodik deutscher Wörter. Zeitschrift für Sprachwissenschaft 10(1). 37–64. https://doi.org/10.1515/zfsw.1991.10.1.37.

Eisenberg, Peter. 2013. Grundriss der deutschen Grammatik: Das Wort, 4th edn, vol. 1. Stuttgart: Metzler. https://doi.org/10.1007/978-3-476-00743-8_1.

Ernestus, Mirjam. 2014. Acoustic reduction and the roles of abstractions and exemplars in speech processing. Lingua 142. 27–41. https://doi.org/10.1016/j.lingua.2012.12.006.

Ernestus, Mirjam & Rachel Smith. 2018. Qualitative and quantitative aspects of phonetic variation in Dutch eigenlijk. In Francesco Cangemi, Meghan Clayards, Oliver Niebuhr, Barbara Schuppler & Margaret Zellers (eds.), Rethinking reduction: Interdisciplinary perspectives on conditions, mechanisms, and domains for phonetic variation (Phonology and Phonetics 25), 129–163. Berlin: De Gruyter. https://doi.org/10.1515/9783110524178-005.

Ernestus, Mirjam, Iris Hanique & Erik Verboom. 2015. The effect of speech situation on the occurrence of reduced word pronunciation variants. Journal of Phonetics 48. 60–75. https://doi.org/10.1016/j.wocn.2014.08.001.

Fleischer, Jürg, Michael Cysouw, Augustin Speyer & Richard Wiese. 2018. Variation and its determinants: A corpus-based study of German schwa in the letters of Goethe. Zeitschrift für Sprachwissenschaft 37(1). 55–81. https://doi.org/10.1515/zfs-2018-0002.

Fosler-Lussier, Eric & Nelson Morgan. 1999. Effects of speaking rate and word frequency on pronunciations in conver[sa]tional speech. Speech Communication 29(2–4). 137–158. https://doi.org/10.1016/s0167-6393(99)00035-7.

Fox, John & Sanford Weisberg. 2019. An R companion to applied regression, 3rd edn. Thousand Oaks: Sage. https://socialsciences.mcmaster.ca/jfox/Books/Companion/ (accessed 7 July 2024).

Grice, Martine, Stefan Baumann & Ralf Benzmüller. 2005. German intonation in autosegmental-metrical phonology. In Sun-Ah Jun (ed.), Prosodic typology, 55–83. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199249633.003.0003.

Hall, T. Alan. 1999. Phonotactics and the prosodic structure of German function words. Amsterdam Studies in the Theory and History of Linguistic Science Series 4. 99–132. https://doi.org/10.1075/cilt.174.06hal.

Halliday, Michael A. K. 1978. Language as social semiotic: The social interpretation of language and meaning. London: Edward Arnold.

Hanique, Iris, Barbara Schuppler & Mirjam Ernestus. 2010. Morphological and predictability effects on schwa reduction: The case of Dutch word-initial syllables. Annual Conference of the International Speech Communication Association 11, 933–936. https://doi.org/10.21437/Interspeech.2010-315.

Hoole, Philip & Christine Mooshammer. 2002. Articulatory analysis of the German vowel system. In Peter Auer, Peter Gilles & Helmut Spiekermann (eds.), Silbenschnitt und Tonakzente, 129–152. Tübingen: Max Niemeyer. https://doi.org/10.1515/9783110916447.129.

Hume, Elizabeth. 2004. Deconstructing markedness: A predictability-based approach. Annual Meeting of the Berkeley Linguistics Society 30 [General session and parasession on conceptual structure and cognition in grammatical theory], 182–198. https://doi.org/10.3765/bls.v30i1.948.

Johnson, Keith. 2004. Massive reduction in conversational American English. In Kiyoko Yoneyama & Kikuo Maekawa (eds.), Spontaneous speech: Data and analysis. Proceedings of the 1st session of the 10th International Symposium, 29–54. Tokyo: National International Institute for Japanese Language.

Johnston, Ron, Kelvyn Jones & David Manley. 2018. Confounding and collinearity in regression analysis: A cautionary tale and an alternative procedure, illustrated by studies of British voting behaviour. Quality and Quantity 52(4). 1957–1976. https://doi.org/10.1007/s11135-017-0584-6.

Jurafsky, Dan, Allan Bell, Michelle Gregory & William Raymond. 2001. The effect of language model probability on pronunciation reduction. IEEE International Conference on Acoustics, Speech, and Signal Processing 2. 801–804. https://doi.org/10.1109/ICASSP.2001.941036.

Kentner, Gerrit. 2018. Schwa optionality and the prosodic shape of words and phrases. In Christiane Ulbrich, Alexander Werth & Richard Wiese (eds.), Empirical approaches to the phonological structure of words (Linguistische Arbeiten 567), 121–151. Berlin: De Gruyter. https://doi.org/10.1515/9783110542899-006.

Kienast, Miriam & Walter F. Sendlmeier. 2000. Acoustical analysis of spectral and temporal changes in emotional speech. Paper presented at ISCA Tutorial and Research Workshop on Speech and Emotion, Newcastle, 5–7 September.

Kohler, Klaus J. 1990. Segmental reduction in connected speech in German: Phonological facts and phonetic explanations. In William J. Hardcastle & Alain Marchal (eds.), Speech production and speech modelling, 69–92. Dordrecht: Springer. https://doi.org/10.1007/978-94-009-2037-8_4.

Kohler, Klaus J. & Jonathan E. J. Rodgers. 2001. Schwa deletion in German read and spontaneous speech. Arbeitsberichte des Instituts für Phonetik und digitale Sprachverarbeitung der Universität Kiel 35. 97–123.

Koreman, Jacques. 2006. Perceived speech rate: The effects of articulation rate and speaking style in spontaneous speech. Journal of the Acoustical Society of America 119(1). 582–596. https://doi.org/10.1121/1.2133436.

Kuznetsova, Alexandra, Per B. Brockhoff & Rune H. B. Christensen. 2017. lmerTest package: Tests in linear mixed effects models. Journal of Statistical Software 82(13). 1–26. https://doi.org/10.18637/jss.v082.i13.

Labov, William. 1963. Social motivations of a sound change. Word 19. 273–309. https://doi.org/10.1080/00437956.1963.11659799.

Labov, William. 1971. The study of language in its social context. In Joshua Fishman (ed.), Advances in the sociology of language, vol. 1, 152–216. The Hague: De Gruyter Mouton.

Labov, William. 2006. The social stratification of English in New York City, 2nd edn. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511618208.

Lange, Robert. 2021. TierTagger: Early version. https://scm.cms.hu-berlin.de/langerob/tiertagger-early-version (accessed 8 July 2024).

Lindblom, Björn. 1963. Spectrographic study of vowel reduction. Journal of the Acoustical Society of America 35(5). 783. https://doi.org/10.1121/1.2142410.

Lindblom, Björn. 1990. Explaining phonetic variation: A sketch of the H&H theory. In William J. Hardcastle & Alain Marchal (eds.), Speech production and speech modelling, 403–439. Dordrecht: Springer. https://doi.org/10.1007/978-94-009-2037-8_16.

Lüdecke, Daniel. 2018. Ggeffects: Tidy data frames of marginal effects from regression models. Journal of Open Source Software 3(26). 772. https://doi.org/10.21105/joss.00772.

Lüdeling, Anke, Artemis Alexiadou, Aria Adli, Karin Donhauser, Malte Dreyer, Markus Egg, Anna Helene Feulner, Natalia Gagarina, Wolfgang Hock, Stefanie Jannedy, Frank Kammerzell, Pia Knoeferle, Thomas Krause, Manfred Krifka, Silvia Kutscher, Beate Lütke, Thomas McFadden, Roland Meyer, Christine Mooshammer, Stefan Müller, Katja Maquate, Muriel Norde, Uli Sauerland, Stephanie Solt, Luka Szucsich, Elisabeth Verhoeven, Richard Waltereit, Anne Wolfsgruber & Lars Erik Zeige. 2022. Register: Language users’ knowledge of situational-functional variation: Frame text of the first phase proposal for the CRC 1412. Register Aspects of Language in Situation 1. 1–58. https://doi.org/10.18452/24901.

Michalke, Meik. 2017. sylly.de: Language support for ‘sylly’ package: German, Version 0.1-2. https://github.com/unDocUMeantIt/sylly (accessed 8 July 2024). https://doi.org/10.32614/CRAN.package.sylly.en.

Michalke, Meik. 2020. sylly: Hyphenation and syllable counting for text analysis, Version 0.1-6. https://github.com/unDocUMeantIt/sylly (accessed 8 July 2024).

Nakagawa, Shinichi & Holger Schielzeth. 2013. A general and simple method for obtaining R² from generalized linear mixed-effects models. Methods in Ecology and Evolution 4(2). 133–142. https://doi.org/10.1111/j.2041-210x.2012.00261.x.

Niebuhr, Oliver, Karin Görs & Evelin Graupe. 2013. Speech reduction, intensity, and F0 shape are cues to turn-taking. Annual Meeting of the Special Interest Group on Discourse and Dialogue 14, 261–269.

Ohala, John J. 1989. Sound change is drawn from a pool of synchronic variation. In Leiv Egil Breivik & Ernst Håkon Jahr (eds.), Language change: Contributions to the study of its causes, 173–198. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110853063.173.

Paschen, Ludger, Susanne Fuchs & Frank Seifart. 2022. Final lengthening and vowel length in 25 languages. Journal of Phonetics 94. 101179. https://doi.org/10.1016/j.wocn.2022.101179.

Pinnow, Eleni, Cynthia M. Connine & Larissa J. Ranbom. 2017. Processing pronunciation variants: The role of probabilistic knowledge about lexical form and segmental co-occurrence. Journal of Cognitive Psychology 29(4). 393–403. https://doi.org/10.1080/20445911.2017.1279619.

Piroth, Hans G. & Peter M. Janker. 2004. Speaker-dependent differences in voicing and devoicing of German obstruents. Journal of Phonetics 32(1). 81–109. https://doi.org/10.1016/s0095-4470(03)00008-1.

Pluymaekers, Mark, Mirjam Ernestus & R. Harald Baayen. 2005. Lexical frequency and acoustic reduction in spoken Dutch. Journal of the Acoustical Society of America 118(4). 2561–2569. https://doi.org/10.1121/1.2011150.

Pluymaekers, Mark, Mirjam Ernestus & R. Harald Baayen. 2006. Effects of word frequency on the acoustic durations of affixes. Annual Conference of the International Speech Communication Association 7, 953–956. https://doi.org/10.21437/Interspeech.2006-307.

Raffelsiefen, Renate. 1995. Conditions for stability: The case of schwa in German. Arbeiten des SFB 282, Theorie des Lexikons, 69.

Raymond, William D., Robin Dautricourt & Elizabeth Hume. 2006. Word-internal /t,d/ deletion in spontaneous speech: Modeling the effects of extra-linguistic, lexical, and phonological factors. Language Variation and Change 18(1). 55–97. https://doi.org/10.1017/s0954394506060042.

Schiel, Florian. 1999. Automatic phonetic transcription of non-prompted speech. International Congress of Phonetic Sciences 14. 607–610.

Schmid, Helmut. 1994. Probabilistic part-of-speech tagging using decision trees. Paper presented at the Conference on New Methods in Language Processing, Manchester, 6–8 July.

Selkirk, Elisabeth. 1984. Phonology and syntax: The relation between sound and structure. Cambridge: MIT Press.

Siebenhaar, Beat. 2020. Informalitätsmarkierung in der WhatsApp-Kommunikation. In Jannis Androutsopoulos & Florian Busch (eds.), Variation, Interaktion und Reflexion in der digitalen Schriftlichkeit, 67–92. Berlin: De Gruyter. https://doi.org/10.1515/9783110673241-004.

Storrer, Angelika. 2018. Interaktionsorientiertes Schreiben im Internet. In Arnulf Deppermann & Silke Reineke (eds.), Sprache im kommunikativen, interaktiven und kulturellen Kontext, 219–244. Berlin: De Gruyter. https://doi.org/10.1515/9783110538601-010.

Szmrecsanyi, Benedikt. 2019. Register in variationist linguistics. Register Studies 1(1). 76–99. https://doi.org/10.1075/rs.18006.szm.

Szulc, Aleksander. 2014. Historische Phonologie des Deutschen. Tübingen: Max Niemeyer.

Van Engen, Kristin, Melissa Baese-Berk, Rachel E. Baker, Arim Choi, Midam Kim & Ann R. Bradlow. 2010. The wildcat corpus of native- and foreign-accented English: Communicative efficiency across conversational dyads with varying language alignment profiles. Language and Speech 53(4). 510–540. https://doi.org/10.1177/0023830910372495.

Van Son, Rob J. J. H. & Louis C. W. Pols. 1990. Formant frequencies of Dutch vowels in a text, read at normal and fast rate. Journal of the Acoustical Society of America 88(4). 1683–1693. https://doi.org/10.1121/1.400243.

Van Son, Rob J. J. H. & Louis C. W. Pols. 1992. Formant movements of Dutch vowels in a text, read at normal and fast rate. Journal of the Acoustical Society of America 92(1). 121–127. https://doi.org/10.1121/1.404277.

Watson, Sam, Anna J. Sørensen & Ewen MacDonald. 2020. The effect of conversational task on turn taking in dialogue. International Symposium on Auditory and Audiological Research 7, 61–68. https://proceedings.isaar.eu/index.php/isaarproc/article/view/2019-08 (accessed 1 August 2024).

Weissweiler, Leonie & Alexander Fraser. 2018. Developing a stemmer for German based on a comparative analysis of publicly available stemmers. In Georg Rehm & Thierry Declerck (eds.), Language technologies for the challenges of the digital age, vol. 10713, 81–94. Cham: Springer. https://doi.org/10.1007/978-3-319-73706-5_8.

Wesener, Thomas. 1999. The phonetics of function words in German spontaneous speech. In Klaus J. Kohler (ed.), Phrase-level phonetics and phonology, 327–377. Kiel: Universität Kiel.

Westpfahl, Swantje. 2014. STTS 2.0? Improving the tagset for the part-of-speech-tagging of German spoken data. In Lori Levin & Manfred Stede (eds.), Proceedings of LAW VIII: The 8th Linguistic Annotation Workshop, 1–10. Dublin: Association for Computational Linguistics & Dublin City University. https://doi.org/10.3115/v1/W14-4901.

Westpfahl, Swantje, Thomas Schmidt, Jasmin Jonietz & Anton Borlinghaus. 2017. STTS 2.0: Guidelines für die Annotation von POS-Tags für Transkripte gesprochener Sprache in Anlehnung an das Stuttgart Tübingen Tagset (STTS). Mannheim: Institut für Deutsche Sprache working paper. http://nbn-resolving.de/urn:nbn:de:bsz:mh39-60634.

Wieling, Martijn. 2012. A quantitative approach to social and geographical dialect variation. Groningen: Rijksuniversiteit Groningen dissertation.

Winkelmann, Raphael, Klaus Jaensch, Steve Cassidy & Jonathan Harrington. 2021. emuR: Main package of the EMU speech database management system. R package Version 2.3.0.

Zimmerer, Frank, Mathias Scharinger & Henning Reetz. 2011. When BEAT becomes HOUSE: Factors of word final /t/-deletion in German. Speech Communication 53(6). 941–954. https://doi.org/10.1016/j.specom.2011.03.006.

Zimmerer, Frank, Mathias Scharinger & Henning Reetz. 2014. Phonological and morphological constraints on German /t/-deletions. Journal of Phonetics 45. 64–75. https://doi.org/10.1016/j.wocn.2014.03.006.

Received: 2022-11-10
Accepted: 2024-05-24
Published Online: 2024-08-15
Published in Print: 2024-11-26

© 2024 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 24.4.2026 from https://www.degruyterbrill.com/document/doi/10.1515/zfs-2024-2011/html?lang=en