Article Open Access

Consequences of prosodic variation for spatiotemporal organization in Spanish stop-lateral clusters

  • Stavroula Sotiropoulou and Adamantios Gafos
Published/Copyright: September 6, 2024

Abstract

Using articulatory data from five Spanish speakers, we study how stop-lateral-vowel sequences respond to perturbations of phonetic parameters in the segments that compose them. Target words with stop-lateral complex onsets were embedded in different prosodic contexts. Regardless of prosodic context, stability-based indices for the presumed global organization of the elicited stop-lateral complex onsets do not show the expected patterns. Nevertheless, evidence for global organization does emerge in the presence of compensatory relations among phonetic parameters which vary as a result of prosodic modulations. Thus, as stop duration and the lag between the two consonants in a cluster increase, the vowel begins earlier in relation to the preceding lateral. Similarly, as the lag in the stop-lateral transition increases, lateral duration decreases. That is, local changes to some part of the sequence produce (language-particular) compensatory effects propagating to other parts of the sequence, in attestation of the global organization presiding over the entire stop-lateral-vowel sequence. We relate our results to independently observed properties usually associated with the idea that Spanish is a syllable-timed language and draw implications for the link between qualitative phonological organization and continuous phonetics.

1 Introduction

This paper examines effects of syllabic organization on the articulation of Spanish word-initial stop-lateral-vowel sequences. Articulatory movement data from stimuli with the stop-lateral clusters /bl, gl, pl, kl/, in target words like /plato/, /blata/ and so on, and their respective singleton consonant words starting with a lateral, as in /lato/, were elicited from five speakers of Central Peninsular Spanish. Stop-lateral clusters are prototypical syllable onsets in Spanish and other languages. As such, if syllabic organization is lawfully related to articulation, they are expected to show effects of being affiliated, as a whole, with their tautosyllabic vowel. What exactly this means has been an issue of considerable attention. In a landmark study, Browman and Goldstein (1988) first proposed that syllabic organization can be assessed with articulatory data by measures of temporal stability over intervals coextensive with the respective phonetic strings whose organization is at issue. Two patterns of interval stability have emerged, each thought to be characteristic of a particular syllabic organization. In languages like English that admit more than one consonant as part of a syllable onset, also known as complex onsets, it is often reported that the most stable interval across CVC, CCVC and CCCVC utterances (where C is any consonant and V is any vowel) is an interval defined by the center of the entire prevocalic consonantal string, also referred to as the c-center, and the end of the hypothesized syllable (Browman and Goldstein 1988; Byrd 1995; Honorof and Browman 1995; Marin and Pouplier 2010; Shaw and Gafos 2015). The c-center is computed by taking the mean of the midpoints of every prevocalic consonant. We thus refer to the interval spanning from the c-center to the end of the vowel as the global timing interval (following Gafos et al. 2020). In contrast, it has been shown that in languages that do not admit complex onsets such as Arabic (Shaw et al. 2009), the most stable interval across CVC, CCVC and CCCVC utterances is defined by the immediately prevocalic consonant and the end of the hypothesized syllable (see also Goldstein et al. 2007; Hermes et al. 2015 on Berber). Because this latter interval is defined by making reference only to the consonant immediately (that is, locally) to the left of the vowel, we refer to that interval as the local timing interval. Thus, the relative stabilities of these intervals seem to change depending on the language-particular syllabic structure. These patterns of interval stabilities, global timing versus local timing stability, have thus been considered to be the phonetic correlates of different syllabic structures. We henceforth refer to these phonetic, quantitative correlates as the stability-based heuristics for syllabic structure.
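
The two interval definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Gesture` type, the landmark times, the use of the plateau midpoint of the immediately prevocalic consonant as the left edge of the local timing interval, and the relative standard deviation as a stability index are all hypothetical choices made for the example, not definitions or data from the studies cited.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Gesture:
    """Hypothetical container for one consonant's plateau landmarks (ms)."""
    target: float   # achievement of target
    release: float  # constriction release

    @property
    def midpoint(self) -> float:
        return (self.target + self.release) / 2.0


def c_center(onset):
    """Mean of the midpoints of every prevocalic consonant."""
    return mean([g.midpoint for g in onset])


def global_interval(onset, anchor):
    """Interval from the c-center to a right-edge anchor (e.g. end of vowel)."""
    return anchor - c_center(onset)


def local_interval(onset, anchor):
    """Interval from the immediately prevocalic consonant to the anchor.
    Assumption: the consonant's plateau midpoint is used as the left edge."""
    return anchor - onset[-1].midpoint


def relative_sd(values):
    """Relative standard deviation in percent; lower means more stable."""
    return 100.0 * pstdev(values) / mean(values)


# Made-up landmark times for one CV and one CCV token.
cv_onset, cv_anchor = [Gesture(100.0, 140.0)], 400.0
ccv_onset, ccv_anchor = [Gesture(40.0, 90.0), Gesture(120.0, 160.0)], 382.5

global_rsd = relative_sd([global_interval(cv_onset, cv_anchor),
                          global_interval(ccv_onset, ccv_anchor)])
local_rsd = relative_sd([local_interval(cv_onset, cv_anchor),
                         local_interval(ccv_onset, ccv_anchor)])
print(f"global RSD = {global_rsd:.2f} %, local RSD = {local_rsd:.2f} %")
```

In this toy pair the global timing interval is identical across CV and CCV while the local timing interval shrinks, which is the pattern the heuristic associates with complex onsets. The opposite pattern would be read as evidence for simplex onsets.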

These stability-based heuristics have been used in follow-up studies to assess syllabic structure in different languages and segmental contexts (Brunner et al. 2014; Hermes et al. 2013, 2017; Marin 2013; Pouplier 2012; Pouplier and Benus 2011; Shaw et al. 2009, 2011 using articulatory data; Shaw and Gafos 2015 and, more recently, Durvasula et al. 2021 using acoustic data). However, in several languages, hypothesized complex onsets have not provided consistent evidence for the expected stability of the global timing interval (Pouplier and Benus 2011 on Slovak; Brunner et al. 2014; Pouplier 2012 on German; Marin 2013 on Romanian; Hermes et al. 2017 on Polish). Similarly, the expected stability of the local timing interval in Moroccan Arabic has not been consistently observed (Shaw et al. 2009, 2011). Therefore, a number of studies on the relation between syllabic organization and spatiotemporal coordination patterns have shown that the stability-based heuristics related to syllabic organization often break down. These results largely challenge the idea that syllabic organization has consistent phonetic manifestations in the articulatory record. In sum, what appeared at first as a promising hypothesis has in more recent work been questioned by several sub-results which uncover non-uniform patterns depending on the clusters examined and the phonetic properties of the segments in those clusters. Granting that cluster-specific phonetics affects the temporal coordination patterns between segments, what role then, if any, does syllabic structure play in articulation?

Recently, on the basis of data from German, one of the languages where stability-based heuristics have been unsuccessful in diagnosing syllabic organization (Brunner et al. 2014; Pouplier 2012), Sotiropoulou and Gafos (2022) have argued that other articulatory correlates of global organization emerge when the segmental strings over which syllabic organization is assessed are placed in varying prosodic contexts. Adapting an experimental design from Byrd and Choi (2010) on English, Sotiropoulou and Gafos (2022) studied how segmental sequences respond to perturbations of phonetic parameters in the segments that compose them. To induce these perturbations, the same stop-lateral-vowel sequence was produced phrase medially, preceded by a word boundary, but also utterance initially. Following insights from human movement science (Gracco and Abbs 1988: 523; Schöner 2002: 48), Sotiropoulou and Gafos (2022) reasoned that if a stop-lateral-vowel sequence is organized as a group (a syllable), when one segmental component of that group is somehow perturbed (via spontaneous fluctuations in its execution or application of some delay), other component movements should reconfigure spatiotemporally so as to maintain their global organization. Using this diagnostic, Sotiropoulou and Gafos (2022) showed that when CCV sequences such as /kla/ are placed in an utterance initial position, the first consonant and the interplateau interval (lag between /k/ release and /l/ target) both expand (by lengthening of their durations), extending the results on English in Byrd and Choi (2010). When this happens, the lag between the lateral and the vowel shortens. That is, as the initial CC part of the CCV sequence expands, due to prosodic strengthening by its placement next to the utterance boundary, the inner CV subsequence compresses to compensate for the expansion of the initial part of the globally organized whole. 
Crucially, this effect was not seen when these sequences were placed in phrase medial position where a (weaker) word boundary (compared to an utterance boundary) precedes the target word. This explains why prior work on German, which did not include a prosodic boundary manipulation, had not succeeded in finding evidence for global organization (Brunner et al. 2014). In following this approach from Sotiropoulou and Gafos (2022), the present study on Spanish employs prosodic variation in order to expand our understanding, both empirically and conceptually, of the relation between syllabic organization and the phonetic spatiotemporal manifestation of that organization.

This paper is organized as follows. We begin in Section 2 by outlining the method and stimuli used for the Spanish articulatory data. A detailed analysis of the data, documenting how phonetic parameters change under prosodic strengthening and the ensuing spatiotemporal coordination patterns in stop-lateral sequences follows in Section 3. Section 4 summarizes the results and argues that effects of global organization in Spanish stop-laterals are expressed in relational properties of phonetic parameters rather than in global timing stability. Additionally, Section 4 relates these effects of global organization to independently observed properties of Spanish that make reference to the distinction between syllable-timed versus stress-timed languages (see Aldrich and Simonet 2019; Cuenca 1996; Dauer 1983). We conclude in Section 5 with implications of our results for the issue of the relation between syllable structure and phonetic indices. The main lesson learned is that the phonetic realization of global organization is not invariably manifested in terms of some privileged index (as in, for instance, global timing interval stability) common across clusters and languages, but rather in terms of compensatory relations among phonetic parameters; these relations become evident when the sequences whose global organization is at issue are perturbed as a result of prosodic modulations.

2 Methods

2.1 Subjects

Articulatory data using the Carstens AG501 Articulograph were collected from five native Spanish subjects (vp01, vp02, vp03, vp04, vp05) between 20 and 45 years old.[1] All subjects reported no speech or hearing problems. They provided written informed consent prior to the investigation and they were reimbursed for their participation. The experiment took place at the Speech Lab of the University of Potsdam. All experimental procedures were approved by the Ethics Committee of the authors’ University (application number 62/2016).

2.2 Speech material

The corpus consists of real disyllabic words in Spanish starting with consonant clusters (CC) or single consonants (C) with stress on the initial syllable. CC-initial words began with stop-lateral clusters. Their paired single consonant-initial words began with a lateral, such that in a CV∼CCV pair the prevocalic consonant remained the same across CV∼CCV words (e.g., /lato/∼/plato/). The word-initial stop-lateral clusters consist of /bl, gl, pl, kl/, where the initial stop is either voiced or voiceless. The vowel following the cluster or the single consonant of interest is the low vowel /a/ or a mid vowel /e, o/. Where possible, the postvocalic consonant was kept the same across the two words within a pair. Where this was not possible, either the manner of articulation or the voicing of the postvocalic consonant was kept the same within a CV∼CCV pair.

Each stimulus word was recorded in two prosodic conditions differing in the strength of the boundary preceding the stimulus word (word boundary versus utterance boundary). Boundary strength increases from the word boundary to the utterance boundary condition. The word boundary condition was elicited by embedding the stimulus word in the carrier phrase Aparece ____ por allí (‘The word ___ appears over there’) with the stimulus word location indicated by the “____”. The utterance boundary condition was elicited by embedding the stimulus word in the carrier phrase Primero vi a Ana. ____ era su respuesta (‘First I saw Anna. ___ was her response’). During the experimental session, these phrases were presented in a randomized order; there were ten or eleven blocks per session so that each session lasted approximately 2 h. Thus, each subject produced ten or eleven repetitions of each item (N = 19) in two prosodic conditions, yielding a total of approximately 380 tokens per subject. Table 1 presents the list of complex onsets (CCV) and the respective singleton (CV) words.

Table 1: Spanish stimuli.

| Cluster | Low vowel (/a/): CCV | Low vowel (/a/): CV | Mid vowel (/e/, /o/): CCV | Mid vowel (/e/, /o/): CV |
|---|---|---|---|---|
| pl | /plato/ ‘plate’ | /lato/ ‘to whip’ (latir) | /plena/ ‘full’ | /lena/ (proper name) |
| | | | /plomo/ ‘lead’ | /lomo/ ‘loin’ |
| bl | /blata/ (model of motorcycle) | /lato/ ‘to whip’ | /bleke/ ‘tar’ | /leko/ ‘nuts’ |
| | | | /bloke/ ‘block’ | /loko/ ‘crazy’ |
| gl | /glato/ (province in Italy) | /lato/ ‘to whip’ | /gleba/ ‘mound of land’ | /lema/ ‘slogan/motto’ |
| | | | /globo/ ‘balloon, globe’ | /lomo/ ‘loin’ |
| kl | /klapas/ ‘stripping of unfertile land’ | /lapa/ (type of clam) | /klema/ ‘electrical connector’ | /lema/ ‘slogan/motto’ |
| | | | /klono/ ‘clone’ | /lomo/ ‘loin’ |

2.3 Data acquisition

The data were acquired by means of Electromagnetic Articulography (EMA) using the Carstens AG501 device. The device tracks the three-dimensional movement of sensors attached to various structures inside and outside the vocal tract. During recording, the raw positional data are stored in the computer which is connected to the articulograph. The stimuli were prompted by another computer which also triggered the articulograph to start recording. The subject sat on a chair in a sound-proof booth and was instructed to read the sentences appearing on a computer monitor at a comfortable rate. The articulatory data were recorded at a sampling rate of 250 Hz. Acoustic data were also captured by a t.bone EM 9600 unidirectional microphone with a TASCAM US-2x2 audio interface at a sampling rate of 48 kHz.

We now describe the placement of the sensors. Three sensors were placed midsagittally on the tongue: the tongue tip (TT) sensor attached 1 cm posterior to the tongue tip, the tongue mid (TM) sensor attached 2 cm posterior to the TT, and the tongue back (TB) sensor attached 2 cm posterior to the TM. Additional sensors were attached to the upper and lower lip and to the lower incisors (jaw). Reference sensors were attached on the upper incisor, behind the ears (left and right mastoid) and on the bridge of the nose. In a post-processing stage, the data were corrected by subtracting the head movement captured by the reference sensors on the upper incisor and on the left and right mastoid. The data of the reference sensors were filtered using a cut-off frequency of 5 Hz, while the data of the remaining sensors were filtered using a cut-off frequency of 20 Hz. At a final stage, the data were rotated according to the occlusal plane of each subject.

2.4 Articulatory segmentation

Articulatory segmentation consists in identifying the points in time at which characteristic events such as onset of movement, achievement of target, and movement away from the target take place for a consonant or a vowel. For each cluster-initial or singleton consonant-initial word, temporal landmarks for the word-initial consonant(s), the subsequent vowel and the postvocalic consonant were measured using the primary articulator(s) involved in their respective production. Thus, velar consonants (/k, g/) were measured using the most posterior TB sensor, coronals (/t, l, n/) using the TT sensor, and labials (/p, b, m/) using the lip aperture (LA). LA is a derived signal computed as the Euclidean distance between the upper and lower lip sensors. The low and mid vowels /a, e, o/ following the consonant(s) of interest were measured using the TM sensor. Landmark identification for both the consonantal and vowel gestures was based on the tangential velocities of the corresponding positional signals (or derived signals for the case of LA).
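
The two derived quantities mentioned in this paragraph, lip aperture and tangential velocity, can be illustrated with a short sketch. The function names, the central-difference scheme and the sample coordinates are assumptions made for the example. Only the 250 Hz sampling rate is taken from the text, and the sketch does not reproduce the processing done by the software used in the study.

```python
import math


def lip_aperture(upper_lip, lower_lip):
    """Euclidean distance between upper- and lower-lip sensor positions."""
    return math.dist(upper_lip, lower_lip)


def tangential_velocity(positions, rate_hz=250.0):
    """Speed along a sensor trajectory, estimated with central differences.

    positions: sequence of coordinate tuples (e.g. (x, y, z) in mm),
               one per sample.
    Returns one speed per interior sample, in position units per second.
    """
    speeds = []
    for i in range(1, len(positions) - 1):
        # Displacement over two sampling periods, i.e. over 2 / rate_hz s.
        displacement = math.dist(positions[i + 1], positions[i - 1])
        speeds.append(displacement * rate_hz / 2.0)
    return speeds


# Made-up example: a sensor moving 1 mm per sample along one axis.
trajectory = [(float(i), 0.0, 0.0) for i in range(5)]
print(tangential_velocity(trajectory))         # speeds in mm/s
print(lip_aperture((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))  # distance in mm
```

For a one-dimensional derived signal such as LA, the same function applies if each sample is passed as a one-element tuple.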

The articulatory segmentation of the data was conducted using the Matlab-based Mview software developed by Mark Tiede at Haskins Laboratories. Its segmentation algorithm first finds the peak velocities (to and from the constriction) and the minimum velocity within a user-specified zoomed-in temporal range. The achievement of target (target) and the constriction release (release) landmarks were then obtained by identifying the timestamps at which velocity falls below and rises above, respectively, a 20 % threshold of the local tangential velocity peaks. Figure 1 illustrates an example parse of the coronal gesture of the prevocalic lateral of the word /lato/ using the TT sensor. The panels from top to bottom illustrate the acoustic signal of the zoomed-in portion of the utterance along with the TT movement trajectory and the TT vertical velocity profile. The black filled box corresponds to the constriction phase, or plateau, of the gesture delimited by the target and release landmarks (left and right side of the black filled box). The left and right side of the white box indicate the initiation (onset) and end of the gesture (offset), calculated as the timestamps at which velocity rises above and falls below, respectively, a 20 % threshold of the local tangential velocity peaks. The landmarks peak vel to and peak vel fro indicate peak velocity towards the target and away from the release, respectively. The max. constriction landmark corresponds to the minimum velocity.
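
The thresholding logic described above can be sketched as follows. This is not the Mview algorithm, whose implementation details are not given here. It is a simplified illustration that assumes the analysis window contains exactly one gesture with two clear velocity peaks, and the function name, the dictionary keys and the toy speed signal are all hypothetical.

```python
def find_landmarks(speed, frac=0.20):
    """Locate gestural landmarks (sample indices) in a speed signal.

    Assumes one gesture per window: a velocity peak towards the
    constriction, a velocity minimum, and a velocity peak away from it.
    """
    peaks = [i for i in range(1, len(speed) - 1)
             if speed[i - 1] < speed[i] >= speed[i + 1]]
    if len(peaks) < 2:
        raise ValueError("expected two velocity peaks in the window")
    # Keep the two largest peaks, in temporal order.
    peak_to, peak_fro = sorted(sorted(peaks, key=lambda i: speed[i])[-2:])
    max_constriction = min(range(peak_to, peak_fro + 1),
                           key=lambda i: speed[i])
    thr_to, thr_fro = frac * speed[peak_to], frac * speed[peak_fro]

    onset = peak_to            # first sample at or above the threshold
    while onset > 0 and speed[onset - 1] >= thr_to:
        onset -= 1
    target = peak_to           # first sample below the threshold
    while target < max_constriction and speed[target] >= thr_to:
        target += 1
    release = max_constriction  # first sample above the threshold
    while release < peak_fro and speed[release] <= thr_fro:
        release += 1
    offset = peak_fro          # first sample below the threshold
    while offset < len(speed) - 1 and speed[offset] >= thr_fro:
        offset += 1

    return {"onset": onset, "peak_to": peak_to, "target": target,
            "max_constriction": max_constriction, "release": release,
            "peak_fro": peak_fro, "offset": offset}


# Toy speed profile with two peaks (arbitrary units, one value per sample).
toy_speed = [0, 1, 5, 10, 5, 1, 0, 1, 4, 8, 4, 1, 0]
landmarks = find_landmarks(toy_speed)
print(landmarks)
```

At the 250 Hz sampling rate reported above, a sample index converts to time as `index * 4` ms. The plateau is then the span from `target` to `release`.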

Figure 1: 
Parsing of a gesture using Mview. Panels from top to bottom: Acoustic signal of the zoomed in portion of the word lato, tongue tip (TT) movement trajectory in the superior-inferior dimension (vertical), tongue tip (TT) vertical velocity signal. The black filled box indicates the constriction phase, or plateau, of the gesture for the /l/ delimited by the target and release landmarks (located at the timestamps of the left and right edges of the black filled box). The (timestamps of the) left and right edges of the longer white box indicate the initiation and end of the /l/ gesture. Peak vel to, peak vel fro and max. constriction correspond to the peak velocity towards the target, away from the release and minimum velocity during the constriction phase respectively.

2.5 Statistical analysis

We used RStudio (RStudio Team 2015) with R version 3.3.1 and the lmer function of the lme4 package (Bates et al. 2015) to perform linear mixed effects analyses of the effects of C1 voicing (voiced, voiceless), prosodic condition (word boundary, utterance boundary), interval type (global timing, local timing) and cluster size (CV, CCV) on the following dependent variables: interplateau interval (defined as the lag between C1 release and C2 target), C1 duration, C2 duration and interval duration.

Models were tested for main effects and all interactions. Subject and word were treated as random factors. Visual inspection of residual plots did not reveal any obvious deviations from homoscedasticity or normality. P-values were obtained by likelihood ratio tests of the full model with the effect in question against the model without the effect in question. For post-hoc comparisons, significance was determined using the Tukey adjusted contrast using the multcomp package (Hothorn et al. 2008) and the lsmeans package (Lenth and Hervé 2015). To address relations between continuous variables Pearson’s correlations were used.
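
As a worked illustration of the likelihood ratio tests mentioned above, the sketch below converts a chi-square statistic with one degree of freedom into a p-value using only the Python standard library. It is not the R code used in the study. The function names and the log-likelihood values are hypothetical, while the two chi-square statistics are those reported in Sections 3.1 and 3.2.

```python
import math


def lrt_statistic(loglik_full, loglik_reduced):
    """Likelihood ratio statistic: twice the log-likelihood difference."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi_square_p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1 the statistic is a squared standard normal variable, so
    P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical log-likelihoods of a full and a reduced model.
stat = lrt_statistic(loglik_full=-5120.0, loglik_reduced=-5123.925)
print(round(stat, 2), round(chi_square_p_value_df1(stat), 4))

# Statistics reported in the results sections.
for reported in (7.85, 9.47):
    print(reported, round(chi_square_p_value_df1(reported), 4))
```

The closed form holds only for one degree of freedom, which is the case when the full and reduced models differ by a single parameter. Comparisons with more degrees of freedom need the general chi-square distribution.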

3 Results

3.1 Interplateau interval

The interplateau interval (henceforth, IPI) in a C1C2V sequence is defined as the lag between the release of the initial consonant C1 and the target of the second consonant C2, that is, C2 target – C1 release. Positive IPIs indicate no overlap between the plateaus of the two consonants, while negative IPIs indicate overlap.
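
The definition can be written out as a one-line computation. The function names and landmark times below are illustrative assumptions.

```python
def interplateau_interval(c1_release_ms, c2_target_ms):
    """IPI = C2 target - C1 release, in ms."""
    return c2_target_ms - c1_release_ms


def plateaus_overlap(ipi_ms):
    """Negative IPI means the two consonant plateaus overlap in time."""
    return ipi_ms < 0


# Made-up landmark times (ms) for two tokens.
open_transition = interplateau_interval(c1_release_ms=120.0, c2_target_ms=160.0)
overlapped = interplateau_interval(c1_release_ms=120.0, c2_target_ms=105.0)
print(open_transition, plateaus_overlap(open_transition))
print(overlapped, plateaus_overlap(overlapped))
```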

We begin by quantifying IPI as a function of prosodic condition, to examine whether our experimental design successfully elicited variability of IPI (the presence of such variability is a crucial prerequisite of our approach as developed in the ensuing sections). For the CCV sequences (N = 1063), we fitted a linear mixed effects model with IPI as a dependent variable and prosodic condition (word boundary, utterance boundary) and voicing of the initial stop (C1 voicing) (voiced, voiceless) as fixed effects. As random effects, we had random intercepts for subject and word, as well as by-subject random slopes for the effect of prosodic condition and C1 voicing and by-word random slopes for the effect of prosodic condition. Model comparisons showed a significant main effect of prosodic condition (χ²[1, N = 1063] = 7.85, p = 0.005) and no effect of C1 voicing on IPI. There is no interaction between C1 voicing and prosodic condition. The post-hoc analysis showed that IPI increases significantly from the word boundary condition (henceforth, wb) to the utterance boundary condition (henceforth, ut) (estimate = 9.4 ms, p = 0.00029). Furthermore, IPI is 8.4 ms larger in C1 voiceless stop-lateral clusters than in C1 voiced stop-lateral clusters across prosodic conditions, although, as noted above, this difference was not significant (p = 0.08). Figure 2 illustrates IPI in CCV as a function of prosodic condition and C1 voicing.

Figure 2: 
Interplateau interval (IPI) in CCV as a function of prosodic condition, word boundary (wb), utterance boundary (ut) and C1 voicing.

To sum up, IPI is overall large, agreeing with previous studies on Spanish stop-lateral clusters which report a period of no vocal tract constriction between the release of the stop and the achievement of target of the lateral (for acoustic studies, see Bradley 2006; Colantoni and Steele 2005; Malmberg 1965; for articulatory studies, see Sotiropoulou et al. 2020; Gibson et al. 2017, 2019). Furthermore, we observed that the IPI is somewhat longer in voiceless stop-laterals than in voiced stop-laterals, although the difference did not reach significance. Additionally, IPI in stop-lateral sequences increases with prosodic strengthening, echoing the results in Byrd and Choi (2010) on English. Thus, the crucial point for now is that our experimental design successfully elicited different degrees of IPI duration. The demonstration of such variability in phonetic parameters is the crucial prerequisite of our approach, as pointed out in the introduction, which aims to harness this variability (here, in terms of IPI) by assessing how segmental sequences in a global organization adapt or respond to such perturbations in the phonetic parameters of the segments that take part in that organization.

3.2 C1 plateau duration

In this subsection, we examine the effect of C1 voicing and prosodic condition on the plateau duration of the C1 initial stop in stop-lateral clusters. Consonant plateau duration was calculated as the interval between C target and C release. For the CCV sequences, we fitted a linear mixed effects model with C1 plateau duration as dependent variable. The dependent variable was transformed (square root) to better approximate a normal distribution. C1 voicing (voiced, voiceless) and prosodic condition (word boundary, utterance boundary) were modeled as fixed effects. As random effects, we had intercepts for subject and word, as well as by-subject and by-word random slopes for the effects of C1 voicing and prosodic condition. There is an interaction between prosodic condition and C1 voicing (χ²[1, N = 1025] = 9.47, p = 0.002). Specifically, C1 plateau is shorter for voiced than for voiceless stops in the word boundary condition (estimate = −1.15, p = 0.02).[2] However, in the utterance boundary condition, C1 plateau does not differ between voiced and voiceless stops (estimate = −0.20, not significant). Additionally, C1 plateau is longer in the utterance boundary than in the word boundary condition for both voiced (estimate = −3.20, p < 0.0001) and voiceless stops (estimate = −2.25, p = 0.0001). Figure 3 plots C1 plateau duration as a function of C1 voicing and prosodic condition.

Figure 3: 
C1 plateau duration as a function of C1 voicing and prosodic condition in CCV. C1 plateau is longer for voiceless than voiced stops in wb (white boxplots) while there is no C1 plateau difference in ut (gray boxplots).

Summarizing, for stop-lateral clusters, there is an effect of C1 voicing on the duration of the initial stop in the word boundary condition with the C1 voiceless being longer than the C1 voiced. However, in the utterance boundary condition there is no difference in the duration of C1 voiced and voiceless stops. Additionally, C1 plateau duration for both voiced and voiceless stops increases with prosodic strengthening. Such differences in C1 plateau duration as a function of prosodic strengthening in clusters are informative as they provide the prerequisite perturbations under which we can assess whether, as a result of these perturbations, spatiotemporal readjustments in (other parts of) the CCV string occur.

3.3 IPI – C2 lateral compensatory relation

We turn here to examine the relation between IPI and C2 lateral plateau duration in stop-lateral complex onsets across and within prosodic conditions. Our aim is to assess the presence of compensatory effects in the CCV string, that is, effects where spatiotemporal modification in one local region of the string comes systematically with a change in another part of the string. The presence of such effects would indicate that the organization of the different parts of CCV is not independently planned and produced and thus such effects, if present, offer evidence for global organization.

For stop-lateral CCV sequences (N = 1028), there is no correlation between IPI and C2 lateral duration across prosodic conditions when looking at raw values (r(1026) = 0.03, p = 0.22). Figure 4 illustrates the relation between IPI and C2 lateral plateau (raw values) across CCV in both prosodic conditions (wb, ut). This result holds true when looking also within each prosodic condition. Normalized IPI and C2 lateral plateau duration were also calculated to compensate for effects related to inter-speaker variability (cf. Bombien 2011). Raw IPI and C2 lateral plateau measures exhibit substantial variability as can be seen by their ranges in Figure 4. Some of this variability is speaker-specific and derives from simple continuous scaling of IPI as a function of rate (e.g., the longer the CC, the longer the IPI). Hence, it seems useful to also examine a normalized measure of IPI. The IPI was normalized by dividing the raw measure by the total constriction duration of the cluster (i.e., IPI/(C2 release – C1 target)). Consonant plateau duration was calculated as the interval between C target and C release. C plateau duration was normalized by dividing the raw plateau duration by the total duration of the cluster: C plateau normalized = C plateau duration/(C2 release – C1 target). When looking at normalized values, there is still no correlation between IPI and C2 lateral duration across prosodic conditions (r(1026) = −0.30, p < 0.0001). Figure 5 illustrates the relation between normalized IPI and C2 lateral plateau for stop-laterals in both prosodic conditions. Within prosodic condition the pattern looks different. Specifically, in the word boundary condition, there is a moderate negative correlation between normalized IPI and C2 duration (r(506) = −0.55, p < 0.0001): as IPI increases, C2 lateral duration decreases. 
This relation, however, is not seen in the context of prosodic strengthening: in the utterance boundary condition there is no correlation between normalized IPI and C2 lateral duration (r(518) = −0.16, p = 0.0001). Figure 6 illustrates the relation between normalized IPI and C2 lateral plateau for stop-laterals in each prosodic condition (wb, ut).
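
The normalization and correlation steps described in this subsection can be sketched as follows. The landmark values are made up and the function names are assumptions. The sketch only shows how the normalized measures and Pearson's r are computed and does not reproduce the reported results.

```python
import math
from statistics import mean


def normalized_measures(c1_target, c1_release, c2_target, c2_release):
    """Return (normalized IPI, normalized C2 plateau) for one CCV token.

    Both measures are divided by the total constriction duration of the
    cluster, C2 release - C1 target.
    """
    total = c2_release - c1_target
    ipi = c2_target - c1_release
    c2_plateau = c2_release - c2_target
    return ipi / total, c2_plateau / total


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up tokens: (C1 target, C1 release, C2 target, C2 release) in ms.
tokens = [
    (0.0, 50.0, 70.0, 150.0),
    (0.0, 50.0, 90.0, 150.0),
    (0.0, 50.0, 110.0, 150.0),
]
norm_ipi, norm_c2 = zip(*(normalized_measures(*t) for t in tokens))
r = pearson_r(norm_ipi, norm_c2)
df = len(tokens) - 2  # degrees of freedom, as in the r(df) notation
print(f"r({df}) = {r:.2f}")
```

In these toy tokens the cluster duration is held constant, so a longer IPI necessarily leaves less room for the lateral plateau and the correlation is perfectly negative. Real data need not behave this way, which is why the observed correlations are informative.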

Figure 4: 
Scatterplots showing the relation between C2 lateral duration and IPI for voiced and voiceless stop-lateral CCV across prosodic conditions. There is no correlation between the two variables (r(1026) = 0.03, p = 0.22).

Figure 5: 
Scatterplot showing the relation between normalized C2 lateral duration and normalized IPI for voiced and voiceless stop-lateral CCV across two prosodic conditions. There is no correlation between IPI and C2 lateral duration across prosodic conditions (r(1026) = −0.30, p < 0.0001).

Figure 6: 
Scatterplots showing the relation between normalized C2 lateral duration and normalized IPI for voiced and voiceless stop-lateral CCV for each prosodic condition. In wb, there is a moderate negative correlation between the two variables (r(506) = −0.55, p < 0.0001) while in ut there is no correlation (r(518) = −0.16, p = 0.0001).

To sum up, using the raw values of the variables, there is no compensatory relation between IPI and C2 lateral plateau either across or within prosodic conditions. However, the pattern changes when looking at normalized values of the two variables within prosodic condition. In the word boundary condition (control), there is a negative correlation such that as IPI increases, C2 lateral duration tends to decrease. This relation is not present in the context of prosodic strengthening. Why might this be so? To identify a compensatory relation between two variables, variability along each of the two variables is required. From the distribution of the data seen in Figure 6, substantial variability of the lateral duration (y-axis) can be observed only in the word boundary condition. The variability of the lateral duration decreases with prosodic strengthening, as can be seen in the shrinking range of the y-axis values from wb to ut. Thus, the lack of a compensatory relation in the condition of prosodic strengthening is not surprising.

3.4 Vowel initiation with respect to prevocalic lateral

The c-center organization pattern prescribes that the vowel starts somewhere around the c-center of the prevocalic onset cluster (Browman and Goldstein 1988; Browman and Goldstein 2000; Gafos 2002; Honorof and Browman 1995; Nam and Saltzman 2003). We say ‘somewhere around the c-center’ because the literature is somewhat ambiguous on the matter, depending on whether one interprets any relevant statement to be a statement about observed movement properties versus underlying phonological demands which may or may not have directly observed physical consequences (depending on various parameters). Thus, for example, Browman and Goldstein (1988: 150) write ‘let us make the following assumption: the (temporal) interval from the c-center to the final consonant anchor point is a measure of the activation interval of the vocalic gesture where: (a) the c-center corresponds to a fixed point early in the vocalic activation … ’ but also that ‘We also assume that the actual movement for the vocalic gestures begins at the achievement of target of the first consonant in a possible initial cluster’ (Browman and Goldstein 1988: 150). Honorof and Browman (1995: Figure 1, p. 552) show the beginning of the vowel activation window to be at the c-center of the prevocalic consonantal cluster (made out of three consonants). Nam and Saltzman (2003) assume a default phasing for the CV relation of 50° and show the V starting somewhat after the c-center of a single consonant. Gafos (2002), in his Optimality Theoretic interpretation, using constraints referring to both spatial and temporal properties of gestures, employs an alignment constraint requiring the V to start at the c-center of the consonant or prevocalic consonant cluster. Again, as noted above, no empirical study has explicitly sought to quantify when the vowel starts in reference to its preceding cluster in any systematic way.

Here, we quantify the vowel start or what we refer to as vowel initiation in stop-lateral CCV and the respective CV sequences in two prosodic conditions: word boundary (wb) and utterance boundary (ut). To do so, the vowel gesture was parsed in each token by using the tongue mid sensor based on the tangential velocity of the signal. The time interval between the so-obtained gestural onset landmark of the vowel and the gestural target landmark of the preceding lateral was then used to define our dependent variable, cv lag.
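The definition of cv lag can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the parsing procedure actually used: the onset criterion (the first sample whose tangential velocity reaches 20 % of the peak velocity), the sign convention (vowel onset minus lateral target, so that an earlier vowel yields a smaller lag) and all function names are hypothetical.

```python
from math import hypot


def tangential_velocity(xs, ys, dt):
    """Speed along the sensor trajectory between successive samples."""
    return [
        hypot(xs[i + 1] - xs[i], ys[i + 1] - ys[i]) / dt
        for i in range(len(xs) - 1)
    ]


def gesture_onset_time(xs, ys, dt, fraction=0.2):
    """Time (s) of the first sample whose speed reaches `fraction` of peak speed.

    Assumption: a 20%-of-peak threshold is used here only as an example.
    """
    vel = tangential_velocity(xs, ys, dt)
    threshold = fraction * max(vel)
    for i, v in enumerate(vel):
        if v >= threshold:
            return i * dt
    return None


def cv_lag_ms(vowel_onset_s, lateral_target_s):
    """cv lag in ms: vowel gestural onset minus lateral gestural target."""
    return (vowel_onset_s - lateral_target_s) * 1000.0
```

Under this sign convention, a vowel that begins 30 ms after the lateral reaches its target has a cv lag of 30 ms, and a reduction of that value means the vowel starts earlier relative to the lateral.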

For the CV ∼ CCV pairs (N = 1891), we fitted a linear mixed effects model with cv lag as a dependent variable. Cluster size (CV, CCV), C1 voicing (voiced, voiceless) and prosodic condition (word boundary, utterance boundary) were used as fixed effects. As random effects, we had random intercepts for subject and word, as well as by-subject random slopes for the effect of prosodic condition and C1 voicing and by-word random slopes for the effect of word nested within C1 voicing. There is no three-way interaction of cluster size, C1 voicing and prosodic condition. However, there is an interaction between cluster size and prosodic condition. The post-hoc analysis showed that in the word boundary condition, the cv lag does not change significantly from CV to CCV (estimate = 5.7 ms) while in the utterance boundary the cv lag changes significantly from CV to CCV (estimate = 33.8 ms, p = 0.0008). Figure 7 illustrates the cv lag across CV ∼ CCV for voiced and voiceless stop-laterals in two prosodic conditions (wb, ut).

Figure 7: 
cv lag in ms between the target of the lateral and vowel initiation in lateral-vowel sequences (CV) and voiced/voiceless stop-lateral sequences (CCV) in two prosodic conditions, word boundary (wb) and utterance boundary (ut). The cv lag remains the same across CV ∼ CCV in wb, but it decreases from CV to CCV in the ut condition.

With prosodic strengthening from wb to ut, the vowel is found to start earlier with respect to the target of the prevocalic lateral in CCV sequences than in CV (across C1 voicing). Recall the crucial methodological diagnostic in our approach: to uncover evidence for global organization presiding over a sequence of segments (here, CCV), properties of the segments or their relations with one another must somehow be locally varied. The consequences of such variation on the rest of the sequence can then be used to unveil the mode of organization. When local perturbations to segments or relations between adjacent segments have effects that propagate to the rest of the sequence, this is evidence that the organization presiding over that sequence of segments is global. In our case, the ut condition causes the CC part of the CCV sequence to expand by lengthening of the first C and increasing the duration of the IPI between the two consonants (replicating the results on English by Byrd and Choi 2010 and on German by Sotiropoulou and Gafos 2022). As a consequence of that expansion, the rest of the sequence responds so that the vowel starts earlier in the cluster (than what would be the case if the cv lag did not change between wb and ut).

3.5 Stability-based heuristics

Past assessments of syllabic organization make use of the stability of certain intervals (described below) computed across CV ∼ CCV stimuli pairs. For the stability analysis, two intervals were calculated for each stimulus observation. We first describe the right-delimiting landmarks (henceforth, anchors) used in defining these intervals. Two different anchors were used in order to assess the robustness of results: the target of the constriction of the postvocalic consonant (Ctar) and the temporal midpoint of the vowel plateau (Vmid). For each such anchor, we define two intervals, left-delimited by two different landmarks that are found on the consonantism before the vowel, the c-center and the right-edge (as used in several prior studies, e.g., Gafos et al. 2014; Hermes et al. 2015; Shaw and Gafos 2015; Shaw et al. 2009, 2011). The two intervals left-delimited by these two landmarks were the c-center to anchor interval, which stretches from the temporal midpoint of the consonant(s) to the anchor, and the right-edge to anchor interval, stretching from the constriction release of the (immediately) prevocalic consonant and the anchor. We refer to these two intervals as global timing and local timing, respectively; ‘global’ as the first interval is left-delimited by the c-center landmark whose computation implicates all consonants before the vowel as opposed to ‘local’ for the second interval which is left-delimited by the constriction release of just the immediately prevocalic consonant. Next, we evaluate statistically how the duration of the two interval types changes as the number of consonants increases from CV to CCV.
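The two interval definitions can be made concrete with a short Python sketch. It assumes that each prevocalic consonant is represented by the target and release times of its constriction plateau, that a consonant's midpoint is the midpoint of that plateau, and that the c-center is the mean of these midpoints; the function names and the toy landmark times are hypothetical.

```python
def c_center(plateaus):
    """Mean of the plateau midpoints of all prevocalic consonants.

    plateaus: list of (target_ms, release_ms), ordered from first to last
    prevocalic consonant.
    """
    midpoints = [(target + release) / 2 for target, release in plateaus]
    return sum(midpoints) / len(midpoints)


def timing_intervals(plateaus, anchor_ms):
    """Return (global_timing, local_timing) in ms for one token.

    global: c-center to anchor; local: release of the immediately
    prevocalic consonant (the right edge) to anchor.
    """
    global_timing = anchor_ms - c_center(plateaus)
    local_timing = anchor_ms - plateaus[-1][1]
    return global_timing, local_timing


# Hypothetical landmark times in ms; the anchor could be Ctar or Vmid.
cv = timing_intervals([(100, 140)], anchor_ms=300)              # lateral only
ccv = timing_intervals([(40, 90), (110, 150)], anchor_ms=310)   # stop + lateral
```

With these toy values the local timing interval is identical in CV and CCV (160 ms) while the global timing interval grows from 180 ms to 212.5 ms, which is the qualitative shape of a local timing stability pattern.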

From CV to CCV, we fitted a linear mixed effects model with interval duration as a dependent variable (log transformed to better approximate a normal distribution). Cluster size (CV, CCV), interval type (global, local), prosodic condition (word boundary, utterance boundary) and C1 voicing (voiced, voiceless) were used as fixed effects. As random effects, we had intercepts for subjects and word, as well as by-subject and by-word random slopes for the effect of interval type. The results show no interaction of interval type, cluster size and C1 voicing, but an interaction of interval type, cluster size and prosodic condition (Ctar: χ²[4, N = 2058] = 2008.9, p < 0.0001; Vmid: χ²[4, N = 1892] = 1234.9, p < 0.0001). The post-hoc analysis showed that, in the word boundary, the global timing interval increases significantly from CV to CCV (Ctar estimate = −21.3 ms, p = 0.0004; Vmid estimate = −25.4 ms, p = 0.0002), while the local timing interval does not change significantly from CV to CCV (Ctar estimate = 9.5 ms; Vmid estimate = 2.6 ms). In the utterance boundary, the global timing interval increases from CV to CCV (Ctar estimate = −22.7 ms, p = 0.0002; Vmid estimate = −23.8 ms, p = 0.0005), while the local timing interval does not change from CV to CCV (Ctar estimate = 7.4 ms; Vmid estimate = 2.9 ms).

Figure 8 plots the duration of the two intervals, global timing and local timing, with Ctar as anchor for the two prosodic conditions, word boundary (wb) and utterance boundary (ut) across speakers, as a function of the number of consonants (CV, CCV).

Figure 8: 
Duration (in ms) of the two intervals, global timing and local timing, for CV (white) and CCV (grey) words in two prosodic conditions. In both word boundary (wb) and utterance boundary (ut), the global timing interval increases from CV to CCV, while the local timing interval remains stable.

Overall, for both voiced and voiceless stop-laterals, the global timing interval changes substantially from CV to CCV while the local timing interval remains stable regardless of prosodic condition.

4 Discussion

We have investigated how word-initial stop-lateral clusters /bl, gl, pl, kl/ in Spanish respond to perturbations of phonetic parameters under different prosodic conditions. Word-initial stop-lateral clusters instantiate the global organization because they are prototypical complex onsets in Spanish and other Romance and Germanic languages. In seeking phonetic expressions of this global organization, instead of promoting a single index of such an organization (i.e., as in stability-based indices), we have studied the consequences of perturbations of phonetic parameters, such as IPI or C1 lengthening, throughout the segmental sequence over which the presumed syllabic organization presides. What we learn is that there is no privileged index reflecting syllabic organization: following past heuristics, one expects global timing stability for these clusters whereas the data indicate local timing stability. Instead, global organization in Spanish stop-lateral clusters emerges when looking at relational properties, e.g., the IPI-C2 lateral relation or vowel initiation relative to the prevocalic lateral’s target, in different prosodic conditions. In this section, we put together the effects pointing to this conclusion and develop the implications of the results for the broader topic of the relation between phonological organization and phonetics.

4.1 Perturbations of phonetic parameters under prosodic strengthening

Our study begins by documenting a set of basic results concerning the phonetic parameters of C1 stop duration and IPI in the initial part of the CCV syllable. Demonstrating that these parameters vary systematically as a function of prosodic condition is prerequisite, in our approach, to the study of the consequences this variability has on the rest of the segmental sequence. Specifically, our results show that C1 stop plateau duration increases (by 45 ms) in the context of prosodic strengthening, that is, in the utterance boundary (ut) condition, compared to the control condition (word boundary or wb). Additionally, C1 voiced stops are shorter, in terms of plateau duration, than voiceless stops only in the control condition; in the context of prosodic strengthening, there is no difference in stop duration. Whether stop duration varies as a function of voicing or not is not relevant to our concerns here. What is crucial is that C1 stops lengthen as a function of prosodic condition, which enables us to examine effects of such lengthening on the organization of the clusters. Apart from C1 plateau duration, IPI is subject to lengthening under prosodic strengthening in our Spanish data. Specifically, our results show that IPI in stop-laterals increases slightly (by 9.4 ms) in the context of prosodic strengthening compared to the word boundary condition. Furthermore, across prosodic conditions, IPI in voiceless stop-laterals is slightly longer (by 8 ms, albeit not significantly) than in voiced stop-laterals.

Summarizing, our experimental design successfully elicited different degrees of C1 stop duration and IPI. The demonstration of such variability in phonetic parameters is the crucial prerequisite of our approach, as pointed out in the introduction, which aims to harness this variability by assessing how segmental sequences in a global organization adapt or respond to such perturbations in the phonetic parameters of the segments that take part in that organization.

4.2 IPI – C2 lateral duration relation

Word-initial stop-lateral CCV sequences in the control or word boundary (wb) condition exhibit a compensatory relation such that as IPI increases C2 lateral duration decreases. This compensatory relation is an indication of global organization. Specifically, when the lag between the plateaus of the two consonants, what we call IPI, in a CCV increases (a perturbation which would push the first consonant farther away from its tautosyllabic vowel), shortening of the second consonant compensates by bringing the vowel to overlap more with its tautosyllabic cluster. The CCV sequence thus seems to be organized globally: if each segment in a CCV were planned independently of the other segments, then an increase or decrease in the duration of one segment or inter-segmental interval is not predicted to result in a decrease or increase in the duration of another. If, instead, the segments are planned as a group (globally), such compensatory relations are expected.

As we have seen, the two variables that enter into the compensatory relation, IPI and C2 lateral duration, individually show sufficient variability in the wb condition. However, in the utterance boundary (ut) condition, the effects of prosodic strengthening on the duration of the first C in CCV as well as on the duration of the IPI between the two consonants freeze the extent of variability in these two parameters, thus precluding the manifestation of a compensatory relation between the two (see Figure 6, right panel). Emphatically, however, this does not mean that there are no indications for global organization in the ut condition. To the contrary, we do find strong indications of global organization in that condition, to be addressed next in Section 4.3.

4.3 Vowel initiation

The beginning of the vowel activation window in relation to any prevocalic consonants has played a major role in theorizing about syllabic organization (Browman and Goldstein 1988), but it has not been quantified in past work on syllabic organization with consonant clusters with the exception of the studies by Sotiropoulou et al. (2020) and Sotiropoulou and Gafos (2022) which have inspired the present work. We summarize here how our results on vowel initiation reveal effects of global organization in stop-lateral clusters by comparing their spatiotemporal coordination with their following vowel in two prosodic conditions.

Our dependent measure for vowel initiation is cv lag, which quantifies the interval between the target of the prevocalic lateral and the onset of the vowel movement. cv lag does not change between CV and CCV in the control condition (word boundary or wb); i.e., across CV and CCV, the timing between the target of the prevocalic lateral and the onset of the vowel remains the same in the wb condition. However, in the ut condition (Primero vi a Ana. ____ era su respuesta; ‘First I saw Anna. ___ was her response’), the vowel starts earlier relative to the target of the prevocalic lateral in CCV as seen by a reduction of the cv lag compared to CV. In other words, the CV substring in CCV is compressed in the ut condition. More specifically, the first C and the IPI in CCV lengthen due to prosodic strengthening. When these effects take place, what we observe is that the rest of the string (the inner CV or /la/ in /pla/) shortens.

Once again, when we look at CV, CCV sequences statically, that is, just at the wb condition, Spanish does not offer much evidence for the expected increased overlap between the vowel and its preceding consonantism in CCV. But as soon as we introduce perturbations to these sequences, by placing them in different prosodic conditions, that is, comparing wb to the ut condition, then evidence for this increased overlap between the vowel and the CCV emerges (at the same time as prior heuristics of syllabic organization, such as global versus local timing stability, show no difference across the two conditions; see Section 3.5).

4.4 Stability-based heuristics

Stop-lateral sequences in Spanish are prototypical syllable onsets. As such and given prior heuristics on diagnosing complex syllable onsets, they are expected to show global (as opposed to local) timing stability. Yet, our results show local timing stability across CV ∼ CCV for all clusters in Spanish both in the word boundary and in the utterance boundary condition, where prosodic strengthening takes place. We have replicated, in other words, with a twist to which we turn immediately below, results obtained in prior work on Spanish (Sotiropoulou et al. 2020) as well as other languages (e.g., for Moroccan Arabic, see Shaw et al. 2009, 2011; for German, see Brunner et al. 2014, Pouplier 2012, Sotiropoulou and Gafos 2022; for Romanian, see Marin 2013; for a number of languages, see Tilsen et al. 2012), pointing to the unreliability of stability-based heuristics in diagnosing syllabic organization.

This effect that local timing stability is maintained across prosodic conditions for stop-laterals in our Spanish data partly contrasts with recent results from German where stability patterns were also assessed across two prosodic conditions (Sotiropoulou and Gafos 2022), just as we do here for Spanish. Specifically, in German, Sotiropoulou and Gafos (2022) report that the local timing stability observed for word-initial stop-laterals in their word boundary condition worsens as the initial part of the CCV string expands (by lengthening the initial stop and the lag between the two consonants) in their utterance boundary condition. In our present study, instead, local timing stability is maintained across both prosodic conditions, that is, there is no worsening of local timing stability when perturbations in the CCV string are introduced in the utterance boundary condition. Given these results, it is worth addressing why the stability patterns for stop-laterals in Spanish remain the same across prosodic conditions whereas they change in German.

An answer to this question can be obtained by considering how the different degrees of C1 and IPI lengthening in German and Spanish lead to different stability patterns. To compare C1 and IPI lengthening effects concretely in the two languages, we will use the stimuli with voiceless stop-lateral clusters (but as pointed out below the same observations generalize to voiced stop-laterals as well). In the current dataset, in the word boundary condition, IPI in voiceless stop-laterals in Spanish is the same as in voiceless stop-laterals in German in Sotiropoulou and Gafos (2022) and C1 stop duration is also very similar (53.5 ms in German, 45.7 ms in Spanish). However, IPI in German undergoes substantially greater lengthening than in Spanish (25.6 vs. 6 ms) in the context of prosodic strengthening (utterance boundary condition). Additionally, C1 lengthening in the same context is also larger in German than in Spanish (58.8 vs. 44 ms). The same pattern of effects holds for voiced stop-laterals, namely, C1 and IPI lengthening is larger in German than in Spanish. Overall, the degree of lengthening as a result of prosodic strengthening is larger in German than in Spanish stop-laterals. Figure 9 illustrates these effects in the two languages by schematizing the gestural unfolding of lateral-vowel and stop-lateral-vowel pairs across the two prosodic conditions (word boundary or wb versus utterance boundary or ut). Specifically, as Figure 9 shows, in both Spanish and German, C1 plateau duration, indicated by the extent of the bold black lines, is longer in ut than in wb, but this lengthening is greater in German than in Spanish. Similarly, IPI, as indicated by the blue arrows, is longer in ut than in wb for both Spanish and German, and this lengthening is greater in German than in Spanish.
In terms of intervals, the local timing interval is perturbed more in German than in Spanish (compare, within each language, the differences in the red arrow lengths across the wb and ut conditions).[3]

Figure 9: 
Temporal configurations of a singleton lateral /l/ and a stop-/l/ cluster in two prosodic conditions, word boundary (wb) and utterance boundary (ut), in Spanish (left, present study) and German (right, drawing from the results of Sotiropoulou and Gafos 2022). The trapezoids represent consonant gestures and the arc intersecting the trapezoid of the lateral represents the initial part (or to-phase) of the vowel gesture. The bold black lines, as part of the C1 stop gesture, represent the constriction plateau of the stop which lengthens more from wb to ut in German than in Spanish. The blue arrows represent the interplateau interval (IPI) which lengthens more from wb to ut in German than in Spanish. The red arrows show the local timing interval which remains stable from CV to CCV both in wb and ut in Spanish, but shortens from CV to CCV in the context of prosodic strengthening (ut) in German.

Conceivably, the differences between Spanish and German discussed here and illustrated in Figure 9 may be related to the rhythmic typological distinction between syllable-timed (as in Spanish) versus stress-timed (as in German) languages (Abercrombie 1964, 1967; Pike 1945; Ramus et al. 1999). Consider the consequences on vowel duration of adding consonants to the syllable of which the vowel is part. In conformity with the tendency for segments to shorten when combined with more segments in the same syllable (Farnetani and Kori 1986; Fowler 1981; Haggard 1973; Katz 2012; Lindblom and Rapp 1973; Maddieson 1985; Marin and Pouplier 2010; Munhall et al. 1992; van Santen 1992; Waals 1999), as the number of segments increases from CV to CCV, vowel duration should decrease. This is what is robustly seen in German (Brunner et al. 2014) and English, both in acoustic and articulatory studies (Katz 2012; Marin and Pouplier 2010), but not in Spanish. For example, several studies on Spanish report no acoustic vowel duration reduction in CCV compared to CV (Cuenca 1996) and CVC compared to CV (Cuenca 1996; Dauer 1983). More recently, Aldrich and Simonet (2019) do find a rather minor vowel duration reduction in CCV compared to CV, but the extent of this reduction is not anywhere close to the lengthening of the syllable duration due to the addition of the extra consonant in CCV (with Spanish being syllable-timed, one would expect a uniform syllable duration across CV and CCV).[4] The local timing interval in our study can be considered a proxy for the acoustic duration of the vowel, because this interval, which is delineated by articulatory landmarks, extends from the release of the prevocalic consonant to some time point late in the syllable (e.g., the target of the postvocalic consonant).
In fact, the shortening of the local timing interval in our study (8 ms, averaging across the two prosodic conditions in the results of Section 3.5) and the vowel reduction (5 ms) reported in Aldrich and Simonet (2019) are of the same magnitude. It is thus evident both in our study as well as in the long tradition of studies examining aspects of speech tempo along the lines of the rhythmic typology distinctions that vowels do not shorten in Spanish whereas they do in languages with stress-timed prosody such as German. We are now in a position to explain the cluster of effects in Spanish versus German starting from this distinction about the (in)flexibility of vowel shortening across languages. Because vowels cannot shorten in Spanish, this predicts that when syllables expand in the context of prosodic strengthening (ut), the lengthening of the prevocalic material must be limited. In contrast, in German where the vowel can shorten (corresponding to the substantial reduction in the duration of the local timing interval from CV to CCV as shown in Figure 9), lengthening of C1 and IPI in the context of prosodic strengthening (ut) can be appreciably larger than in Spanish. If this reasoning is correct, it predicts that the patterning of the effects contrasted above in Spanish versus German (Spanish: little to no change in the local timing interval and less IPI, C1 lengthening under prosodic strengthening; German: substantial shortening of the local timing interval and larger IPI, C1 lengthening under prosodic strengthening) involves covarying rather than independently distributed properties; we predict, in other words, that there should be no languages where the local timing interval remains stable (as in Spanish) but IPI, C1 change to an extent comparable to that seen in German and conversely no languages where the local timing interval shortens appreciably (as in German) but IPI, C1 show only limited change (as in Spanish). 
In sum, our interpretation of the differences reviewed here between Spanish and German makes falsifiable predictions which we hope can be pursued in future work on the relation between rhythmic typology and spatiotemporal organization in syllables across different languages.

4.5 Implications

If and how qualitative phonological organization is lawfully linked to the continuous phonetics is a fundamental problem in spoken language. Here, we have pursued a specific instance of this problem in the relation between syllable structure and phonetic indices for that structure in data from Spanish. In drawing implications from our results, we begin with the last finding on the behavior of stability-based heuristics. As we have seen, despite expectations to the contrary, stability-based heuristics provide no evidence for global organization in stop-lateral clusters even in the context of prosodic strengthening. This is unlike the finding by Sotiropoulou and Gafos (2022) on German which used the same experimental design as ours and found that in the case of prosodic strengthening stability patterns began to move in the expected direction (i.e., the stability of the local timing interval worsened). The juxtaposition of these outcomes is particularly informative because it shows that there is no absolute or uniform pattern as to how the stability-based heuristics and eventually other phonetic indices which reflect syllabic organization will respond to local perturbations of phonetic parameters. The way phonetic indices respond to local perturbations depends on the language at hand and on the actual extent of those perturbations (here, in terms of the degree of lengthening of the phonetic parameters induced by prosodic strengthening).

Let us see how the idea of readjustments to local perturbations subsumes both what was thought before to be a standard index of global organization as well as other indices documented in our present work. In a well-known take (Browman and Goldstein 1988), when a consonant is added in front of a CV to form a larger CCV which is organized as one syllable (i.e., global organization), the consonantal material in front of the vowel expands with the vowel compensating for that expansion by overlapping more with its preceding consonant in CCV compared to the CV context. This is the classic c-center diagnostic which has met issues in German (Brunner et al. 2014; Pouplier 2012) and other languages (Gafos et al. 2014; Shaw et al. 2011) where evaluations of its validity have been undertaken. What we wish to point out here is that this diagnostic can be restated in terms of and subsumed under the revised approach pursued in the present paper: as the consonantal portion before the vowel expands (here, due to the addition of a C in front of a CV) in CCV, the lag between the prevocalic C and the vowel decreases to compensate for that expansion. In fully parallel terms, when in our Spanish data, the duration of the first consonant in CCV and the gap between the end of the constriction of the first and the beginning of the second consonant (the so-called IPI) increase, the vowel starts earlier to compensate for this expansion in the CC part of the CCV sequence. Another compensatory relation is found between IPI and C2 lateral duration; as IPI increases, C2 lateral duration decreases to compensate for that increasing IPI. Hence, even though the specifics which enter into these different compensatory relations may change depending on the contexts examined, the hallmark of global organization is entirely general: local perturbations result in readjustments in some other portion of the CCV in globally organized CCV sequences. 
This reformulation of what it means for sequences of segments to be globally organized bypasses the issues with the original formulation while at the same time elevating the search for phonological organization to the level of compensatory relations among the elements that partake in that organization.

Furthermore, this reformulation establishes connections to so far independent work on models of rhythm as in the work of Barbosa (2002, 2007) and O’Dell and Nieminen (1998, 2002, 2009). For example, O’Dell and Nieminen (2009) consider the fact that CVCVV in Finnish is longer in duration than CVVCV despite the fact that the two have the same number of segments, syllables and moras. In one analysis, the first sequence is assigned a parse of two feet, (CV)(CVV), and the second a parse of one foot, (CVVCV). In O’Dell and Nieminen’s coupled oscillator model of rhythm, two feet correspond to two cycles of an oscillator (at some level in the prosodic hierarchy) for (CV)(CVV) as opposed to one cycle for (CVVCV) and this is one way to get a handle on the durational difference between the two sequences. What is of more interest is that any such hypothesis fleshed out in the formal framework of coupled oscillators makes a host of predictions about the covariation of durations within versus across units. Thus, the single foot hypothesis for (CVVCV) predicts a close relation between the duration of the first bimoraic and the second monomoraic vowel. This is because if some interval of absolute time (say 600 ms) is allocated to one unit, the single foot (CVVCV), then within that unit, the subunits which are twice the length of some other subunit (the first VV is twice the length of the second V) should surface with durations that closely match this proportionality. The key concept is relational timing. The same unit, here VV, can be allocated any absolute amount of time and then that time is distributed proportionally into absolute times for the subunits. But when comparing subunit durations across different units, as in (CV)(CVV), where the first V is in one foot and the second VV in the other foot, more variation in terms of their durations with respect to one another is predicted. This is because the V, VV subunits are parts of different supraordinate units.
The decisions of how much to allocate to each of the different units, (CV) and (CVV), are now separate and hence the link in absolute duration of one subunit under one foot and absolute duration of the subunit under a different foot is not as tight as when the two subunits are part of the same supraordinate unit. Note the parallels with results obtained in so far independent work such as ours in the present paper and prior work focusing more narrowly on (inter-)segmental properties within versus across syllables. For example, in data from German, Sotiropoulou and Gafos (2022) report compensatory effects in durations between subunits of the same syllable but no such effects between subunits of different syllables (e.g., a compensatory relation involving vowel initiation relative to the lateral’s target in ‘Klage’, where the /kl/ is in one syllable, versus no such relation in ‘pack Lage’, where the two segments in /kl/ belong to different syllables). The same pattern is discernible across languages too. For example, when comparing stop-lateral clusters in Spanish and (Moroccan) Arabic, previous work reports an IPI-C2 lateral compensatory relation in the former but not the latter language (Gafos et al. 2020; Sotiropoulou et al. 2020) in conformity with the well-accepted phonological hypothesis that such clusters are syllable onsets in Spanish but not in (Moroccan) Arabic. Across the domains of facts in O’Dell and Nieminen (2009) and the work discussed here, lawful relations between segment durations are seen in the former (within-unit organized segments) but not the latter case (across-unit segments). In sum, the reformulation of what it means for sequences of segments to be globally organized in terms of compensatory relations appears to be a promising avenue for future work. A proper assessment of this reformulation of course requires much further testing beyond the languages on the basis of which it has been developed (German and Spanish).

5 Conclusions

This paper has addressed the relation between syllabic structure and inter-segmental spatiotemporal coordination using data of word-initial stop-lateral clusters /bl, pl, gl, kl/ from Central Peninsular Spanish. Empirically, the present study extends an earlier study on Spanish by Sotiropoulou et al. (2020) by using two prosodic conditions (word boundary and utterance boundary) as opposed to just one (word boundary), the latter being the usual practice when it comes to investigating the relation between syllabic organization and inter-segmental spatiotemporal coordination. Using two prosodic conditions, we successfully elicited different degrees of lengthening of the initial C1 stop and of the interval between the plateaus of the two consonants in the stop-lateral clusters. Such effects of prosodic variation on segmental properties and on the overlap between the segments, we argue, are crucial to understanding the nature of the link between (phonological) syllabic organization and the spatiotemporal manifestation of that organization.

In the word boundary condition, the results showed local timing stability for all clusters, although stop-lateral clusters are prototypical onsets in Spanish and as such, on the basis of previous theorizing, such clusters should exhibit global timing stability. However, when adopting the approach where evidence for global organization is expressed in terms of how the string of segments (over which the nature of the phonological organization is assessed) responds to perturbations of localized properties (such as duration) within that string, effects of global organization begin to emerge. Specifically, we observed a compensatory relation such that as the lag between the consonants increases, the lateral duration tends to decrease, corroborating the same effect in Sotiropoulou et al. (2020). Such a compensatory effect is one indication of global organization that serves to bring the vowel to overlap with the cluster. That is, when the lag between the plateaus of the two consonants increases, shortening the lateral compensates by bringing the vowel to overlap with its tautosyllabic cluster. Another compensatory effect emerges when looking at vowel initiation relative to the prevocalic lateral’s target. Specifically, the lag between vowel initiation and the prevocalic lateral’s target decreases in CCV compared to CV when the initial part of the CCV string expands due to prosodic strengthening in the utterance boundary condition, while it does not change between CV and CCV in the word boundary condition. As a result of this expansion in the utterance boundary condition, the inner CV substring in CCV compresses and earlier vowel initiation with respect to the prevocalic lateral is observed: when C1 and IPI lengthen due to prosodic strengthening, the rest of the string (the inner CV) shortens to compensate for that lengthening of the initial part of the CCV string. 
In other words, earlier vowel initiation in CCV compared to CV in the context of prosodic strengthening is another species of compensatory effect, just like the IPI-C2 lateral duration relation, which serves to indicate the presence of a global organization by bringing the vowel to overlap more with the cluster (than would otherwise be the case if vowel initiation did not occur earlier, as in the word boundary condition). With respect to the stability-based heuristics, we continue to find local timing stability also in the case of prosodic strengthening. That is, the lengthening of the initial consonant along with the lengthening of the interplateau interval did not suffice to perturb the stability of the local timing interval. This result suggests that the stability-based heuristics remain unreliable even in the case of prosodic strengthening in Spanish. This is so because the way phonetic indices respond to perturbations depends on the language at hand and on the actual extent of those perturbations (here, in terms of the degree of lengthening of the phonetic parameters induced by prosodic strengthening).
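As a purely illustrative aside, a compensatory relation of the IPI-C2 lateral duration kind would show up as a negative slope when lateral duration is regressed on the interplateau interval. The sketch below uses a plain least-squares slope on made-up values; it is not the statistical analysis of the present study, and the numbers, variable names and the helper `ols_slope` are hypothetical.

```python
def ols_slope(x, y):
    """Least-squares slope of y regressed on x (simple linear regression)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance_x = sum((xi - mean_x) ** 2 for xi in x)
    return covariance / variance_x


# Hypothetical tokens (ms): interplateau interval (IPI) and C2 lateral duration.
# These values are invented for illustration only, not measured data.
ipi_ms = [20.0, 30.0, 40.0, 50.0, 60.0]
lateral_ms = [80.0, 74.0, 69.0, 61.0, 56.0]

slope = ols_slope(ipi_ms, lateral_ms)
# A negative slope means longer IPI goes with shorter lateral duration,
# which is the signature of a compensatory relation.
print(f"slope: {slope:.2f} ms of lateral duration per ms of IPI")
```

A slope near zero would instead indicate that the lateral does not adjust to changes in the interplateau interval.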

To conclude, our results indicate that syllabic organization is expressed in terms of compensatory relations among phonetic parameters rather than via a privileged metric such as c-center stability or any other such single measure as has been assumed in prior work. Uniformity of phonological organization (that is, the statement that for instance both German and Spanish assign the same organization to a /gla/ sequence) does not imply uniqueness or universality of phonetic exponents of that organization across the two languages. What uniformity does imply is that, in a /gla/ sequence, a local change to some part of the sequence produces (language-particular) compensatory effects propagating to other parts of the sequence. It is the presence of these readjustment effects among different parts of the whole that reveals its global organization. Variability, introduced in our study via prosodic modulations, is absolutely crucial in demonstrating this result because it is only when the sequence of segments that partake in the presumed organization is perturbed (i.e., varied) that these compensatory effects can be observed.


Corresponding author: Stavroula Sotiropoulou, University of Potsdam, Potsdam, Germany, E-mail:

Funding source: Deutsche Forschungsgemeinschaft (DFG, German Research Foundation)

Award Identifier / Grant number: Project ID 317633480 - SFB 1287

  1. Competing interests: The authors declare that they have no competing interests.

  2. Research funding: This work has been funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project ID 317633480 – SFB 1287.

References

Abercrombie, David. 1964. A phonetician’s view of verse structure. Linguistics 2. 5–13. https://doi.org/10.1515/ling.1964.2.6.5.

Abercrombie, David. 1967. Elements of general phonetics. Edinburgh: Edinburgh University Press.

Aldrich, Alexander C. & Miquel Simonet. 2019. Duration of syllable nuclei in Spanish. Studies in Hispanic and Lusophone Linguistics 12(2). 247–280. https://doi.org/10.1515/shll-2019-2012.

Barbosa, Plínio A. 2002. Explaining cross-linguistic rhythmic variability via a coupled-oscillator model of rhythm production. In Proceedings of Speech Prosody 2002, 163–166. https://doi.org/10.21437/SpeechProsody.2002-26.

Barbosa, Plínio A. 2007. From syntax to acoustic duration: A dynamical model of speech rhythm production. Speech Communication 49. 725–742. https://doi.org/10.1016/j.specom.2007.04.013.

Bates, Douglas, Martin Maechler, Ben Bolker & Steve Walker. 2015. Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67(1). 1–48. https://doi.org/10.18637/jss.v067.i01.

Bombien, Lasse. 2011. Segmental and prosodic aspects in the production of consonant clusters. Munich: Ludwig-Maximilians-Universität dissertation.

Bradley, Travis. 2006. Spanish complex onsets and the phonetics-phonology interface. In Fernando Martinez Gil & Sonia Colina (eds.), Optimality-Theoretic studies in Spanish phonology, 15–38. Amsterdam: John Benjamins. https://doi.org/10.1075/la.99.02bra.

Browman, Catherine P. & Louis Goldstein. 1988. Some notes on syllable structure in articulatory phonology. Phonetica 45(2–4). 140–155. https://doi.org/10.1159/000261823.

Browman, Catherine P. & Louis Goldstein. 2000. Competing constraints on intergestural coordination and self-organization of phonological structures. Bulletin de la Communication Parlée 5. 25–34.

Brunner, Jana, Christian Geng, Stavroula Sotiropoulou & Adamantios Gafos. 2014. Timing of German onset and word boundary clusters. Laboratory Phonology 5(4). 403–454. https://doi.org/10.1515/lp-2014-0014.

Byrd, Dani. 1995. C-Centers revisited. Phonetica 52. 263–282. https://doi.org/10.1159/000262183.

Byrd, Dani & Susie Choi. 2010. At the juncture of prosody, phonology, and phonetics–The interaction of phrasal and syllable structure in shaping the timing of consonant gestures. In Cécile Fougeron, Barbara Kühnert, Mariapaola d’Imperio & Nathalie Vallée (eds.), Papers in Laboratory Phonology 10. Berlin: Mouton de Gruyter. https://doi.org/10.1515/9783110224917.1.31.

Colantoni, Laura & Jeffrey Steele. 2005. Phonetically-driven epenthesis asymmetries in French and Spanish obstruent-liquid clusters. In Randall Gess & Edward J. Rubin (eds.), Experimental and theoretical approaches to Romance linguistics, 77–96. Amsterdam: John Benjamins. https://doi.org/10.1075/cilt.272.06col.

Cuenca, Mary H. 1996. Análisis instrumental de la duración de las vocales en español. Philologia Hispalensis 11. 295–307. https://doi.org/10.12795/ph.19961997.v11.i01.20.

Dauer, Rebecca M. 1983. Stress-timing and syllable-timing reanalyzed. Journal of Phonetics 11. 51–62. https://doi.org/10.1016/s0095-4470(19)30776-4.

Durvasula, Karthik, Mohammed Qasem Ruthan, Sarah Heidenreich & Yen-Hwei Lin. 2021. Probing syllable structure through acoustic measurements: Case studies on American English and Jazani Arabic. Phonology 38. 173–202. https://doi.org/10.1017/s0952675721000142.

Farnetani, Edda & Shiro Kori. 1986. Effects of syllable and word structure on segmental durations in spoken Italian. Speech Communication 5. 17–34. https://doi.org/10.1016/0167-6393(86)90027-0.

Fowler, Carol A. 1981. A relationship between coarticulation and compensatory shortening. Phonetica 38. 35–50. https://doi.org/10.1159/000260013.

Gafos, Adamantios. 2002. A grammar of gestural coordination. Natural Language & Linguistic Theory 20(2). 269–337. https://doi.org/10.1023/a:1014942312445.

Gafos, Adamantios, Simon Charlow, Jason Shaw & Philip Hoole. 2014. Stochastic time analysis of syllable referential intervals and simplex onsets. Journal of Phonetics 44. 152–166. https://doi.org/10.1016/j.wocn.2013.11.007.

Gafos, Adamantios, Jens Roeser, Stavroula Sotiropoulou, Philip Hoole & Chakir Zeroual. 2020. Structure in mind, structure in vocal tract. Natural Language & Linguistic Theory 38. 43–75. https://doi.org/10.1007/s11049-019-09445-y.

Gibson, Mark, Stavroula Sotiropoulou, Stephen Tobin & Adamantios Gafos. 2017. On some temporal properties of Spanish consonant-liquid and consonant-rhotic clusters. In Malte Belz, Susanne Fuchs, Stefanie Jannedy, Christine Mooshammer, Oksana Rasskazova & Marzena Zygis (eds.), Proceedings of the 13th Tagung Phonetik und Phonologie im deutschsprachigen Raum (PP13), 73–76. Berlin: Humboldt Universität zu Berlin.

Gibson, Mark, Stavroula Sotiropoulou, Stephen Tobin & Adamantios Gafos. 2019. Temporal aspects of word initial single consonants and consonants in clusters in Spanish. Phonetica 76(6). 448–478. https://doi.org/10.1159/000501508.

Goldstein, Louis, Ioana Chitoran & Elisabeth Selkirk. 2007. Syllable structure as coupled oscillator modes: Evidence from Georgian versus Tashlhiyt Berber. In Jürgen Trouvain & William J. Barry (eds.), Proceedings of the sixteenth International Congress of Phonetic Sciences (ICPhS), 241–244. Saarbrücken: Univ. des Saarlandes.

Gracco, Vincent L. & James H. Abbs. 1988. Central patterning of speech movements. Experimental Brain Research 71. 515–526. https://doi.org/10.1007/bf00248744.

Haggard, Mark. 1973. Abbreviation of consonants in English pre-and post-vocalic clusters. Journal of Phonetics 1. 9–24. https://doi.org/10.1016/s0095-4470(19)31378-6.

Hermes, Anne, Bastian Auris & Doris Mücke. 2015. Computational modelling for syllabification patterns in Tashlhiyt Berber and Maltese. In Susanne Fuchs, Martine Grice, Anne Hermes, Leonardo Lancia & Doris Mücke (eds.), Proceedings of the 10th International Seminar on Speech Production (ISSP), 186–189. Cologne: Universität zu Köln.

Hermes, Anne, Doris Mücke & Bastian Auris. 2017. The variability of syllable patterns in Tashlhiyt Berber and Polish. Journal of Phonetics 64. 127–144. https://doi.org/10.1016/j.wocn.2017.05.004.

Hermes, Anne, Doris Mücke & Martine Grice. 2013. Gestural coordination of Italian word-initial clusters: The case of ‘impure s’. Phonology 30. 1–25. https://doi.org/10.1017/s095267571300002x.

Honorof, Douglas & Catherine P. Browman. 1995. The centre or edge: How are consonant clusters organised with respect to the vowel? In Kjell Elenius & Peter Branderud (eds.), Proceedings of the 13th International Congress of the Phonetic Sciences (ICPhS), 552–555. Stockholm: Stockholm University.

Hothorn, Torsten, Frank Bretz & Peter Westfall. 2008. Simultaneous inference in general parametric models. Biometrical Journal 50(3). 346–363. https://doi.org/10.1002/bimj.200810425.

Katz, Jonah. 2012. Compression effects in English. Journal of Phonetics 40. 390–402. https://doi.org/10.1016/j.wocn.2012.02.004.

Lenth, Russell & Maxime Hervé. 2015. Package ‘lsmeans’. Available at: cran.r-project.org/web/packages/lsmeans/lsmeans.pdf.

Lindblom, Björn & Karin Rapp. 1973. Some temporal regularities of spoken Swedish. Papers from the Institute of Linguistics: University of Stockholm 21. 1–59.

Maddieson, Ian. 1985. Phonetic cues to syllabification. In Victoria Fromkin (ed.), Phonetic linguistics: Essays in honor of Peter Ladefoged, 203–221. New York, NY: Academic Press.

Malmberg, Bertil. 1965. Estudios de fonética hispánica (Collectanea Phonetica, I.). Madrid: Consejo Superior de Investigaciones Científicas.

Marin, Stefania. 2013. The temporal organization of complex onsets and codas in Romanian: A gestural approach. Journal of Phonetics 41. 211–227. https://doi.org/10.1016/j.wocn.2013.02.001.

Marin, Stefania & Marianne Pouplier. 2010. Temporal organization of complex onsets and codas in American English: Testing the predictions of a gestural coupling model. Motor Control 14(3). 380–407. https://doi.org/10.1123/mcj.14.3.380.

Munhall, Kevin, Carol Fowler, Sarah Hawkins & Elliot Saltzman. 1992. Compensatory shortening in monosyllables of spoken English. Journal of Phonetics 20. 225–239. https://doi.org/10.1016/s0095-4470(19)30624-2.

Nam, Hosung & Elliot Saltzman. 2003. A competitive, coupled oscillator model of syllable structure. In Maria-Josep Solé, Daniel Recasens & Joaquin Romero (eds.), Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS), 2253–2256. Barcelona: Universitat Autònoma de Barcelona.

O’Dell, Michael L. & Tommi Nieminen. 1998. Reasons for an underlying unity in rhythm dichotomy. Linguistica Uralica 3. 178–185. https://doi.org/10.3176/lu.1998.3.04.

O’Dell, Michael L. & Tommi Nieminen. 2002. How long is a stress group? Cadernos de Estudos Lingüísticos 43. 93–108. https://doi.org/10.20396/cel.v43i0.8637151.

O’Dell, Michael L. & Tommi Nieminen. 2009. Coupled oscillator model for speech timing: Overview and examples. In Martti Vainio, Reijo Aulanko & Olli Aaltonen (eds.), Nordic Prosody: Proceedings of the 10th conference, Helsinki, Finland, 179–190. Frankfurt: Peter Lang.

Pike, Kenneth. 1945. The intonation of American English. Ann Arbor, Michigan: University of Michigan Press.

Pouplier, Marianne. 2012. The gestural approach to syllable structure. In Susanne Fuchs, Melanie Weirich, Daniel Pape & Pascal Perrier (eds.), Universal, language and cluster-specific aspects. Speech planning and dynamics, 63–96. Frankfurt am Main: Peter Lang AG.

Pouplier, Marianne & Stefan Benus. 2011. On the phonetic status of syllabic consonants: Evidence from Slovak. Laboratory Phonology 2(2). 243–273. https://doi.org/10.1515/labphon.2011.009.

Ramus, Franck, Marina Nespor & Jacques Mehler. 1999. Correlates of linguistic rhythm in the speech signal. Cognition 73. 265–292. https://doi.org/10.1016/s0010-0277(99)00058-x.

RStudio Team. 2015. RStudio: Integrated development for R. Boston, MA: RStudio, Inc. Available at: http://www.rstudio.com/.

Schöner, Gregor. 2002. Timing, clocks, and dynamical systems. Brain & Cognition 48. 31–51. https://doi.org/10.1006/brcg.2001.1302.

Shaw, Jason & Adamantios Gafos. 2015. Stochastic time models of syllable structure. PLoS One 10(5). https://doi.org/10.1371/journal.pone.0124714.

Shaw, Jason, Adamantios Gafos, Philip Hoole & Chakir Zeroual. 2009. Syllabification in Moroccan Arabic: Evidence from patterns of temporal stability. Phonology 26(1). 187–215. https://doi.org/10.1017/s0952675709001754.

Shaw, Jason, Adamantios Gafos, Philip Hoole & Chakir Zeroual. 2011. Dynamic invariance in the phonetic expression of syllable structure. Phonology 28. 455–490. https://doi.org/10.1017/s0952675711000224.

Sotiropoulou, Stavroula & Adamantios Gafos. 2022. Syllabic organization and phonetic indices in German stop-lateral clusters. Laboratory Phonology 13(1). 1–43. https://doi.org/10.16995/labphon.6440.

Sotiropoulou, Stavroula, Mark Gibson & Adamantios Gafos. 2020. Global organization in Spanish onsets. Journal of Phonetics 82. https://doi.org/10.1016/j.wocn.2020.100995.

Tilsen, Sam, Draga Zec, Christina Bjorndahl, Becky Butler, Marie-Josee L’Esperance, Alison Fisher, Linda Heimisdottir, Margaret Renwick & Chelsea Sanker. 2012. A cross-linguistic investigation of articulatory coordination in word-initial consonant clusters. Cornell Working Papers in Phonetics and Phonology 2012. 51–81.

Van Santen, Jan P. H. 1992. Contextual effects on vowel duration. Speech Communication 11. 513–546. https://doi.org/10.1016/0167-6393(92)90027-5.

Waals, Juliette. 1999. An experimental view of the Dutch syllable. Utrecht, Netherlands: UiL OTS/Utrecht University dissertation.

Published Online: 2024-09-06
Published in Print: 2024-09-25

© 2024 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 23.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/shll-2024-2014/html