‘Disfluency’ features in bilingual speech: meaning and methodology

Rachel Varra
Published/Copyright: April 30, 2025

Abstract

This study provides a concise review of the often under-scrutinized interpretation of ‘disfluency’ features – fillers, pauses, false starts – in the speech of bilingual individuals, here collectively called ‘flags’. It urges that the interpretations attributed to these asemantic and multifunctional speech elements be empirically justified and demonstrates a means by which to justify (or not) meanings commonly attributed to flags. This study is uncommon among those that explore flagging/disfluency behaviors in that the number of individuals whose speech serves as the source of data is large, coming from a corpus of 113 sociolinguistic interviews with Spanish-English bilinguals in New York City. In particular, this study explores one interpretation of these elements: that flags demonstrate a speaker’s awareness of the origin or ‘belongingness’ of a nearby English-origin string (which is here called the ‘language membership’ assumption). It formulates this interpretation as a set of hypotheses and describes the sort of data from the corpus that would support these hypotheses. That is, it explores whether there is evidence to support the idea (a) that speakers use flagging elements to indicate their awareness of and point out the ‘other-language’ elements in their discourse and (b) that what is foreign-origin or ‘other language’ is also perceived as not being integrated into the Spanish lexicon. Results indicate that there is evidence in this corpus and among these bilinguals to support both versions of the hypothesis. Implications and limitations are discussed briefly.


Corresponding author: Rachel Varra, College of William & Mary, Williamsburg, Virginia, USA

Appendix A: Sociodemographic characteristics of the sample

Ethnonational affiliation was determined by (a) place of birth, if the informant was born outside of the U.S., or (b) self-identification with one of six Spanish-speaking nations, if the informant was U.S.-born.

n
Colombia 18
Ecuador 19
Mexico 21
Dominican Republic 18
Puerto Rico 21
Cuba 16

Region refers to whether an informant’s ethnonational affiliation is based in the Caribbean (n = 58) or the Latin American mainland (n = 55).

Areal is whether the informant comes from an inland or coastal region.

n
Lowlands or coast 67
Highlands or interior 46

Sex means biological sex.

n
Male 57
Female 56

Age is the age of the informant at the time of the interview.

n
Teens (ages 13–19 inclusive) 16
Young adults (ages 20–39 inclusive) 75
Middle-aged adults (ages 40–59 inclusive) 20
60 plus-ers (age 60 and above) 2

Age of arrival is the age at which the informant came to the United States. Other groupings of this variable were examined, but this division yielded the most significant correlations with lexical transfer behavior.

n
U.S.-born (U.S.-born or arrived ≤ age 3) 28
Child arrivals (arrived from ages 4 to 12) 20
Teenagers (arrived from ages 13–19) 24
Older arrivals (arrived ≥ age 20) 41
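
The age-of-arrival grouping above can be illustrated with a short binning function. This is only a sketch of the cut-offs as listed in the table; the function name, argument names and group labels are assumptions made for illustration and do not come from the study.

```python
from typing import Optional


def age_of_arrival_group(us_born: bool, arrival_age: Optional[int] = None) -> str:
    """Return the age-of-arrival group for one informant (hypothetical helper)."""
    # Informants born in the U.S. or arriving by age 3 fall in the same group.
    if us_born or (arrival_age is not None and arrival_age <= 3):
        return "U.S.-born"
    if arrival_age is None:
        raise ValueError("arrival_age is required for informants born outside the U.S.")
    if arrival_age <= 12:
        return "Child arrival"      # ages 4 to 12
    if arrival_age <= 19:
        return "Teenager"           # ages 13 to 19
    return "Older arrival"          # age 20 and above


print(age_of_arrival_group(False, 12))  # Child arrival
```

Keeping the cut-offs in one function makes it easy to try the alternative groupings that the text says were also examined.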

Years in the U.S. refers to the amount of time the informant had spent in the United States up to the time of the interview.

n
15 years or less 53
  Recent (0–2 years in U.S.) [14]
  Long (3–15 years in U.S.) [39]
Over 15 years + Native 60
  16 years + [22]
  Native (any 5 of first 8 years of education in U.S.) [38]

Social Class means self-ascribed social class.

n
Middle (+High) 70
Working 39

Unknown 4

Socioeconomic status (SES) was a classification of informants based on the more objective criteria of occupation and income: level A, the highest rating (n = 1), level B (n = 34), level C (n = 53), level D (n = 22) and unknown (n = 3).

Level of education was determined by the highest level of education attended, though not necessarily completed.

n
Elementary school 4
High school 34
College 54
Graduate 21

English skills is a qualitative self-assessment of an informant’s ability in English. Informants rated their English according to a four-point descriptive scale, later condensed into two categories.

n
Non-excellent 67
  Poor [1]
  Passable [32]
  Good [34]
Excellent 46

Spanish skills is a qualitative self-assessment of an informant’s ability in Spanish using the same four-point descriptive scale.

n
Non-excellent 62
  Poor [2]
  Passable [14]
  Good [46]
Excellent 50
Unrated 1

Language dominance condenses the composite score obtained by subtracting the English skills rating from the Spanish skills rating.

n
Balanced 32
Spanish dominant 35
English dominant 45

Spanish use in general

n
None 13
Low 43
Mid 40
High 17

Appendix B: Lexical characteristics of the data

Table 8: Phrasal category of single-word English-origin strings.

Category            Count    Percent (%)
common nouns        1,363    56.8
proper nouns (a)    191      7.9
adj.                121      5.0
disc. marker        549      22.9
verbs               70       2.9
adverbs             33       1.4
prep.               5        0.2
conj & compl        66       2.8
other               1        <0.1
total (a)           2,399    100

  a. The total of lexemes in single-word strings and multi-word strings does not add up to the total number of strings in the corpus as displayed in Table 1 (n = 4,178) because the part of speech or length of utterance of 10 strings could not be determined.

Table 9: Phrasal category of multi-word English-origin strings.

Category            Count    Percent (%)
common nouns        427      24.1
proper nouns        720      40.7
adj.                40       2.3
disc. marker        509      28.8
verbs               21       1.2
adverbs             12       0.7
prep.               11       0.6
conj & compl        20       1.1
other               9        0.5
total (a)           1,769    100

  a. The total of lexemes in single-word strings and multi-word strings does not add up to the total number of strings in the corpus as displayed in Table 1 (n = 4,178) because the part of speech or length of utterance of 10 strings could not be determined.

Table 10: Part of speech of lexemes in the sample as a whole.

Category            Count    Percent (%)
common nouns        779      43.1
proper nouns        452      25.0
adj.                360      19.9
disc. marker        69       3.8
verbs               53       2.9
adverbs             52       2.9
prep.               17       0.9
conj & compl        6        0.3
other (a)           20       1.1
total               1,808    100

  a. Lexemes in the ‘other’ category include determiners and quantifiers, verb particles (eso es lo que traen to, 428P), non-clausal set phrases, ellipsis phrases used in syntactically peripheral positions (such as What else, 330D) and numbers in NPs of measurement, addresses or dates (en agosto diecinueve nineteen eighty eight, y…, 333D).

Table 11: Part of speech of lexemes in single-word strings.

Category            Count    % of category
common nouns        561      65.9
proper nouns        96       11.3
adj.                86       10.1
disc. marker        33       3.9
verbs               36       4.2
adverbs             26       3.1
prep.               5        0.6
conj & compl        6
other               2
total (a)           851

  a. Note that the sum of lexemes used in single-word strings (n = 851, this table) and multi-word strings (n = 1,049, Table 12) does not equal the number of lexemes in Table 10: the two together total 1,900, while Table 10 sums to 1,808. This is because a single lexeme, such as high school, is counted only once per part of speech for the whole sample in Table 10, but is counted once for each part of speech when it appears alone and again for each part of speech it plays in multi-word strings. So, high school in me gusta el high school (noun in a single-word string), el high school equivalente (adjective in a single-word string), high school diploma (adjective in a multi-word string) and Catholic high school (noun in a multi-word string) adds a count of four to the totals for Tables 11 and 12. In Table 10 those instances count twice (once as a noun and once as an adjective).
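
The counting difference described in the note to Table 11 can be made concrete with a small sketch. It uses only the four high school examples from the note; the tuple layout and variable names are assumptions made for illustration, not the coding scheme actually used for the corpus.

```python
# Each tuple is one attestation: (lexeme, part of speech, string type).
attestations = [
    ("high school", "noun", "single"),       # me gusta el high school
    ("high school", "adjective", "single"),  # el high school equivalente
    ("high school", "adjective", "multi"),   # high school diploma
    ("high school", "noun", "multi"),        # Catholic high school
]

# Table 10 logic: one count per (lexeme, part of speech) pair in the whole sample.
whole_sample = {(lex, pos) for lex, pos, _ in attestations}

# Tables 11 and 12 logic: one count per (lexeme, part of speech) pair
# within each string type, so the same pair can be counted twice overall.
single_word = {(lex, pos) for lex, pos, kind in attestations if kind == "single"}
multi_word = {(lex, pos) for lex, pos, kind in attestations if kind == "multi"}

table_10_count = len(whole_sample)
tables_11_12_count = len(single_word) + len(multi_word)
print(table_10_count, tables_11_12_count)  # 2 4
```

Under these assumptions the four attestations contribute two to Table 10 and four to Tables 11 and 12 combined, which is the kind of gap that separates the totals of 1,808 and 1,900.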

Table 12: Part of speech of lexemes in multi-word strings.

Category            Count    % of category
common nouns        281      26.8
proper nouns        361      34.4
adj.                284      27.1
disc. marker        43       4.1
verbs               17       1.6
adverbs             28       2.7
prep.               13       1.2
conj & compl        2        0.2
other               20       1.9
total (a)           1,049    100

  a. See note in Table 11.

Table 13: Lexeme-type count for ‘sharedness’ measures for sample as a whole.

Category                                                            Count       Percent (%)
nonshared: nonce (used 1x)                                          980         59.0
nonshared: idiosyncratic (used multiple times by only 1 person)     213         12.8
periodic (use by 2–4 people)                                        375         22.6
shared: recurrent (use by 5–7 people)                               44          2.7
shared: widespread (used by 8+ people)                              47          2.8
total                                                               1,659 (a)   100

  a. Note that the lexeme count in this table does not match the count in Table 6. Table 13 shows the count for each unique lexeme used in the group as a whole. Thus any use of ‘high school’, uttered by any number of people, adds one to the count of its sharedness type (‘widespread’). Table 6 shows the lexeme count in each category as a sum of those used by each participant. Thus, if ‘high school’ is used by five participants (uttered any number of times), it adds five to its sharedness rate. Note also that the total lexeme count does not equal the sum of lexemes by part of speech (Table 10). That is because, when counting by part of speech, a string such as high school is counted once for all instances where it occurs as a noun and once for those where it functions as an adjective. However, if high school is a widespread word-form (and it is), the string is counted as a single lexeme in this table. In other words, there are 149 instances where a lexeme is used as more than one part of speech.
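
The sharedness categories of Table 13, and the two ways of counting contrasted in its note, can be sketched as follows. The thresholds are taken from the table headings; the function name, the speaker codes and the usage figures are hypothetical and serve only to illustrate the logic.

```python
from collections import Counter
from typing import Dict


def sharedness(uses_by_speaker: Dict[str, int]) -> str:
    """Classify one lexeme by how many speakers used it (labels follow Table 13)."""
    speakers = sum(1 for n in uses_by_speaker.values() if n > 0)
    total_uses = sum(uses_by_speaker.values())
    if speakers == 0:
        raise ValueError("lexeme has no recorded uses")
    if speakers == 1:
        # One speaker only: a single use is 'nonce', repeated use is 'idiosyncratic'.
        return "nonce" if total_uses == 1 else "idiosyncratic"
    if speakers <= 4:
        return "periodic"
    if speakers <= 7:
        return "recurrent"
    return "widespread"


# Hypothetical usage data: lexeme -> {speaker code: number of uses}.
lexicon = {
    "high school": {f"S{i}": 1 for i in range(1, 9)},  # 8 speakers
    "subway": {"S1": 1},                               # used once
    "rent": {"S2": 4},                                 # one speaker, several uses
}

# Table 13 logic: each unique lexeme adds one to its category.
by_type = Counter(sharedness(uses) for uses in lexicon.values())

# Table 6 logic as the note describes it: each lexeme adds one per speaker who used it.
by_participant = Counter()
for uses in lexicon.values():
    by_participant[sharedness(uses)] += len(uses)

print(by_type["widespread"], by_participant["widespread"])  # 1 8
```

In this toy data a widespread lexeme contributes one to the type count but eight to the per-participant count, which is why the two tables are not expected to agree.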

Published Online: 2025-04-30
Published in Print: 2025-05-26

© 2025 Walter de Gruyter GmbH, Berlin/Boston

Downloaded on 29.11.2025 from https://www.degruyterbrill.com/document/doi/10.1515/shll-2025-2008/html?lang=en