A coding scheme for request for confirmation sequences across languages

Katharina König; Martin Pfeiffer; Kathrin Weber

doi:10.1515/opli-2025-0056

Article Open Access


Published/Copyright: July 9, 2025


From the journal Open Linguistics Volume 11 Issue 1

Abstract

This article is part of a special issue on cross-linguistic analyses of request for confirmation (RfC) sequences in ten languages. It presents the coding scheme that forms the basis of the comparative study of RfCs in mundane conversation. First, we outline the criteria applied to ensure comparability of data and describe the procedures for building a collection of 200 RfC sequences for each language. We then introduce the coding categories for the RfC turn and the subsequent response. The article also reports on the methods for testing and securing inter-coder reliability in our cross-linguistic project.

Keywords: coding; inter-coder reliability; comparative interactional linguistics; requests for confirmation; response design; polar questions; epistemics; pragmatic typology

1 Introduction

In requesting confirmation, speakers claim partial knowledge about a matter-at-hand while at the same time ascribing higher epistemic access, responsibility, or rights to the addressee (among others, Bolden 2010, Heritage 2012, König and Pfeiffer 2024, Raymond and Stivers 2016). By doing this, they make a response relevant in which the addressee (minimally) confirms or disconfirms (for instance, Seuren and Huiskes 2017, Gubina et al. 2024, Steensig and Heinemann 2013). Request for confirmation (RfC) sequences, therefore, provide an important sequential context for establishing and maintaining intersubjectivity and negotiating epistemic stances.

To date, little is known about how languages differ in the design of turns-at-talk that request confirmation and of turns that answer this particular questioning stance (but see Stivers et al. 2010, Bolden et al. 2023 for comparative studies of polar questions that include RfCs among other actions). The special issue ‘Request for confirmation sequences across ten languages’, in which this article is included, offers a first systematic comparative analysis of their use in mundane conversation. It presents results from a project conducted by the Scientific Network ‘Interactional Linguistics’,^[1] an international group of researchers of grammar in interaction in American and British English, Castilian Spanish, Czech, Egyptian Arabic, German, Hebrew, Korean, Low German, Mandarin Chinese, and Yurakaré. The network’s analytic procedures are rooted in the conversation-analytic and interactional-linguistic study of language in social interaction (Couper-Kuhlen and Selting 2018, Fox et al. 2013, Schegloff 2006) from a pragmatic-typological perspective (Dingemanse et al. 2014, Floyd et al. 2020). In taking a comparative angle on RfCs, we are interested in the complex interplay of language and action (Sidnell and Enfield 2012), i.e. how the particular resources a language provides shape the accomplishment of RfCs and their responses, and how requesting confirmation (and the sequential trajectory it involves) shapes the linguistic formatting of such turns (König and Pfeiffer forthcoming b give a more detailed description of the project’s objectives). This article provides an overview of the procedures for collecting and coding RfC sequences across languages that form the basis for the studies presented in this issue.

Coding naturally occurring conversations inevitably involves a ‘sketchy’ account of the intricacies of highly contextualised social conduct (see Stivers 2015 for a methodological reflection). Our cross-linguistic endeavour, however, made it necessary not only to develop a generic definition of RfCs that is not based on language-specific categories (König and Pfeiffer forthcoming b) but also to find commensurable ‘concepts’ by which RfCs in typologically diverse languages can be compared adequately (Haspelmath 2010, König and Pfeiffer forthcoming a). The coding scheme, as we present it here, is the outcome of a reflexive research process grounded in iterative qualitative analyses and joint data sessions.

By now, it has become an established practice for cross-linguistic interactional research projects to make their coding schemes accessible in order to provide a basis for replicating, refining, and enhancing the analytic procedures (see, for instance, Dingemanse et al. 2016, Küttner et al. 2023, Floyd et al. 2020, Stivers and Enfield 2010). In line with these publications, the present article’s main aim is to provide a descriptive account of the analytic concepts that form the basis of the study of RfCs in ten languages presented in this issue. In the following sections, we outline the network’s rationale in building comparative collections and give a brief overview of the sampling procedures applied (Section 2). We then present the scheme for coding RfC sequences as it was developed in the course of the network’s project (Section 3). In Section 4, we describe the procedures applied for conducting an inter-coder reliability (ICR) check and discuss the results. Section 5 concludes with a reflection on the methodological value of the coding scheme.

2 Comparable datasets, collections, and codings

For our cross-linguistic study of RfC sequences, we relied on pre-existing corpora of naturally occurring interactional data that had been collected as part of other research projects. This required us to develop criteria with which to identify conversations that are as comparable as possible in terms of the interlocutors’ social relationship, the interactional setting, the activities pursued (Section 2.1), and the modes of data representation (Section 2.2). Moreover, as we are interested in the relative importance of particular linguistic resources for building RfCs and responding to them, we had to develop a procedure for sampling RfCs from the corpora that enables a quantitative comparison (Section 2.3). All coding categories were jointly developed in data sessions, adjusted on the basis of iterative qualitative analyses, and then tested for ICR (Section 4 provides a detailed description of the statistical procedures).

2.1 Comparability of the data

In order to ensure the comparability of the conversational data used for our study, we decided to include recordings from similar interactional situations and recording settings. We did not pre-select specific activity contexts in which RfCs are to be studied.^[2] Instead, our approach seeks to explore RfCs as a generic action for obtaining confirmation or disconfirmation in casual everyday encounters. Ideally, data should cover

mundane, casual, open-ended conversations (this excludes task-based institutional communication or media talk, experimental or otherwise elicited talk with predetermined topics or a pre-given time frame),
among acquainted adults (2–4 participants, preferably friends and family),
focussed (no long silences or adjournments, no multi-activity settings; i.e. conversation should be the main ‘task’ at hand),
in co-presence (preferably in a non-mobile setting such as sitting at the dinner table, on a sofa, etc.),
and video-recorded.

If it was not possible to obtain all RfC instances from such conversations, we allowed for the inclusion of other settings or activity contexts along the following preference scheme:

If not all of the aforementioned criteria could be met, researchers were asked to use video recordings of conversations among adults in a non-mobile setting (friends or family, preferably 2–4 interlocutors) in which participants deal with a particular task (e.g., planning a trip, preparing a presentation, cooking dinner, etc.).
If this was not possible, researchers were asked to use audio recordings of telephone calls among adults (friends or family) in which participants do not deal with a particular task but rather engage in casual, open-ended talk in a non-mobile setting.
If this was not possible, researchers were asked to use audio recordings of telephone calls among adults (friends or family) in which participants deal with a particular task (e.g., planning a trip or organising a birthday party) in a non-mobile setting.
If this was not possible, researchers were asked to use audio recordings of face-to-face conversations among adults (friends or family, preferably 2–4 interlocutors) in which participants do not deal with a particular task but rather engage in casual, open-ended talk in a non-mobile setting.
If this was not possible, researchers were asked to use audio recordings of face-to-face conversations among adults (friends or family, preferably 2–4 interlocutors) in which participants deal with a particular task (e.g., planning a trip, preparing a presentation, cooking dinner, etc.) in a non-mobile setting.

We developed this preference scheme as a method for ensuring comparability in working with data sets that had already been recorded prior to our comparative endeavour. This allowed us to include languages for which there is only limited documentation of non-task-based casual conversation.
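The preference scheme can be read as an ordered ranking of recording types. The following Python sketch illustrates this reading; it is not part of the network's tooling. The class `Recording`, its fields, and the function `preference_tier` are hypothetical names introduced for illustration, and the sketch only models the three dimensions that distinguish the steps of the scheme (video vs audio, telephone vs co-present, task-based vs open-ended), not the further criteria such as participant number or acquaintance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    name: str
    video: bool       # True for video recordings, False for audio only
    telephone: bool   # True for telephone calls, False for co-present talk
    task_based: bool  # True if participants deal with a particular task


def preference_tier(rec: Recording) -> int:
    """Return 0 for the ideal data type and 1 to 5 for the fallback steps."""
    if rec.video and rec.telephone:
        # Video calls are not mentioned in the preference scheme.
        raise ValueError("recording type not covered by the preference scheme")
    if rec.video:
        base = 0   # video, co-present
    elif rec.telephone:
        base = 2   # audio, telephone
    else:
        base = 4   # audio, face-to-face
    # Within each recording type, open-ended talk is preferred over task-based talk.
    return base + (1 if rec.task_based else 0)


# Invented example recordings, ranked from most to least preferred.
recordings = [
    Recording("audio_dinner", video=False, telephone=False, task_based=False),
    Recording("video_cooking", video=True, telephone=False, task_based=True),
    Recording("phone_chat", video=False, telephone=True, task_based=False),
]
ranked = sorted(recordings, key=preference_tier)
print([r.name for r in ranked])  # ['video_cooking', 'phone_chat', 'audio_dinner']
```

Under these assumptions, a researcher would work through the recordings in ranked order and move to a lower tier only when higher tiers do not yield enough RfC sequences.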

2.2 Notes on transcription, glossing, and translation

In transcribing the data, we did not work with a unified system but rather asked network members to use the standards that have already been established for their particular language. Bodily resources were to be transcribed if they were relevant for accomplishing the RfC sequence. Transcriptions also include a glossing for the target line and an English translation. Translating into English warrants a note of caution in a cross-linguistic study that seeks to explore language-specific practices in RfC sequences. In English, speakers use interrogative and declarative syntax for distinguishing different epistemic stances (Heritage 2012), while speakers of other languages do not. Moreover, finding near-equivalent translations for conventionalised forms such as change-of-state tokens, tags, and response tokens is not always possible as the English paradigms cannot be directly mapped onto other languages (see Betz and Golato 2008 for a brief discussion). The translations of the transcripts provided in the articles in this issue should therefore only be read as close approximations to the practices under study.

2.3 Sampling RfCs from our data

As outlined above, RfCs are characterised by a recipient-tilted epistemic gradient with a relatively shallow slope (König and Pfeiffer forthcoming b). With RfCs, speakers position themselves as partially knowing and introduce a new confirmable proposition to the conversation, but ascribe primary epistemic access, rights, or responsibility to the addressee, making a confirming or disconfirming response relevant. Excerpt (1) is a prototypical example of an RfC sequence in Czech. NOR announces that she has to take care of the laundry. Following his go-ahead response, JAN presents an inference from NOR’s prior turn as a confirmable and adds the tag jo? (‘right’), thereby assigning NOR primary epistemic rights.

Building on this definition, researchers were asked to collect approximately 200 RfC sequences from continuous stretches of talk (see also Dingemanse et al. 2016, 36 for a similar procedure). At the same time, it was deemed important that we minimised a potential speaker bias and avoided an overrepresentation of particular conversations which (for whatever reason^[3]) contain more RfCs than others. We therefore decided to collect a maximum of 15 RfC sequences per conversation (starting at minute five of the recording). If there were fewer instances in a recording, researchers moved on to the next conversation.
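The sampling constraints described here (start at minute five, at most 15 RfC sequences per conversation, approximately 200 per language) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the procedure actually implemented by the network: the function name `sample_rfcs`, its parameters, and the toy data are hypothetical, the RfCs are assumed to have been identified by the analyst beforehand, and the target of 200 is treated as a hard upper limit although the text speaks of an approximate number.

```python
def sample_rfcs(conversations, per_conversation=15, target=200, start_s=300.0):
    """Collect (conversation_id, onset) pairs under the sampling constraints.

    conversations: iterable of (conversation_id, onsets), where onsets are
    the start times in seconds of RfCs already identified by the analyst.
    """
    sample = []
    for conv_id, onsets in conversations:
        # Only consider RfCs from minute five onwards, in temporal order.
        eligible = sorted(t for t in onsets if t >= start_s)
        # Take at most `per_conversation` RfCs from any one conversation.
        for onset in eligible[:per_conversation]:
            if len(sample) >= target:
                return sample
            sample.append((conv_id, onset))
    return sample


# Invented toy data: one conversation with many RfCs, one with few.
toy = [
    ("conv_01", [120.0] + [300.0 + 10 * i for i in range(20)]),
    ("conv_02", [410.0, 655.5]),
]
sample = sample_rfcs(toy)
print(len(sample))  # 17
```

In the toy data, the first conversation contributes only 15 of its 20 eligible RfCs and the RfC before minute five is skipped, so the second conversation has to be consulted as well.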

3 Coding scheme

All coders used a custom-made Excel sheet that provided separate lines for each RfC and separate columns for all categories to be coded (with a pre-set list of options to make sure that variables were used consistently). Each line contained a comment field in which coders could account for their decisions or make further remarks about the RfC in question. The following sections give an overview of the categories that were used in the cross-linguistic study of RfC sequences presented in this special issue. We start with the basic information needed to itemise each RfC sequence (A). We then present a list of categories and variables as we operationalised them for coding RfC turns (B) and subsequent responses (C).

3.1 A: Basic data

The documentation of the following basic data for each RfC sequence primarily aims to facilitate access to the sequences under study. It allows for a general overview of the collected cases.

A1: Language. Each of the languages/varieties under study was referenced by an abbreviation (AE: American English | ARA: Egyptian Arabic | BE: British English | CHI: Mandarin Chinese | CZ: Czech | GER: German | HEB: Hebrew | KOR: Korean | LoG: Low German | SPA: Castilian Spanish | YUR: Yurakaré).^[4]
A2: Data identifier. Coders could choose an individual notation which allowed them to locate the RfC sequence in their data (e.g., IDs for corpora, for individual conversations, time codes, direct links to the examples). Every RfC sequence was listed in a separate line.
A3: Speaker. In order to document possible individual variation in the production of RfCs and their responses, coders provided some form of speaker identification. This was done according to the conventions of the respective corpora as long as each speaker was assigned a distinct form of speaker ID.
A4: Transcript of RfC. Coders were asked to provide a transcribed version of the RfC in the original language using the transcription system already developed for that language.
A5: Translation of RfC. Coders were asked to provide an English translation of the RfC (if applicable).
A6: Transcript of Response(s). Coders were asked to provide a transcribed version of the response(s) to the previous RfC in the original language using the transcription system already developed for that language. If the RfC in question did not receive any uptake, this was indicated by a short description: ([no response]). In the case of responses by more than one speaker in multi-party interactions, coders could include all responses in this cell; only one of these responses, however, was used as a basis for coding (see category C1).
A7: Translation of Response(s). Coders were asked to provide an English translation of the response (if applicable).

The actual coding of the sequences is not based on the decontextualised transcription of the RfCs and their responses, which only provide a reduced representation of natural conversations (Hepburn and Bolden 2013). Rather, coding was “based on the fullest representation of the data available” (Dingemanse et al. 2016, 37) and on fine-grained analyses of the RfC sequences in their respective contexts of use.

3.2 B: Coding requests for confirmation

This section presents an overview of the variables as they were operationalised for coding the RfC turn. It makes reference to relevant interactional research to illustrate how our approach has been informed by previous analytic procedures but also to point out where we had to diverge to ensure applicability to all languages under study.

3.2.1 B1: Syntactic complexity

Which syntactic design does the confirmable exhibit in terms of its complexity?

The variable syntactic complexity [B1] captures basic aspects of the syntactic design, i.e. differences in the syntactic complexity that the confirmable of the RfC exhibits (Section 2.3).

– Clause

The confirmable is realised in a full-blown clausal/sentential form that contains a verb. Note that this category captures declaratives but also insubordinate clauses, as well as other complex sentence formats.

– Phrase

The confirmable is realised as a phrase that consists of one or more lexical items without a verb.

– Other

The confirmable consists of more than a single phrase, but it does not exhibit a syntactically full-blown structure (e.g., in ellipses or anacolutha).

3.2.2 B2: Syntactic formats

Which syntactic formats are used for formulating RfCs?

We added another open-class variable without a pre-set list of options in order to account for language-specific ways of formulating RfCs. For the variable syntactic formats [B2], coders were asked to indicate different syntactic structures that proved to be relevant in their specific language (such as verb-first vs verb-second sentences in English (Küttner and Szczepek Reed 2024) or Low German (Weber 2024)). Due to the large variety of language-specific coding options, this variable is not examined in the cross-linguistic paper comparing RfCs in ten different languages (Pfeiffer et al. forthcoming) and is not included in the ICR check (Section 4). Some articles in the special issue nevertheless address this variable for a more detailed account of the RfC formatting in their specific languages.

3.2.3 B3: Polarity

In which polarity is the RfC cast?

Like the variables syntactic complexity [B1] and syntactic formats [B2], this category is concerned with the morphological or syntactic design of the confirmable. It asks whether an RfC is formatted with positive or negative polarity, as this may have an impact on the design of confirming or disconfirming responses (Heritage and Raymond 2021, Sadock and Zwicky 1985). The syntactic marking of negative polarity can vary from language to language. It can be realised by negation articles (such as kein (‘no’) in keine Kohle (‘no money’) in German, Deppermann et al. 2024), negation particles (such as miš (‘not’) in Egyptian Arabic, Marmorstein 2024), or bound morphemes such as post-verbal affixes in Korean (Kim 2024b). We did not include markings that operate on a lexical level (such as English satisfied vs dissatisfied). All RfCs with no explicit marker for negative polarity (including phrasal RfCs such as Arabic ʔāxir šahrĭ ʔuktobar, ‘at the end of October’, Marmorstein 2024) were coded as positive. In sum, this variable is primarily form-oriented in its design. We did not include action-based interpretations of polarity, such as counter-oriented argumentative pragmatic implicatures.

– Negative polarity

This code is used if the RfC contains a negative polarity item.

– Positive polarity

This code is used if the RfC does not contain a negative polarity item.

3.2.4 B4: Type of negative polarity

Which type of negative polarity marking is used?

Similar to the coding of syntactic complexity [B1], the coding of polarity [B3] provides a restricted set of coding options, which is unspecific regarding the respective language-specific options. Thus, if the variable polarity [B3] was coded as ‘negative polarity’, coders were asked to list the negative polarity items used in the RfC in this variable [B4]. As resources differ between languages, there was no pre-set list of options. If the polarity of the confirmable was positive, the cells were coded as ‘NA’ (not available). Like the variable syntactic formats [B2], this variable is not included in the quantitative cross-linguistic comparison (Pfeiffer et al. forthcoming), but was relevant for some of the language-specific studies (see e.g., Kim 2024a).

3.2.5 B5: Modulation

Is the confirmable modulated concerning the requester’s stance towards the confirmable’s validity?

In this category, we coded for the lexical (e.g., modal particles, modal adverbs, modal verbs in epistemic use, vagueness markers or approximators), morphological (e.g., subjunctive verb mode) and syntactic resources (e.g., conditional clauses) that requesters use to modulate the RfC and therefore position themselves with regard to the subjectively assumed validity of the confirmable (epistemic stance). These markers can mitigate or reinforce the requester’s commitment to the validity of the confirmable (such as the commitment marker =la in Yurakaré, Gipper 2024). Hedging devices, which serve to mark approximation or imprecise reference (e.g., in the Czech RfC to prostě- to je trošku jako krb; jako kamna; ‘that simply it’s a bit like a fireplace; like a stove;’ Oloff 2024), are also coded as modulations. RfCs formatted as negative interrogatives can be used to put up a positive assertion for confirmation (Heritage 2002, Hentschel 1996, such as in English: “Wasn’t your phone broken for a while?,” Küttner and Szczepek Reed 2024). Such instances of non-propositional negation were also coded as a type of modulation. If a confirmable contains modulation markers, coders were asked to list the markers. For the cross-linguistic analysis by Pfeiffer et al. (forthcoming), the variable [B5] was later transformed into a binary dummy variable consisting of a numerical ‘0–1’ code.

– Modulation marker

If a confirmable contains one or several modulation markers, the items are coded as characters.

– No modulation

This code is used if a confirmable is not modulated.
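The transformation of a character variable such as [B5] into a binary dummy variable can be illustrated with a minimal Python sketch. The function name `to_dummy` is hypothetical, and the representation of unmodulated cases (an empty cell or the label "no modulation") is an assumption, since the coding sheet's exact conventions are not specified here.

```python
def to_dummy(cell, absent_labels=("", "no modulation")):
    """Return 1 if the cell lists at least one marker, else 0."""
    if cell is None:
        return 0
    # Treat empty cells and the (assumed) absence label as 'not modulated'.
    return 0 if cell.strip().lower() in absent_labels else 1


# '=la' is the Yurakaré commitment marker mentioned above; other cells are invented.
cells = ["=la", "", "no modulation", None]
print([to_dummy(c) for c in cells])  # [1, 0, 0, 0]
```

The listed markers themselves are kept in the original column, so the language-specific information remains available for the individual studies while the dummy variable serves the quantitative comparison.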

3.2.6 B6: Inference marking

Is the RfC marked as an inference from prior talk?

This variable refers to linguistic resources speakers deploy to mark the confirmable as inferred from prior talk (upshots, candidate understandings, formulations; see Heritage and Watson 1979, Zinken and Küttner 2022). While there may be RfCs in which speakers present an inference without explicitly framing it as such, here we were only interested in coding those forms which explicitly mark the confirmable as an assumption or upshot which is inferred from what the other speaker has said before (see König and Pfeiffer forthcoming a for a methodological reflection). These markers include, among others, expressions such as Hebrew ‘az (‘so/then’, Ben-Moshe and Maschler 2024), Mandarin jiu(shi) (‘just’, Li 2024), Spanish pues (‘so/because’, Ehmer 2025) or change-of-state tokens if they indicate that new information has been inferred from prior talk (as opposed to marking something as ‘just remembered’, see e.g., Betz and Golato 2008). Comparable to the coding of modulations [B5], coders were asked to list the inference marker(s) if an RfC contains some form of overt inference marking. For the quantitative cross-linguistic analysis (Pfeiffer et al. forthcoming), this character variable was also transformed into a dummy variable by replacing the characters with a binary ‘0–1’ code.

– Inference marker

If an RfC contains forms of overt inference marking, the items are coded as characters.

– Without inference marking

This code is used if an RfC does not contain any overt inference marking.

3.2.7 B7: Connectives

Does the RfC contain connective devices?

In this category, we coded for connective devices, such as conjunctions and conjunctive adverbs (e.g., Hebrew ‘aval ‘but’, Ben-Moshe and Maschler 2024 or Czech tak ‘so/then’, Oloff 2024). In contrast to the established linguistic concept of connectives, the present study also subsumes discourse markers (such as English well, Küttner and Szczepek Reed 2024, or Arabic ṭayyib ‘okay’, Marmorstein 2024) and change-of-state tokens (e.g., German achso ‘I see’, Deppermann et al. 2024) under the category of connectives as they link the RfC to previous discourse. Inference markers in [B6] are also coded as connectives in this category. Similar to the categories modulation [B5] and inference marking [B6], if an RfC contains connective devices, coders were asked to list them. For the quantitative comparison (Pfeiffer et al. forthcoming), this character variable was transformed into a dummy variable by replacing the characters with a binary ‘0–1’ code.

– Connective item(s)

If an RfC contains connective devices, the items are coded as characters.

– Without connective(s)

This code is used if an RfC does not contain any connective devices.

3.2.8 B8: Tag

Is the RfC appended by a tag?

This category documents the presence or absence of tags in RfCs, understood as more or less formulaic expressions appended to an RfC that put the confirmable up for debate. Tags may function as epistemic stance markers that mobilise response (Enfield et al. 2012, Stivers and Rossano 2010). However, they can also express affective stances (for Egyptian Arabic see Marmorstein 2024, for Korean see Kim 2024b), which is why the common expression ‘question tag’ seems not to be suitable for all languages. They differ from grammaticised final particles (as, for instance, the question particle -ma in Mandarin, Li 2024) in that they are optional. While the category as we operationalise it here is open to different forms (such as reversed polarity tags like isn’t it in English (Küttner and Szczepek Reed 2024) or particles such as Czech jo ‘yes’ or ne ‘no’, Oloff 2024) and different degrees of prosodic (dis)integration (in contrast to other definitions of final particles, e.g., Hancil et al. 2015), it is specified in terms of its position as an RfC-final element, i.e. we are interested in syntactically peripheral or post-positioned resources and their interactional import. Unlike turn-final conjunctions, tags do not refer to ‘unverbalised’ propositions, e.g., to ‘hanging implications’ (Mulder and Thompson 2008) or to expandable utterances (Koivisto 2012). Again, if an RfC includes a tag, coders were asked to list the tag as a character. The cross-linguistic quantitative analysis (Pfeiffer et al. forthcoming) turns this variable into a binary ‘0–1’ coding.

– Tag

If an RfC contains a tag, the tag is coded as a character.

– Without a tag

This code is used if an RfC does not contain a tag.

3.2.9 B9: Prosodic integration of tag

If there is a tag: Is the tag prosodically integrated into the RfC?

If the RfC contains a tag, this category codes whether the tag is prosodically integrated into the RfC (i.e. confirmable and tag are part of one intonation contour) or prosodically disintegrated (i.e. the tag is realised in a separate intonation contour). There may be pauses between confirmable and tag, but these were not criterial for a tag to be coded as disintegrated; it is also possible that latching occurs between the two components.

– Integrated

This code is used if the tag is prosodically integrated into the confirmable (if they are part of one intonation contour).

– Disintegrated

This code is used if the tag is not prosodically integrated into the confirmable (if they form separate intonation contours).

– NA

This code is used if the RfC does not contain a tag.

3.2.10 B10: Final intonation confirmable

What is the confirmable’s final pitch movement?

In this category, we coded for the final pitch movement of the confirmable (including those RfCs with a prosodically integrated tag, see variable prosodic integration of tag [B9]). If a tag is appended in a separate intonation contour, its prosodic design was coded separately (see variable final intonation tag [B11]). If a confirmable is realised in more than one intonation phrase, only the final pitch movement of the last intonation phrase was coded.

– Rise

This code is used if a confirmable is deployed with a rise-to-mid or a rise-to-high pitch contour.

– Level

This code is used if a confirmable is deployed with a level pitch contour.

– Fall

This code is used if a confirmable is deployed with a fall-to-mid or a fall-to-low pitch contour.

3.2.11 B11: Final intonation tag

If there is a tag in a separate contour: What is the tag’s final pitch movement?

If the RfC contains a tag in a separate intonation contour, the final pitch movement of the tag was coded separately.

– Rise

This code includes rise-to-mid and rise-to-high pitch movements.

– Level

This code captures level pitch movements.

– Fall

This code includes fall-to-mid and fall-to-low pitch movements.

– 0

This code is used if an RfC contains a tag but without a separate contour.

– NA

This code is used if the RfC does not contain a tag.
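The codes for tag [B8], prosodic integration of tag [B9], and final intonation tag [B11] depend on one another. A consistency check along the following lines could help to catch coding slips; it is a sketch and not part of the network's coding sheet. The function name `check_tag_codes` is hypothetical, and the sketch assumes that the absence of a tag in [B8] is represented by an empty cell.

```python
PITCH_CODES = {"Rise", "Level", "Fall"}


def check_tag_codes(b8_tag, b9, b11):
    """Return a list of inconsistencies between B8, B9 and B11 (empty if none)."""
    problems = []
    if not b8_tag:
        # No tag listed in B8 (assumed to be an empty cell or None).
        if b9 != "NA":
            problems.append("B9 should be NA when there is no tag")
        if b11 != "NA":
            problems.append("B11 should be NA when there is no tag")
    elif b9 == "Integrated":
        # Tag without a separate contour: B11 takes the code 0.
        if b11 != "0":
            problems.append("B11 should be 0 for an integrated tag")
    elif b9 == "Disintegrated":
        # Tag in a separate contour: its final pitch movement is coded.
        if b11 not in PITCH_CODES:
            problems.append("B11 should be Rise, Level or Fall for a disintegrated tag")
    else:
        problems.append("B9 should be Integrated or Disintegrated when a tag is present")
    return problems


print(check_tag_codes("jo", "Disintegrated", "Rise"))  # []
print(check_tag_codes("jo", "Integrated", "Rise"))
```

The first call describes a consistently coded RfC with a disintegrated tag; the second returns one problem because an integrated tag has no separate contour to be coded in [B11].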

3.3 C: Coding responses to requests for confirmation

This section presents an overview of the variables operationalised for coding responses to RfCs. Again, we refer to previous studies to contextualise our procedures in current research.

3.3.1 C1: Response

Is there a response?

In order to analyse RfCs as an adjacency pair sequence, we coded whether an RfC receives a response or not. Responses in [C1] may consist of verbal features only, embodied features only, or a combination of both.

– Yes

This code is applied if a verbal, an embodied, or a verbal and embodied response follows the RfC.

– No

This code is applied if no verbal, embodied, or verbal and embodied response follows the RfC.

3.3.2 C2: Verbal response

Is there a verbal response?

This category documents the occurrence or non-occurrence of a verbal response (or several verbal responses by different participants) to the RfC. It includes all utterances by other participants of the conversation that can be heard as dealing with the RfC, even if they do not directly answer it (so-called non-answer responses, among them repair initiations, hesitation particles, claims of not knowing, Stivers 2022, Chapter 3). Since verbal responses (in contrast to embodied responses [C3]) can be compared across all languages that are part of the research network, the [C2] coding forms the basis for the following NA-codings.

– Yes

This code is used when the RfC receives a verbal response.

– No

This code is used when the RfC does not receive a verbal response (e.g., a purely embodied response ([C3])).

– NA

This code is used if the variable response [C1] is tagged as ‘no’.

3.3.3 C3: Embodied response

Is there an embodied response?

In order to be able to determine the relevance of bodily resources in responses to RfCs, coders were asked to identify them in variable [C3], if video recordings were available. Note that we do not include this category in the cross-linguistic quantitative analysis, as some researchers studied telephone conversations or only had access to audio recordings of the conversations investigated. Given that recent studies include head nods or head shakes as type-conforming interjections in their (comparative) quantifications (Enfield et al. 2019, Stivers 2019, 2022), we nevertheless asked network members working with video data to code them to be able to determine their role in responses to RfCs – in particular those without verbal uptake. As various collections show that bodily resources usually co-occur with verbal responses (Oloff 2024) and rarely form a full response on their own (Deppermann et al. 2024), we decided not to include head nods and head shakes as response tokens (see [C6]).^[5]

– Yes

This code is used for (standalone or co-occurring) bodily resources that potentially deal with the conditional relevance established by the RfC (such as head nods, head shakes, shrugs, hand or facial gestures).

– No

This code is used if the RfC does not receive an embodied response.

– NA

This code is used if the variable response [C1] is tagged as ‘no’. Moreover, NA is used for audio-only data or if resources are not clearly identifiable.
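The dependencies between response [C1], verbal response [C2], and embodied response [C3] can be summarised in a short Python sketch. The function `code_response` and its arguments are hypothetical; in particular, the sketch assumes that an embodied response that cannot be assessed (audio-only data or unclear resources) is passed as `None` and does not count towards [C1].

```python
from typing import Optional


def code_response(verbal: bool, embodied: Optional[bool]) -> dict:
    """Derive the C1-C3 codes from the analyst's observations.

    embodied is None for audio-only data or if bodily resources
    are not clearly identifiable.
    """
    has_response = verbal or bool(embodied)
    if not has_response:
        # Without any response, C2 and C3 are not applicable.
        return {"C1": "No", "C2": "NA", "C3": "NA"}
    c2 = "Yes" if verbal else "No"
    if embodied is None:
        c3 = "NA"
    else:
        c3 = "Yes" if embodied else "No"
    return {"C1": "Yes", "C2": c2, "C3": c3}


print(code_response(verbal=True, embodied=None))
# {'C1': 'Yes', 'C2': 'Yes', 'C3': 'NA'}
print(code_response(verbal=False, embodied=True))
# {'C1': 'Yes', 'C2': 'No', 'C3': 'Yes'}
```

The first call corresponds to a verbal response in a telephone call, the second to a purely embodied response in video data.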

3.3.4 C4: Multiple responses

Are there responses from different co-participants in a multi-party conversation?

This variable captures instances in multi-party interactions in which more than one speaker responds or starts to respond to a previous RfC. Here, the coding scheme mainly works as a heuristic instrument. That is, rather than analysing all responding turns and their relationship, the scheme makes ‘multiple responses’ identifiable for future research on speaker selection, response timing, or response design in multi-party interaction. In addition, marking dyadic conversations with a particular code (NA) enables researchers to filter for differences in the participation framework.

If multiple responses occur, we decided to code only one of them. The decision not to study all responses raised the question of which of them should form the basis for further coding. In the case of (slightly) time-delayed responses, coders were asked to code the response that starts first (Stivers and Enfield 2010, 2625). In rare instances in which responding turns start simultaneously, we coded the response of the selected speaker (in particular through gaze, but also address terms, etc.) or, if the selected speaker could not be clearly identified, the speaker with higher epistemic rights to respond. If none of these criteria could be applied, which virtually never happened, the more elaborate response was coded (e.g., if a minimal and a non-minimal response occurred, see [C9], the decision was made in favour of the non-minimal response).

Since some of the languages in our sample were investigated based exclusively on dyadic conversations, we did not include this category in the cross-linguistic analysis (Pfeiffer et al. forthcoming). Therefore, it was also excluded from the ICR test (see Section 4).

– Yes

This code is used if more than one speaker responds or starts to respond to the RfC.

– No

This code is used if only one speaker responds or starts to respond to the RfC.

– NA

This code is used if the variable response [C1] is tagged as ‘no’ or a conversation is dyadic.

3.3.5 C5: Prefacing elements

Does the response contain prefacing elements?

This category captures whether the response is introduced by particular prefacing elements such as discourse markers (e.g., Spanish bueno ‘well’, Ehmer 2025). Laughter, audible breathing, or other liminal signs (Dingemanse 2020) were not included in this category, as recording quality varies across the data sets. Moreover, response tokens (see [C6]) are not counted as prefacing elements, but are part of a response ‘proper’.

– Yes

This code is used if a response is introduced by (a) prefacing element(s).

– No

This code is used if a response is not introduced by (a) prefacing element(s).

– NA

This code is used if the variables response [C1] and verbal response [C2] are tagged as ‘no’.

3.3.6 C6: Response token

Is the response done with a response token?

A response token is defined as the smallest or minimal conventionalised communicative unit of a language that can provide confirmation or disconfirmation. It is a “dedicated answer form” (Enfield et al. 2019, 289) of a given language that “[does] not assert a proposition in and of [itself] but [does] confirm one” (Enfield et al. 2019, 288). In contrast to Enfield et al. (2019), embodied resources such as nodding or head shakes are not included in the coding of a response token (see [C3]). In this study, a response token is always a verbal (or combined verbal and embodied) entity and has a certain stability of form which does not mirror elements of the RfC (so repeats are excluded, see [C10]). The term ‘token’ recognises that, in some languages, there seems to be a gradual rather than a categorical boundary between non-inflecting monomorphemic expressions (often referred to as particles or interjections) and formats that might still exhibit some traces of inflection or syntactic integration (Gipper 2024, Kim 2024b). Sometimes, several response tokens occur within a single response. If a response contains at least one response token (irrespective of its position in the turn), coders were asked to list all tokens. For the quantitative overview, this list was later transformed into a numerical ‘0–1’ code (Section 4).

– Response token

If a (verbal) response contains one or several response tokens, they are listed as a character string.

– No (response token)

This code is used if a (verbal) response does not contain a response token.

– NA

This code is used if the variable verbal response [C2] is tagged as ‘no’.

3.3.7 C7: Cluster of response tokens

Is there a cluster of response tokens?

If a response token is identified in [C6], [C7] captures instances in which two or more response tokens are uttered in one prosodic unit. This includes multiple sayings of the same token (Stivers 2004), as long as they are not lexicalised in the given language (e.g., jaja, a double saying of ‘yes’ in German, Golato and Fagyal 2008), as well as uses of different response tokens such as ʔāh ṭabʕan ‘yes of course’ in Arabic (Marmorstein 2024).

– Yes

This code is used if there is more than one response token in a prosodic unit following the RfC.

– No

This code is used if there is only one response token or if two or more tokens are uttered in separate prosodic units.

– 0

This code is used if there is a verbal response [C2] without a response token (e.g., if the response is formatted as a repeat [C10]).

– NA

This code is used if the variable verbal response [C2] is tagged as ‘no’.

3.3.8 C8: Position of response token

In which position within the response turn do speakers use the (first) response token?

If there is a response token [C6], coders were furthermore asked to indicate where in the response turn the token is located (Raymond 2013). If there is more than one response token in the turn, the position of the first token is coded.

– Initial position

This code is used for turn-initial response tokens (including instances with prefacing elements [C5] and responses which consist of a response token only, see [C9]).

– Mid position

This code is used for response tokens in turn-mid position.

– Final position

This code is used for response tokens in turn-final position.

– 0

This code is used if there is a verbal response [C2] without a response token.

– NA

This code is used if the variable verbal response [C2] is tagged as ‘no’.

3.3.9 C9: (Non)-minimal response

Is the response minimal or non-minimal, i.e. does the response consist of more than a response token or a cluster of tokens?

This category differentiates between minimal and non-minimal responses that include a response token (Gubina et al. 2024, Keevallik 2010, Steensig and Heinemann 2013, Seuren and Huiskes 2017) and other response formats. Minimal responses are type-conforming responses (Raymond 2003) that only consist of a confirming or disconfirming response token (see [C6]) or a cluster of response tokens (see [C7]). Non-minimal responses are type-conforming responses that consist of an initial confirming or disconfirming response token or a cluster of response tokens and a turn expansion by the respondent – regardless of whether there is a pause in between the response token (or the cluster of response tokens) and the expansion. The code ‘other’ refers to cases in which a response does not contain a response token (e.g., if the response is formatted as a repeat [C10]).

– Minimal response

This code is used if the response contains a response token only or a cluster of response tokens.

– Non-minimal response

This code is used if the response contains more than a response token or a cluster of response tokens, i.e. a token (or cluster) plus a turn expansion.

– Other

This code applies to responses without a response token (e.g., full repeats [C10]).

– NA

This code is used if the variable verbal response [C2] is tagged as ‘no’.
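The codes for [C8] and [C9] follow mechanically once a response turn has been segmented. The Python sketch below illustrates this under several assumptions that are not part of the coding scheme itself: a turn is represented as an ordered list of `(text, kind)` pairs, the labels `'preface'`, `'token'` and `'other'` are hypothetical, prefacing elements [C5] are ignored for both codes, and any response that combines a token with further material counts as non-minimal (following the code description rather than the stricter 'initial token' wording).

```python
def _core_kinds(elements):
    """Drop prefacing elements [C5]; keep the kind label of everything else."""
    return [kind for _, kind in elements if kind != "preface"]


def first_token_position(elements):
    """[C8]: position of the first response token within the response turn."""
    if elements is None:          # no verbal response [C2]
        return "NA"
    kinds = _core_kinds(elements)
    if "token" not in kinds:      # verbal response without a response token
        return "0"
    index = kinds.index("token")
    if index == 0:                # also covers token-only responses
        return "initial"
    if index == len(kinds) - 1:
        return "final"
    return "mid"


def minimality(elements):
    """[C9]: minimal, non-minimal, or other response format."""
    if elements is None:          # no verbal response [C2]
        return "NA"
    kinds = _core_kinds(elements)
    if "token" not in kinds:
        return "other"
    if all(kind == "token" for kind in kinds):
        return "minimal"          # a single token or a cluster of tokens
    return "non-minimal"


# Invented example turn: preface + token + expansion.
turn = [("well", "preface"), ("yeah", "token"), ("we drove", "other")]
print(first_token_position(turn), minimality(turn))
```

Deciding what counts as one element of a turn remains an analytic step that the sketch takes as given.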

3.3.10 C10: Full repeat

Does the response include a full or full-expanded repeat?

This category codes for responses that repeat the RfC in full, i.e. reproduce the original proposition without replacing or deleting elements of the original utterance (like in the German RfC sequence A: ihr seid auto gefahren, ‘you drove by car?’ B: hm_hm? wir sin auto gefahren ‘uhu, we drove by car’, Deppermann et al. 2024). Moreover, we identified full-expanded responses that repeat the RfC in full and add lexical elements (such as noun phrases, adjectives, or focus particles [as for instance in the Arabic RfC sequence A: huwwa wi mrāt-u? ‘he and his wife?’ – B: huwwa wi mrāt-u wi l-bint wi kullu; ‘he and his wife and the daughter and everyone’, Marmorstein 2024, 23]).

In identifying full and full-expanded repeats, we allowed for pronominalisation and deictic or pronominal shifts (such as in the Yurakaré RfC sequence A: nij mim, ‘you didn’t catch any [fish]?’ B: nij mii, ‘I didn’t catch any’, Gipper 2024), the omission of tags (such as in the Low German RfC sequence A: de heff KLAssentreffen hat; =ne, ‘they had class reunion, right?’ B: de heff KLAssentreffen hat, ‘they had class reunion’, Weber 2024) and the omission of connective devices, inference marking or modulation (as in the British English RfC sequence A: oh Emma’s pregnant B: Emma’s pregnant, Küttner and Szczepek Reed 2024, 30, or the German sequence A: der hat aber auch ne eigene firma; =oder? ‘he has his own company though, hasn’t he?’ B: der hat ne eigene firma. ‘he has his own company’, Deppermann et al. 2024). Moreover, we included instances with inversion (‘Has he called X? – He has called X.’) or negation (‘She called X? – She did not call X.’).

Note that the operationalisation of full repeats presented here is stricter than in other studies, as we exclude partial repeats in which propositional elements are deleted (e.g., ‘Is John coming?’ – ‘John is.’ or ‘He is.’; Enfield et al. 2019, 288; see also Stivers and Enfield 2010) due to poor results in the reliability check (see Section 4). In line with Enfield et al. (2019), we did not include transformative responses (Stivers and Hayashi 2010).

– Full repeat

This code applies to responses that fully repeat the confirmable without adding any lexical elements except for response tokens.

– Full-expanded repeat

This code applies to responses which fully repeat the confirmable but also add lexical elements as outlined above.

– Other response formats

This code applies to responses which do not include a full or a full-expanded repeat (e.g., with a response token [C6]).

– NA

This code is used if the variable verbal response [C2] is tagged as ‘no’.

3.3.11 C11: Responsive action

Which action is implemented by the response?

This category codes for the types of responsive action that follow the RfC (confirmation or disconfirmation). It also registers instances in which speakers neither confirm nor disconfirm. Note that this category does not relate to particular response formats, i.e. confirmations or disconfirmations could be done with token responses and full or full-expanded repeats as well as responses which are cast in other formats (such as in the Low German RfC sequence A: aber de kUmmt doch hier ut EMSland; =ne, ‘but he is from Emsland, right?’ B: ut WERLte; ‘from Werlte (a city in the Emsland region)’, Weber 2024), including embodied responses [C3].

– Confirmation

This code is used if the response clearly confirms the RfC, e.g., with a confirming ‘yes’-response token, a repeat, or a combination of verbal and embodied resources.

– Disconfirmation

This code is used if the response clearly disconfirms the RfC, e.g., with a disconfirming ‘no’ response token, a repeat in reversed polarity, or a combination of verbal and embodied resources.

– Neither

This code is used for a range of responsive actions that neither confirm nor disconfirm clearly. Responses in this category may push against epistemic assumptions, as do, for instance, non-answer turns that express lack of knowledge or that perform or initiate repair (Stivers 2022, chapter 3). They may be evasive or non-committed (Gipper 2024, Kim 2024b), offer conditional confirmation (Gipper and Groß 2024), transform the terms of agreement (Stivers and Hayashi 2010), or take the form of expressions of thankfulness, wish, or hope (such as German das hoff ich ‘I hope so’, Deppermann et al. 2024, or Arabic bi-zni llāh ‘with God’s permission’, Marmorstein 2024).

– NA

This code is used if the variable response [C1] is tagged as ‘no’.

As indicated above, the project’s main goal was to identify and compare the resources used in requesting confirmation and delivering confirming or disconfirming responses across languages. Accordingly, the scheme as we present it here primarily aims at capturing the linguistic design of the two turn types in a descriptive manner. So, instead of coding whether an RfC presents inferences from prior talk, for instance, we code the occurrence of inferential markers (category [B6]). Likewise, we code response tokens [C6] or full repeats [C10] as features of the response design without assuming particular interactional functions at the same time (König and Pfeiffer forthcoming a provide a methodological reflection). Coding form and function separately in this way allows us to untangle in subsequent case-by-case analyses how speakers of a language interpret certain resources functionally and how they are fitted to their context of use.

4 Inter-Coder Reliability Check

Section 3 outlined the coding scheme that forms the basis of the research network’s cross-linguistic endeavour. To ensure consistent coding across the ten languages, network members jointly developed the coding categories, as well as revised and refined them in iterative data sessions. In addition to this ‘qualitative’ calibration process, previous cross-linguistic studies also recommend measuring the alignment of a shared understanding between the coders on a quantitative basis. In statistics, ICR^[6] serves as a well-established method for assessing “the rigor and transparency of the coding frame and its application to the data” (O’Connor and Joffe 2020, 3). Methodologically, ICR is “assessed by having two or more coders categorise content, and then using these categorisations to calculate a numerical index of the extent of agreement between or among the coders … Such an index is called the inter-coder reliability index” (Feng 2014, 1803). On the one hand, ICRs show the validity of a comparative study and therefore enhance the trustworthiness of the analyses being made. On the other hand, ICR is used as a tool to support the coding process itself through an ongoing disclosure of (mis)understandings between the coders.

In this study, ICR was used both as a tool for measuring the validity of the study and of the coding scheme above and as a quality assurance tool to improve the reflexivity and the emergence of a common understanding in the coding process itself. Applying the coding scheme systematically to the data, we took a subset of 30 RfC instances selected randomly from the American English sample^[7] used in the network, since English was the only language all researchers were familiar with. These RfC instances were taken from six telephone conversations from the publicly available Talkbank CallFriend and CallHome corpora (MacWhinney 2007, retrieved from http://talkbank.org). Network members were asked to code this randomised set according to the coding scheme outlined above. The network investigates RfCs in ten different languages (the analysis of English comprises a comparison of two different varieties, British and American English, Küttner and Szczepek Reed 2024). As three researchers worked on the German language, 13 ratings were tested for ICR. For more than two coders and nominal data types, the literature recommends computing either Fleiss’ Kappa (K) or Krippendorff’s α as statistical measurement options for ICR. Krippendorff’s α, however, appears to be the gold standard in many studies because of its higher flexibility in dealing with missing data (Lombard et al. 2002, O’Connor and Joffe 2020). In the following overview, we will report both measurements.

We started by loading the codings into R (version 4.2.3). We isolated each variable from the 13 codings and concatenated them into new variable-based data frames. After that, we turned the characters of the nominal variables into numeric objects (e.g., we turned the coding options [positive] and [negative] of the variable polarity [B3] into a numeric ‘0–1’ string). For computing Fleiss’ Kappa as an index of inter-coder agreement, we then turned each variable-based data frame into an ‘n × m’-matrix. For computing Krippendorff’s α, we took these ‘n × m’-matrices and transformed them into ‘m × n’-matrices with the t()-function. With the boot-package (version 1.3-28.1; Canty et al. 2021), we finally computed nonparametric bootstrap confidence intervals (BCI) using the adjusted bootstrap percentile (BCa) method with 1,000 replications at the 95% confidence level.
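For readers who wish to retrace the two agreement measures, the following Python sketch computes Fleiss’ Kappa and Krippendorff’s α for nominal codes from first principles. It is not the R workflow used in the study. Two simplifications are assumptions of the sketch: both functions take an items × coders list (no transposition is needed here), and the confidence interval is a plain percentile bootstrap over items rather than the BCa interval reported in the article. The toy data are invented.

```python
import random
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: list of items, each a list with one nominal code per coder (complete data)."""
    n_items, n_raters = len(ratings), len(ratings[0])
    totals, item_agreement = Counter(), []
    for item in ratings:
        counts = Counter(item)
        totals.update(counts)
        squares = sum(c * c for c in counts.values())
        item_agreement.append((squares - n_raters) / (n_raters * (n_raters - 1)))
    p_bar = sum(item_agreement) / n_items
    p_exp = sum((c / (n_items * n_raters)) ** 2 for c in totals.values())
    if p_exp == 1.0:                      # only one category used: kappa undefined
        return float("nan")
    return (p_bar - p_exp) / (1 - p_exp)


def krippendorff_alpha_nominal(ratings):
    """Same input format; None marks a missing code."""
    totals, observed = Counter(), 0.0
    for item in ratings:
        values = [v for v in item if v is not None]
        m = len(values)
        if m < 2:                         # items with fewer than two codes are not pairable
            continue
        counts = Counter(values)
        totals.update(counts)
        observed += (m * m - sum(c * c for c in counts.values())) / (m - 1)
    n = sum(totals.values())
    expected = n * n - sum(c * c for c in totals.values())
    if expected == 0:                     # no variation in the codes: alpha undefined
        return float("nan")
    return 1 - (n - 1) * observed / expected


def bootstrap_ci(ratings, statistic, replications=1000, level=0.95, seed=1):
    """Percentile bootstrap over items (a simplification of the BCa method)."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(replications):
        sample = [rng.choice(ratings) for _ in ratings]
        value = statistic(sample)
        if value == value:                # skip undefined (NaN) replicates
            estimates.append(value)
    estimates.sort()
    last = len(estimates) - 1
    return estimates[int((1 - level) / 2 * last)], estimates[int((1 + level) / 2 * last)]


# Invented toy data: six items coded by three coders with a binary variable.
toy = [[0, 0, 0], [1, 1, 1], [0, 0, 1], [1, 1, 1], [0, 0, 0], [1, 0, 1]]
print(fleiss_kappa(toy), krippendorff_alpha_nominal(toy))
print(bootstrap_ci(toy, krippendorff_alpha_nominal))
```

With only a handful of items, bootstrap replicates can lack any variation in the codes; the sketch skips such replicates, which is one of several defensible choices.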

In current research, different values are described as the lowest acceptable threshold (median) for ICR values. In their cross-linguistic study on repair in interaction, Dingemanse et al. (2015, 4) considered only variables that achieved a Krippendorff’s α of ≥0.66. For Fleiss’ Kappa, the nomenclature of Landis and Koch (1977, 165) sets a threshold of K = 0.61–0.80 for a substantial strength of agreement, while K ≥ 0.81 shows an almost perfect strength of agreement between coders. As the design and theoretical background of Dingemanse et al.’s project are similar to our study, we will follow their threshold of ≥0.66 (the same applies to the Fleiss’ Kappa value, which displays a substantial strength of agreement). We conducted a first round of calculating ICR in the coding phase “to assess the robustness of the coding frame and its application” (O’Connor and Joffe 2020, 2). Variables that reached a Krippendorff’s α of ≥0.66 were kept in the coding scheme. Categories with a very low level of reliability and agreement (below 0.4) were excluded from the coding process. Variables that came close to the threshold of 0.66 were examined for coder-specific inconsistencies. We then provided individual feedback, adapted the coding scheme, and asked the coders to re-code their language-specific sheets according to the revised understanding. After this first robustness check, the members coded the English dataset a second time. We tested once more whether a satisfactory reliability was achieved or not (see Campbell et al. 2013 and Hruschka et al. 2004 for a similar testing procedure). Again, we kept the variables in the coding scheme if they met the ICR threshold and removed them if they did not.
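The first-round triage described above can be summarised as a simple decision rule. The Python sketch below is an illustration only; the function name, the labels it returns, and the treatment of every value between 0.4 and 0.66 as a case for review are assumptions (the text speaks of variables that 'came close' to the threshold without fixing a lower bound for that band).

```python
def first_round_decision(alpha, keep_threshold=0.66, exclude_below=0.4):
    """Map a variable's Krippendorff's alpha to a first-round triage outcome."""
    if alpha >= keep_threshold:
        return "keep"
    if alpha < exclude_below:
        return "exclude"
    # Assumption: everything in between is reviewed for coder-specific inconsistencies.
    return "review"


# Invented values, used only to show the three outcomes.
for value in (0.72, 0.58, 0.31):
    print(value, first_round_decision(value))
```

In the second round, the rule reduces to the first comparison: variables are kept if they meet the threshold and removed otherwise.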

Figure 1 presents the final results of the ICR as box plots for the variables of the coding scheme presented above, which also form the basis of the network’s comparative article (Pfeiffer et al. forthcoming).^[8] Box plots are a useful visual representation for assessing the distribution and variability of agreement metrics such as Fleiss’ Kappa and Krippendorff’s α.^[9] The median (depicted as a circle within the box) represents the central value of the agreement scores. The interquartile range (IQR) shows the middle 50% of the agreement scores. A narrow IQR suggests that the agreement scores are consistent across samples or raters, whereas a wider IQR indicates greater variability. This variability may be due to differences in coder interpretations, item difficulty, or other contextual factors. Besides median and IQR, box plots are also interpreted according to symmetry and skewness. The symmetry of the box plot provides insights into the distribution of agreement scores. A symmetric box plot suggests a balanced distribution, while asymmetry might indicate skewness, where scores are concentrated towards the lower or higher end.
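The quantities read off a box plot can also be computed directly. The Python sketch below derives the median, the IQR and the width of each half of the box from a list of agreement scores, using only the standard library; the function name is hypothetical and the input values are invented. Quartiles are computed with the default ('exclusive') method of `statistics.quantiles`, which may differ slightly from the method behind Figure 1.

```python
import statistics


def summarise_agreement(scores):
    """Return median, IQR and the width of the lower and upper half of the box."""
    q1, _, q3 = statistics.quantiles(scores, n=4)
    median = statistics.median(scores)
    return {
        "median": median,
        "iqr": q3 - q1,
        "lower_half": median - q1,   # clearly unequal halves point to skewness
        "upper_half": q3 - median,
    }


# Invented agreement scores for one variable.
print(summarise_agreement([0.78, 0.80, 0.81, 0.82, 0.83, 0.85, 0.90]))
```

A narrow IQR with two halves of similar width corresponds to the consistent, symmetric distributions discussed in the text.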

Figure 1

Fleiss’ Kappa and Krippendorff’s α (+BCI) of each variable based on 13 codings.

Figure 1 yields several insights into the ICR of the current study. First, it shows that there are no outliers (data points outside the whiskers of the box plot). For agreement metrics, outliers might indicate specific conditions or items where coders exhibited significantly lower or higher agreement than the majority. The agreement values across the coders of this study were therefore relatively consistent, showing that all coders share a common understanding of the variables in the coding manual. In addition, the agreement values range from moderate (0.693) to very strong (0.950), all of which meet the threshold value of 0.66 mentioned in the literature. Furthermore, the confidence intervals are generally symmetric, indicating balanced agreement distributions. Finally, across all plots, no significant skewness is observed, highlighting stable agreement measurements.^[10] However, we also see that the box plots differ with regard to their spread and variability. Concerning these two factors, the variables can be divided into three groups. Table 1 provides an overview.

Table 1

Grouped variables based on ICR agreement and variability values

Group	Agreement	Variability	Variables
[1] High Agreement, Low Variability	≥0.90	Minimal	Tag (0.950), Polarity (0.942), Response Tokens (0.934), Integration of Tag (0.919)
[2] Strong Agreement, Moderate Variability	0.80–0.89	Minimal to Moderate	Syntactic Complexity (0.896), Full Repeat (0.869), Final Intonation Tag (0.869/0.870), Prefacing Elements (0.858), Position of First Response Token (0.814/0.815), Final Intonation Confirmable (0.825), Inference Marking (0.811)
[3] Moderate Agreement, Moderate Variability	0.69–0.79	Moderate	Cluster of Response Tokens (0.766), Connectives (0.746), Modulation (0.733), Non-Minimal Response (0.707), Responsive Action (0.693)
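The agreement bands of Table 1 can be reproduced with a small lookup. The Python sketch below uses the values printed in the table and covers only the agreement dimension, not the variability judgement, which rests on the IQRs in Figure 1. Where the table gives two values for a variable, the first is used; this choice, the cut-off of 0.66 as the lower bound of the third band, and all names in the code are assumptions of the sketch.

```python
# Agreement values as printed in Table 1 (first value where two are given).
TABLE_1 = {
    "Tag": 0.950, "Polarity": 0.942, "Response Tokens": 0.934,
    "Integration of Tag": 0.919, "Syntactic Complexity": 0.896,
    "Full Repeat": 0.869, "Final Intonation Tag": 0.869,
    "Prefacing Elements": 0.858, "Position of First Response Token": 0.814,
    "Final Intonation Confirmable": 0.825, "Inference Marking": 0.811,
    "Cluster of Response Tokens": 0.766, "Connectives": 0.746,
    "Modulation": 0.733, "Non-Minimal Response": 0.707,
    "Responsive Action": 0.693,
}


def agreement_group(value):
    """Return the Table 1 group (1, 2 or 3), or None below the 0.66 threshold."""
    if value >= 0.90:
        return 1
    if value >= 0.80:
        return 2
    if value >= 0.66:
        return 3
    return None


groups = {}
for variable, value in TABLE_1.items():
    groups.setdefault(agreement_group(value), []).append(variable)
for group in sorted(groups):
    print(group, groups[group])
```

Note that the bands are cut without rounding: Syntactic Complexity (0.896) falls into the second group, as in the table.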

Four variables exhibit very high agreement values (≥0.90) with narrow IQRs, indicating excellent reliability and minimal variability in the codings (Group 1). These include Tag (0.950), Polarity (0.942), Response Tokens (0.934), and Integration of Tag (0.919). The consistent reliability in these measures suggests that the coders achieved near-perfect consensus.

Seven variables show strong agreement values (0.80–0.89) with minimally to moderately narrow IQRs, reflecting reliable but slightly variable agreement (Group 2). Variables in this group include Syntactic Complexity (0.896), Full Repeat (0.869), Final Intonation Tag (0.869/0.870), Prefacing Elements (0.858), Position of First Response Token (0.814/0.815), Final Intonation Confirmable (0.825), and Inference Marking (0.811). These findings indicate a solid consensus among coders with slightly more variability compared to the first group.

Finally, five variables reveal moderate but still solid agreement values (0.69–0.79) with wider IQRs, suggesting higher variability (Group 3). This group includes Cluster of Response Tokens (0.766), Connectives (0.746), Modulation (0.733), Non-Minimal Response (0.707), and Responsive Action (0.693). There could be several explanations for these lower median values and the higher variability in the codings of the latter group (although they still meet the median threshold of α ≥ 0.66). Some of these variables have more than two coding options (usually three or four). The literature suggests that the more coding options, the lower the ICR, because the probability of deviations increases as the number of coding options expands (e.g., Hruschka et al. 2004, Roberts et al. 2019). Moreover, compared to polarity, concepts like modulation or connectives are highly language-specific. It seems plausible that, in certain cases, the coders might have transferred the language-specific understanding they have of a coding category in their object language to the English data. Also, in the process of qualitative calibration, these variables, among others, were often the subject of qualitative discussions about the coding scheme. Overall, we achieved a substantial agreement strength for the variables in group three; however, their ICR values remain lower than those of the variables in groups one and two. In addition, the relatively low value of the variable Responsive Action shows that a straightforward operationalisation of functional categories is potentially hard to achieve. In our case, this concerns the boundaries between responses which clearly confirm or disconfirm and responses that lie somewhere in between.^[11] The ICR values of these variables show that coding them often involves more interpretation than coding the more straightforward variables with high ICR.

In sum, the overall results indicate strong reliability for the majority of variables, with certain categories demonstrating greater variability. Notably, variables that are primarily form-based, such as Response Tokens, Polarity, and Tag, consistently exhibit higher agreement values and lower variability. These results suggest that such variables are inherently more suitable for cross-linguistic comparison, as they rely on more objective and easily identifiable features across languages (see also König and Pfeiffer forthcoming a, on the advantages of form-based coding). In contrast, more action-based variables, such as Responsive Action and Modulation, show lower but still solid agreement but also greater variability. This variability likely arises from the interpretative and context-dependent nature of these categories, which can lead to divergent coder judgments. As a result, while these variables remain important for discourse-level analyses, their application in cross-linguistic projects may require further discussions in the pragmatic-typological field.

Moreover, the present analysis highlights that relying solely on median agreement values can mask significant differences in variability across variables. Previous studies have exclusively reported median-based thresholds for agreement without addressing the variability within variables, leaving important insights into the precision and consistency of codings unexplored. For instance, even variables with similar medians may differ substantially in their IQR widths, reflecting variations in coder consensus and reliability. These differences are particularly important for cross-linguistic comparative projects, where the consistency of annotation across diverse languages and contexts is crucial.

5 Conclusion

Studies in interactional linguistics may start from particular forms and compare how they are implemented as actions (as, for instance, okay, Betz et al. 2021, or negative mental verb constructions such as I don’t know or je sais pas, Lindström et al. 2016) or they may take generic conversational structures or actions as their starting point (such as other-initiated repair, Dingemanse et al. 2014, or recruitments, Floyd et al. 2020) and study the organisation of the range of resources that speakers of different languages use in these contexts. In our comparative project, we followed the latter approach by taking a particular sequence as a ‘natural control’ (Dingemanse and Floyd 2014) and therefore abstracting from language-specific criteria. In sampling relevant cases, we identified turns with which speakers take a partially knowing stance towards the confirmable but assign primary epistemic rights to the addressee, making relevant a response that (minimally) deals with the confirmable’s veracity (refer to König and Pfeiffer forthcoming b for a detailed characterisation of RfCs).

In developing the coding procedures, we were continually faced with the challenge that formal features may not be applicable to all languages under study and that functional descriptions may be hard to operationalise. The scheme can therefore only offer a first glimpse into the diverse realisation of RfC sequences. It nonetheless proved to be a valuable instrument for cross-linguistic comparison, which helped us to uncover similarities and differences across languages (cf. Pfeiffer et al. forthcoming) and a useful heuristic tool to identify new questions for future intra- and interlingual research (Steensig and Heinemann 2015, König and Pfeiffer forthcoming a).

A key insight from the analysis of ICR discussed in this article is the necessity of relying not only on the median agreement values, but also on the associated variability, as indicated by the width of IQRs. Previous studies in the field have predominantly relied on a single threshold for the median agreement value to determine ICR, often overlooking the variability within individual variables. The findings in this article reveal an overarching trend. Although all variables reach the set threshold value of α ≥ 0.66, there are inter-variable differences. Variables with clear, objective definitions – particularly the form-based ones – tend to achieve higher agreement values with narrower IQRs, demonstrating strong reliability and low variability. In contrast, more interpretive, action-based variables exhibit lower agreement and wider IQRs, reflecting greater variability between the coders. Some variables, therefore, lend themselves more easily to coding social interaction than others. In conclusion, while strong median agreement values remain an important criterion for assessing reliability, cross-linguistic studies should also systematically report variability metrics. This approach provides a more comprehensive and accurate assessment of variable reliability, facilitating the selection of robust variables for comparative linguistic research and ensuring methodological transparency. Reporting and reflecting such differences in ICR ratings can therefore help to advance the methodological discussion about coding social interaction (Stivers 2015, Steensig and Heinemann 2015, see also König and Pfeiffer forthcoming a). We therefore hope that our approach may prove useful as a point of reference or departure in future cross-linguistic research of social interaction.

Acknowledgments

We would like to thank all contributors to this special issue, who have continuously helped to develop and refine the coding procedures over the course of four years. Special thanks extend to U. Küttner for providing the American English subset used in testing inter-coder reliability. Moreover, we would like to thank the following researchers who have supported and advised us at different stages of our project (K. Birkner, E. Couper-Kuhlen, M. Dingemanse, A. Groß, A. Koivisto, A. Liesenfeld, J. Lindström, M. Selting, J. Steensig, T. Stivers, R. Suzuki, J. Zinken, in alphabetical order).

Funding information: The work was financed by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), Scientific Network “Interactional Linguistics – Discourse particles from a cross-linguistic perspective” (project number 413161127).
Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission. The coding categories were developed jointly in the research network ‘Interactional Linguistics’ (see funding information). KK, KW, and MP devised the manuscript. KW conducted the inter-coder reliability test and provided the statistical analyses in Section 4.
Conflict of interest: The authors state no conflict of interest. KK and MP are Guest Editors for the Special Issue on Request for confirmation sequences across ten languages. They were not, however, involved in the review process of this article. It was handled entirely by other Editors of the Journal.
Data availability statement: The datasets generated and/or analysed during the current study are available in the Open Science Framework (OSF) repository, https://osf.io/zmw7e/?view_only=f70ed6596c5d4b8a97bc67f124f655b6.

References

Ben-Moshe, Yotam M. and Yael Maschler. 2024. “Request for Confirmation Sequences in Hebrew.” Open Linguistics 10 (1): 20240028. 10.1515/opli-2024-0028.Search in Google Scholar

Betz, Emma and Andrea Golato. 2008. “Remembering Relevant Information and Withholding Relevant Next Actions: The German Token Achja.” Research on Language & Social Interaction 41 (1): 58–98. 10.1080/08351810701691164.Search in Google Scholar

Betz, Emma, Arnulf Deppermann, Lorenza Mondada, and Marja-Leena Sorjonen, eds. 2021. OKAY Across Languages: Toward a Comparative Approach to Its Use in Talk-in-Interaction. Amsterdam: Benjamins.10.1075/slsi.34Search in Google Scholar

Bolden, Galina B. 2010. “‘Articulating the Unsaid’ via and-Prefaced Formulations of Others’ Talk.” Discourse Studies 12 (1): 5–32.10.1177/1461445609346770Search in Google Scholar

Bolden, Galina B., John Heritage, and Marja-Leena Sorjonen, eds. 2023. Responding to Polar Questions Across Languages and Contexts. Amsterdam: Benjamins.10.1075/slsi.35Search in Google Scholar

Campbell, John L., Charles Quincy, Jordan Osserman, and Ove K. Pedersen. 2013. “Coding in-Depth Semistructured Interviews.” Sociological Methods & Research 42 (3): 294–320. 10.1177/0049124113500475.Search in Google Scholar

Canty, Angelo, Brian Ripley, and Alessandra R. Brazzale. 2021. CRAN: Contributed Packages.Search in Google Scholar

Couper-Kuhlen, Elizabeth and Margret Selting. 2018. Interactional Linguistics: Studying Language in Social Interaction. Cambridge: Cambridge University Press.10.1017/9781139507318Search in Google Scholar

De Vet, Henrica C. W., Caroline B. Terwee, Dirk L. Knol, and Lex M. Bouter. 2006. “When to Use Agreement Versus Reliability Measures.” Journal of clinical epidemiology 59 (10): 1033–39. 10.1016/j.jclinepi.2005.10.015.Search in Google Scholar

Deppermann, Arnulf, Alexandra Gubina, Katharina König, and Martin Pfeiffer. 2024. “Request for Confirmation Sequences in German.” Open Linguistics 10 (1): 20240008. 10.1515/opli-2024-0008.

Dingemanse, Mark and Simeon Floyd. 2014. “Conversation Across Cultures.” In The Cambridge Handbook of Linguistic Anthropology, edited by N. J. Enfield, Paul Kockelman, and Jack Sidnell, 447–80. Cambridge: Cambridge University Press. 10.1017/CBO9781139342872.021.

Dingemanse, Mark, Joe Blythe, and Tyko Dirksmeyer. 2014. “Formats for Other-Initiation of Repair Across Languages: An Exercise in Pragmatic Typology.” Studies in Language 38 (1): 5–43. 10.1075/sl.38.1.01din.

Dingemanse, Mark, Kobin H. Kendrick, and Nick J. Enfield. 2016. “A Coding Scheme for Other-Initiated Repair Across Languages.” Open Linguistics 2 (1): 35–46. 10.1515/opli-2016-0002.

Dingemanse, Mark, Seán G. Roberts, Julija Baranova, Joe Blythe, Paul Drew, Simeon Floyd, Rosa S. Gisladottir, et al. 2015. “Universal Principles in the Repair of Communication Problems.” PLoS ONE 10 (9): e0136100. 10.1371/journal.pone.0136100.

Dingemanse, Mark. 2020. “Between Sound and Speech: Liminal Signs in Interaction.” Research on Language & Social Interaction 53 (1): 188–96. 10.1080/08351813.2020.1712967.

Ehmer, Oliver. 2025. “Request for Confirmation Sequences in Castilian Spanish.” Open Linguistics 10 (1): 20240039. 10.1515/opli-2024-0039.

Enfield, Nick J., Penelope Brown, and Jan P. de Ruiter. 2012. “Epistemic Dimensions of Polar Questions: Sentence-Final Particles in Comparative Perspective.” In Questions. Formal, Functional and Interactional Perspectives, edited by Jan P. de Ruiter, 193–221. Cambridge: Cambridge University Press. 10.1017/CBO9781139045414.014.

Enfield, Nick J., Tanya Stivers, Penelope Brown, Christina Englert, Kathariina Harjunpää, Makoto Hayashi, Trine Heinemann, et al. 2019. “Polar Answers.” Journal of Linguistics 55 (2): 277–304. 10.1017/S0022226718000336.

Feng, Guangchao Charles. 2014. “Intercoder Reliability Indices: Disuse, Misuse, and Abuse.” Quality & Quantity 48 (3): 1803–15. 10.1007/s11135-013-9956-8.

Floyd, Simeon, Giovanni Rossi, and Nick J. Enfield, eds. 2020. Getting Others to Do Things: A Pragmatic Typology of Recruitments. Berlin: Language Science Press.

Floyd, Simeon, Giovanni Rossi, and Nick J. Enfield. 2020. “A Coding Scheme for Recruitment Sequences in Interaction.” In Getting Others to Do Things: A Pragmatic Typology of Recruitments, edited by Simeon Floyd, Giovanni Rossi, and N. J. Enfield, 25–50. Berlin: Language Science Press.

Fox, Barbara, Sandra A. Thompson, Cecilia E. Ford, and Elizabeth Couper-Kuhlen. 2013. “Conversation Analysis and Linguistics.” In Sidnell and Stivers 2013, 726–40. 10.1002/9781118325001.ch36.

Gipper, Sonja and Alexandra Groß. 2024. “Less Than Confirming, and Doing More Than That: Comparing Responses to Requests for Confirmation in German and Yurakaré.” Contrastive Pragmatics 5 (1–2): 347–91. 10.1163/26660393-bja10100.

Gipper, Sonja. 2024. “Request for Confirmation Sequences in Yurakaré.” Open Linguistics 10 (1): 20240026. 10.1515/opli-2024-0026.

Golato, Andrea and Zsuzsanna Fagyal. 2008. “Comparing Single and Double Sayings of the German Response Token Ja and the Role of Prosody: A Conversation Analytic Perspective.” Research on Language and Social Interaction 41 (3): 241–70. 10.1080/08351810802237834.

Gubina, Alexandra, Emma Betz, and Arnulf Deppermann. 2024. “Doing More Than Confirming: Expanded Responses to Requests for Confirmation in German Talk-in-Interaction.” Contrastive Pragmatics 5 (1–2): 307–46. 10.1163/26660393-bja10114.

Hancil, Sylvie, Margje Post, and Alexander Haselow. 2015. “Introduction: Final Particles from a Typological Perspective.” In Final Particles, edited by Sylvie Hancil, Alexander Haselow, and Margje Post. Berlin: de Gruyter Mouton. 10.1515/9783110375572.

Haspelmath, Martin. 2010. “Comparative Concepts and Descriptive Categories in Crosslinguistic Studies.” Language 86 (3): 663–87. 10.1353/lan.2010.0021.

Hentschel, Elke. 1996. “Negation in Interrogation und Exklamation.” In Deutsch - Typologisch, edited by Ewald Lang and Gisela Zifonun, 218–26. Berlin, New York: de Gruyter. 10.1515/9783110622522-011.

Hepburn, Alexa and Galina B. Bolden. 2013. “The Conversation Analytic Approach to Transcription.” In Sidnell and Stivers 2013. Wiley. 10.1002/9781118325001.ch4.

Heritage, John and Chase Wesley Raymond. 2021. “Preference and Polarity: Epistemic Stance in Question Design.” Research on Language and Social Interaction 54 (1): 39–59. 10.1080/08351813.2020.1864155.

Heritage, John and D. Rodney Watson. 1979. “Formulations as Conversational Objects.” In Everyday Language. Studies in Ethnomethodology, edited by George Psathas, 123–62. New York: Irvington Publishers.

Heritage, John. 2002. “The Limits of Questioning: Negative Interrogatives and Hostile Question Content.” Journal of Pragmatics 34: 1427–46. 10.1016/S0378-2166(02)00072-3.

Heritage, John. 2012. “Epistemics in Action: Action Formation and Territories of Knowledge.” Research on Language and Social Interaction 45 (1): 1–29. 10.1080/08351813.2012.646684.

Hruschka, Daniel J., Deborah Schwartz, Daphne Cobb St. John, Erin Picone-Decaro, Richard A. Jenkins, and James W. Carey. 2004. “Reliability in Coding Open-Ended Data: Lessons Learned from HIV Behavioral Research.” Field Methods 16 (3): 307–31. 10.1177/1525822X04266540.

Keevallik, Leelo. 2010. “Minimal Answers to Yes/No Questions in the Service of Sequence Organization.” Discourse Studies 12 (3): 283–309. 10.1177/1461445610363951.

Kim, Kyu-hyun. 2024a. “Negatively-Formatted Requests for Confirmation in Korean Conversation: Three Types of Verbal Negation as Interactional Resources.” Contrastive Pragmatics 5 (1–2): 72–121. 10.1163/26660393-bja10079.

Kim, Kyu-hyun. 2024b. “Request for Confirmation Sequences in Korean.” Open Linguistics 10 (1): 20240010. 10.1515/opli-2024-0010.

Koivisto, Aino. 2012. “Discourse Patterns for Turn-Final Conjunctions.” Journal of Pragmatics 44 (10): 1254–72. 10.1016/j.pragma.2012.05.006.

König, Katharina and Martin Pfeiffer. 2024. “Requesting Confirmation or Reconfirmation Across Languages: An Introduction.” Contrastive Pragmatics 5 (1–2): 1–26. https://www.sciencedirect.com/org/journal/contrastive-pragmatics/vol/5/issue/1. 10.1163/26660393-00001063.

König, Katharina and Martin Pfeiffer. Forthcoming a. “Coding Request for Confirmation Sequences: Methodological Reflections from a Cross-Linguistic Research Project.” Research on Language and Social Interaction.

König, Katharina and Martin Pfeiffer. Forthcoming b. “Request for Confirmation Sequences in Ten Languages: An Introduction.” Open Linguistics.

Küttner, Uwe-A. and Beatrice Szczepek Reed. 2024. “Request for Confirmation Sequences in British and American English.” Open Linguistics 10 (1): 20240012. 10.1515/opli-2024-0012.

Küttner, Uwe-A., Laurenz Kornfeld, and Jörg Zinken. 2023. “A Coding Scheme for (Dis)Approval-Relevant Events Involving the Direct Social Sanctioning of Problematic Behavior in Informal Social Interaction.” IDSopen 5. 10.21248/idsopen.5.2023.8.

Küttner, Uwe-A., Laurenz Kornfeld, Christina Mack, Lorenza Mondada, Jowita Rogowska, Giovanni Rossi, Marja-Leena Sorjonen, Matylda Weidner, and Jörg Zinken. 2024. “Introducing the ‘Parallel European Corpus of Informal Interaction’ (PECII): A Novel Resource for Exploring Cross-Situational and Cross-Linguistic Variability in Social Interaction.” In New Perspectives in Interactional Linguistic Research, edited by Margret Selting and Dagmar Barth-Weingarten, 132–60. Amsterdam: Benjamins. 10.1075/slsi.36.05kut.

Landis, J. Richard and Gary G. Koch. 1977. “The Measurement of Observer Agreement for Categorical Data.” Biometrics 33 (1): 159. 10.2307/2529310.

Langer, Nils. 2003. “Low German.” In Germanic Standardizations. Past to Present, edited by Ana Deumert and Wim Vandenbussche, 281–301. Amsterdam: John Benjamins. 10.1075/impact.18.11lan.

Li, Xiaoting. 2024. “Request for Confirmation Sequences in Mandarin Chinese.” Open Linguistics 10 (1): 20240011. 10.1515/opli-2024-0011.

Lindström, Jan, Yael Maschler, and Simona Pekarek Doehler. 2016. “A Cross-Linguistic Perspective on Grammar and Negative Epistemics in Talk-in-Interaction.” Journal of Pragmatics 106: 72–79. 10.1016/j.pragma.2016.09.003.

Lombard, Matthew, Jennifer Snyder-Duch, and Cheryl Campanella Bracken. 2002. “Content Analysis in Mass Communication: Assessment and Reporting of Intercoder Reliability.” Human Communication Research 28 (4): 587–604. 10.1111/j.1468-2958.2002.tb00826.x.

MacWhinney, Brian. 2007. “The Talkbank Project.” In Creating and Digitizing Language Corpora, edited by Joan C. Beal, Karen P. Corrigan, and Hermann L. Moisl, 163–80. London: Palgrave Macmillan UK. 10.1057/9780230223936_7.

Marmorstein, Michal. 2024. “Request for Confirmation Sequences in Egyptian Arabic.” Open Linguistics 10 (1): 20240009. 10.1515/opli-2024-0009.

Mulder, Jean and Sandra A. Thompson. 2008. “The Grammaticization of but as a Final Particle in English Conversation.” In Crosslinguistic Studies of Clause Combining: The Multifunctionality of Conjunctions, edited by Ritva Laury, 179–204. Amsterdam: Benjamins. 10.1075/tsl.80.09mul.

O’Connor, Cliodhna and Helene Joffe. 2020. “Intercoder Reliability in Qualitative Research: Debates and Practical Guidelines.” International Journal of Qualitative Methods 19: 160940691989922. 10.1177/1609406919899220.

Oloff, Florence. 2024. “Request for Confirmation Sequences in Czech.” Open Linguistics 10 (1): 20240034. 10.1515/opli-2024-0034.

Pfeiffer, Martin, Katharina König, Kathrin Weber, Arnulf Deppermann, Oliver Ehmer, Sonja Gipper, Alexandra Gubina, et al. Forthcoming. “Request for Confirmation Sequences in Ten Languages: A Quantitative Comparison.” Open Linguistics.

Raymond, Chase Wesley and Tanya Stivers. 2016. “The Omnirelevance of Accountability: Off-Record Account Solicitations.” In Accountability in Social Interaction, edited by Jeffrey D. Robinson, 321–53. New York: Oxford University Press. 10.1093/acprof:oso/9780190210557.003.0011.

Raymond, Geoffrey. 2003. “Grammar and Social Organization: Yes/No Interrogatives and the Structure of Responding.” American Sociological Review 68 (6): 939–67. 10.1177/000312240306800607.

Raymond, Geoffrey. 2013. “At the Intersection of Turn and Sequence Organization: On the Relevance of ‘Slots’ in Type-Conforming Responses to Polar Interrogatives.” In Szczepek Reed and Raymond 2013, 169–206. Amsterdam: Benjamins. 10.1075/slsi.25.06ray.

Roberts, Kate, Anthony Dowell, and Jing-Bao Nie. 2019. “Attempting Rigour and Replicability in Thematic Analysis of Qualitative Research Data; a Case Study of Codebook Development.” BMC Medical Research Methodology 19 (1): 66. 10.1186/s12874-019-0707-y.

Sadock, Jerry and Arnold Zwicky. 1985. “Speech Act Distinctions in Syntax.” In Language Typology and Syntactic Description: Grammatical Categories and the Lexicon, edited by Timothy Shopen, 155–76. Cambridge: Cambridge University Press.

Schegloff, Emanuel. 2006. “Interaction: The Infrastructure for Social Institutions, the Natural Ecological Niche for Language, and the Arena in Which Culture Is Enacted.” In Roots of Human Sociality. Culture, Cognition and Interaction, edited by N. J. Enfield and Stephen C. Levinson, 70–96. Oxford, New York: Berg. 10.4324/9781003135517-4.

Seuren, Lucas M. and Mike Huiskes. 2017. “Confirmation or Elaboration: What Do Yes/No Declaratives Want?” Research on Language & Social Interaction 50 (2): 188–205. 10.1080/08351813.2017.1301307.

Sidnell, Jack and Nick J. Enfield. 2012. “Language Diversity and Social Action.” Current Anthropology 53 (3): 302–33. 10.1086/665697.

Sidnell, Jack and Tanya Stivers, eds. 2013. The Handbook of Conversation Analysis. Chichester: Blackwell. 10.1002/9781118325001.

Steensig, Jakob and Trine Heinemann. 2013. “When ‘Yes’ Is Not Enough – as an Answer to a Yes/no Question.” In Szczepek Reed and Raymond 2013, 207–42. Amsterdam: Benjamins. 10.1075/slsi.25.07ste.

Steensig, Jakob and Trine Heinemann. 2015. “Opening up Codings?” Research on Language and Social Interaction 48 (1): 20–25. 10.1080/08351813.2015.993838.

Stivers, Tanya and Federico Rossano. 2010. “Mobilizing Response.” Research on Language and Social Interaction 43 (1): 3–31. 10.1080/08351810903471258.

Stivers, Tanya and Makoto Hayashi. 2010. “Transformative Answers: One Way to Resist a Question’s Constraints.” Language in Society 39 (1): 1–25. 10.1017/S0047404509990637.

Stivers, Tanya and Nick J. Enfield. 2010. “A Coding Scheme for Question–response Sequences in Conversation.” Journal of Pragmatics 42 (10): 2620–26. 10.1016/j.pragma.2010.04.002.

Stivers, Tanya, Nick J. Enfield, and Stephen C. Levinson. 2010. “Question–response Sequences in Conversation Across Ten Languages. Special Issue.” Journal of Pragmatics 42 (10). 10.1016/j.pragma.2010.04.002.

Stivers, Tanya. 2004. “‘No No No’ and Other Types of Multiple Sayings in Social Interaction.” Human Communication Research 30 (2): 260–93. 10.1093/hcr/30.2.260.

Stivers, Tanya. 2015. “Coding Social Interaction: A Heretical Approach in Conversation Analysis?” Research on Language and Social Interaction 48 (1): 1–19. 10.1080/08351813.2015.993837.

Stivers, Tanya. 2019. “How We Manage Social Relationships Through Answers to Questions: The Case of Interjections.” Discourse Processes 56 (3): 191–209. 10.1080/0163853X.2018.1441214.

Stivers, Tanya. 2022. The Book of Answers: Alignment, Autonomy, and Affiliation in Social Interaction. New York, Oxford: Oxford University Press. 10.1093/oso/9780197563892.001.0001.

Szczepek Reed, Beatrice and Geoffrey Raymond, eds. 2013. Units of Talk - Units of Action. Amsterdam: Benjamins. 10.1075/slsi.25.

Weber, Kathrin. 2024. “Request for Confirmation Sequences in Low German.” Open Linguistics 10 (1): 20240019. 10.1515/opli-2024-0019.

Zinken, Jörg and Uwe-A. Küttner. 2022. “Offering an Interpretation of Prior Talk in Everyday Interaction: A Semantic Map Approach.” Discourse Processes 59 (4): 1–28. 10.1080/0163853X.2022.2028088.

Received: 2024-09-23

Revised: 2025-01-10

Accepted: 2025-02-05

Published Online: 2025-07-09

This work is licensed under the Creative Commons Attribution 4.0 International License.

https://doi.org/10.1515/opli-2025-0056
