Abstract
In this contribution, we investigate the transfer of embodied procedural knowledge in two cello master classes, zooming in on what we identify as speech-embedded nonverbal depictions — cases where meaning is communicated nonverbally, iconically, and without temporally co-occurring speech — an overlooked domain in the literature foregrounded by a critical reconceptualization of Clark’s (2016) framework of depicting. Examining such depictions in the cello classes, the curious pattern of multimodal iteration emerges, where the “same” meaning is communicated multiple times, but in multiple different combinations of modality and signaling method, and with different aspects of the meaning profiled. A brief discussion of such cases in relation to dialogic syntax then underlines the relevance of semiotic properties and dialogic resonance, revealing the rich communicative affordances of multimodal iteration in contexts of instruction.
1 Depicting in transfer of embodied knowledge
Talking about embodied procedural knowledge is possible, through the packaging of information into linguistic categories (Chafe 1977a, 1977b; Croft 2007). Fitting the real world into discrete categories can be notoriously challenging, however. No matter how hard we try to describe how to tie shoelaces, such a description doesn’t capture the nuances of the actions of the two hands interacting with artifacts in three-dimensional space. Likewise, a cello instructor can try their best to approximate with words how a certain musical phrase should be accentuated, but a lot would still be “lost in translation,” compared to the actual air vibration resulting from the coordinated movement of the cello player’s body in relation to the instrument. It is hardly possible to adequately, let alone perfectly, describe embodied knowledge — i.e. gradient, procedural knowledge that draws on perceptual and proprioceptive experiences — with discrete linguistic elements. Although language is not exclusively organized by discrete categorical elements, it is primarily structured around symbolic form-meaning pairings. How, then, is transfer of embodied knowledge possible?
In many cases, we rely on an instructor to show us embodied knowledge, by creating scenes, or enacting movements, that bear perceptual resemblance to the task in question — parents show their kids how to tie shoelaces by actually tying shoelaces; cello teachers show their students how to play certain phrases by actually playing those phrases. While linguists have looked into how such showing is done in combination with speech, it is not uncommon for such showing to be executed without the use of words. How embodied knowledge is transferred — iconically and nonverbally — in those cases, remains to be investigated.
To delve into this phenomenon, we turned to the framework of language use proposed by Clark (1996). Elaborating on Peirce’s (1932) semiotic triangle of icons, indices, and symbols, Clark identifies three ways in which meaning is signaled: depicting, indicating, and describing (as). In depicting, we communicate meaning by creating physical analogs; in indicating, we relate ourselves to the world; in describing, we employ signals imposed on meanings by convention. In two recent papers (2016, 2019), Clark further elaborates on depictions, defining them as iconic physical scenes which people create and display, using sets of actions, to facilitate the addressee’s imagination, and therefore understanding, of the depicted scenes. This is illustrated by the following example, where a professional cellist comments on the student’s phrasing of a certain passage (see Supplementary Video 1).[1]
(1) “We’re at this beautiful brook, this water flowing, and suddenly, we had a macho man coming, [plays cello in a highly masculine way].”[2]
In addition to communicating the sense of abrupt masculinity with a description that is the verbal clause we had a macho man coming, the instructor also stages a nonverbal[3] depiction, one that is an exaggerated version of the student’s overly macho rendering of the musical phrase in question, by coordinating bodily actions in relation to the cello and cello bow, producing sounds iconic to the effect of a macho man coming. With the descriptive verbal clause, the student understands the instructor’s comment through arbitrary form-meaning relations. The nonverbal depiction allows the student to actually see the instructor’s motoric details, and hear the phrasing the instructor deems suboptimal.
Ubiquitous in language use, Clark’s (2016) notion of depicting unites linguistic phenomena where the form bears physical resemblance to the meaning, such as quotation, demonstration, enactment, constructed action, pantomime, iconic gesture, mimesis, ideophone, facial gesture, and depicting construction (e.g. Chovil 1991; Cienki and Müller 2008; Clark and Gerrig 1990; Cormier et al. 2012, 2016; Dingemanse 2013; Kendon 2004; Kita 1997; Liddell 2003; Mandel 1977; McNeill 1992; Taylor 2007; Vandelanotte 2009; Wade and Clark 1993; Zlatev 2005), on top of depiction as previously identified by other scholars, most notably Müller (2014) and Streeck (2008, 2009). Notably, many of these studies have touched upon transfer of knowledge (e.g. Gärdenfors 2017; Gullberg 1998; Stukenbrock 2012; see also Ehmer and Brône, this issue). Building on insights from existing literature, Clark’s (2016) framework of depicting affords a schematic vantage point from which to examine and compare diverse iconic phenomena in language use.
It is for this reason that we opted to examine how embodied knowledge is transferred in contexts of instruction through the lens of depicting as defined by Clark (2016). In the following, we start by reconceptualizing a typology of depictions proposed by Clark, which reveals an imbalance in the literature: cases where meaning is communicated iconically, nonverbally, and without simultaneously co-occurring speech have received comparatively little attention. We delimit such cases as “speech-embedded nonverbal depictions.” Examining such depictions in video recordings of two cello master classes, we zoom in on the phenomenon of “multimodal iteration,” where meaning is iterated multiple times, in multiple different combinations of modality and signaling method. The findings are then discussed in relation to semiotic properties and dialogic syntax, and ultimately to iconic language use in instructional contexts in general.
2 Speech-embedded nonverbal depictions
In his account of depicting, Clark (2016: 325–326, 2019: 237) puts forward a typology of four kinds of depictions: an adjunct depiction is timed to accompany, and in that way modify, a stretch of temporally co-occurring speech; an indexed depiction is connected to a verbal utterance through verbal deixis such as this; an embedded depiction is embedded in speech, much like a conventional verbal constituent; an independent depiction stands alone, making a separate contribution to the discourse. The following manipulated variations of (1) illustrate the four depiction types. In line with Clark’s (2016) notation, actions co-occurring with speech are in parentheses, their co-occurring speech underlined. Square brackets indicate that the actions therein do not co-occur with speech.
Adjunct: “… we had (plays cello in a masculine way) a macho man coming.”
Indexed: “… we had a macho man coming, like (plays cello in a masculine way) this.”
Embedded: “… we had [plays cello in a masculine way].”
Independent: Student: “How did I play it?” Instructor: “[plays cello in a masculine way].”
While the typology is not without problems when confronted with empirical data (for a detailed discussion on the issues of underspecification and form-function conflation, see Hsu et al. 2021), when considered in terms of information contribution, it does offer an alternative perspective from which to review existing literature on relevant phenomena. This is visualized, heuristically, in Figure 1.

Figure 1: Continuum of information contribution from non-depictive speech and depictive signals.
On the left side of the continuum are cases where relatively more information is communicated through non-depictive speech (i.e. descriptive and indicative speech) than depictive signals (e.g. depictive manual gesture, depictive bodily movement, depictive speech). Here we find cases of adjunct depictions, where the depiction adds complementary information to the descriptive speech it is adjunct to, as well as instances of indexed depictions, where the depiction is connected, via some verbal indexical device, with speech that is otherwise semantically incomplete. Often accompanied by simultaneously co-occurring speech, depictions on this half of the continuum largely fall within the scope of research on co-speech gesture.
On the other side of the continuum, relatively more information comes from depictive signals, compared to information from non-depictive speech. Embedded and independent depictions are located on this side of the continuum. Bounded by their embedding speech, embedded depictions take up slots that would otherwise be filled by canonical verbal constituents, communicating meaning without temporally co-occurring speech. Likewise, independent depictions contribute information to the discourse without simultaneous speech, except they stand alone, unbounded by speech. Contrasting the other half of the continuum, depictions on this side of the continuum are employed without temporally co-occurring speech. The depiction-speech relation is sequential rather than simultaneous.
Importantly, this reconceptualization of Clark’s (2016) typology brings to the fore an imbalance in the literature: Compared to depictions with simultaneous speech, depictions without co-occurring speech have not received equal attention in the linguistics literature. While subsets of such depictions have been identified by some researchers (e.g. as “speech-linked/-framed gesture” [Kendon 1988; McNeill 1992, 2005]; see also “integration by positioning” [Fricke 2012, cited in Müller et al. 2013: 65]), full-fledged investigations have yet to be carried out. Gesture studies and research on multimodal communication have in recent years seen attempts at incorporating gesture into a unified framework of language use (e.g. Bavelas and Chovil 2000; Enfield 2009; Fricke 2012; Harrison 2018; Kok and Cienki 2016), but the focus remains largely on gesture-speech co-occurrence, despite the recent debate on the status of nonverbal semiotic signals in language (see Zima and Bergs 2017).
Notable exceptions are the studies conducted by Ladewig and Keevallik. In her study, Ladewig (2020) looks at verbal utterances with an empty slot at the utterance-final position. With experiments, she demonstrates that the slot can be filled by manual gestures, and that, in such cases, speech and gesture can be integrated both syntactically and semantically. Keevallik (2010, 2015, 2017, 2018, 2020), on the other hand, observes interaction in dance instruction in a series of studies, delving specifically into cases of “bodily quoting.” With empirical data, she shows how the dance instructor “quotes” the student’s movement, in order to highlight the contrast between correct and incorrect movements. While these studies provide us with a good idea of the complexity of certain subsets of relevant phenomena, the gap in the literature still looms large.
Guided by the question of how embodied knowledge is transferred through iconically motivated nonverbal signals without simultaneously co-occurring speech, we zoom in on this overlooked domain in the present study. Taking depicting, in the sense defined by Clark (2016), as the starting point, we zero in on “speech-embedded nonverbal depictions.” Specifically, we define such depictions as those that are embedded in speech, but that are not depictions of non-depictive speech. This definition excludes canonical quotations but includes depictive speech such as ideophones and vocalizations, allowing us to focus on phenomena that have been marginalized (see Dingemanse 2017) in the literature, and therefore to investigate transfer of embodied knowledge in a new light. Equally crucially, we adopt a form-based sense of embedding, in terms of temporal overlap. This criterion cuts through Clark’s original typology, incorporating embedding across grammatical levels, from the word level to the level of the sequential organization of the discourse.[4]
3 Data and methods
A corpus was constructed for the purpose of systematically exploring speech-embedded nonverbal depictions in contexts of instruction, based on video recordings of two cello master classes from the series Steven Isserlis at the International Musicians’ Seminar, Prussia Cove, both approximately 75 min in duration. The two classes are taught by the same instructor, Steven Isserlis, an internationally renowned cellist. In each of the classes, he instructs one student on one piece. In the first class (Masterclass Media Foundation 2007), the student, male, receives instruction on Sergei Rachmaninov’s Cello Sonata in g, Op. 19; in the second (Masterclass Media Foundation 2008), he gives instructions on a female student’s interpretation of Robert Schumann’s Fantasiestücke, Op. 73. In addition to the instructor and the student, also present in the video recordings are a pianist and a small audience of around five people.
These classes were chosen for a number of reasons. First and foremost, the instructional and musical nature of the classes calls for transfer of embodied procedural knowledge that poses challenges to verbalization: As classes where the instructor teaches cello playing, which concerns primarily nonverbal procedural knowledge, they exemplify how embodied knowledge is communicated, negotiated online, and transferred in real-life interaction. The videos were released commercially, and their quality is excellent. The view alternates between focus on the instructor, focus on the student, and general view of the room. While the recordings are not comparable to a controlled experimental setting where all relevant views are always accessible, this drawback is compensated for by the higher level of spontaneity in the video recordings. In most cases, master classes on music playing are themselves performances, where the instructor showcases their ability to communicate ideas regarding the piece in question, to the student, but also to the audience. Given that the instructor is usually an established, well-known musician, master classes are very often video recorded and released commercially. It can therefore be assumed that the student of a master class is fully aware of the performance aspect of master classes, as well as the potential presence of cameras. All these points considered, the data chosen for the present study approximate spontaneous language use in this very context. Indeed, recent years have seen a growing body of studies tapping into data of music master classes (e.g. Reed, this issue; Sambre and Feyaerts 2017; Szczepek Reed, this issue).
The video recordings were imported into ELAN,[5] where 219 speech-embedded nonverbal depictions were identified and annotated. For segmentation, we operationalized the “modality-agnostic (see Dingemanse 2019) gesture phrase” as the basic unit of depiction, schematized from Kendon’s (2004) gesture phrase, encompassing all nonverbal communicative signals that are considered intentionally meaningful (see Mondada 2019). Accordingly, a unit of depiction comprises as its core a modality-agnostic stroke of action (which can be manual gesture, vocalization, facial expression, bodily movement, a combination of these, etc.), with its start marked by the onset of its preparation phase, and its end followed by either a complete rest or another modality-agnostic gesture phrase. In annotation, formal and functional properties were kept separate to avoid conflation. Given the diversity of the tokens, the annotation of formal properties was done by describing the stroke phase of the depiction, informed by Bressem’s (2013) form-based annotation. Sequentially adjacent speech, as well as functional properties such as depiction type and level of embedding, was also coded.
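The segmentation scheme described above can be pictured as a small data structure. The following is only an illustrative sketch, not the annotation template actually used in the study: the class names, field names, and millisecond timings are all invented for the example, and the check on the stroke (rather than on the whole phrase) is our assumption about how "without temporally co-occurring speech" might be operationalized, given that a preparation phase can begin while speech is still ongoing.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interval:
    """A half-open time span [start_ms, end_ms) on the recording timeline."""
    start_ms: int
    end_ms: int

    def overlaps(self, other: "Interval") -> bool:
        # Two half-open intervals overlap if each starts before the other ends.
        return self.start_ms < other.end_ms and other.start_ms < self.end_ms


@dataclass(frozen=True)
class DepictionUnit:
    """One modality-agnostic gesture phrase (hypothetical record layout)."""
    phrase: Interval             # preparation onset up to rest or next phrase
    stroke: Interval             # the core stroke of action
    modalities: Tuple[str, ...]  # e.g. ("cello playing",)
    stroke_form: str             # free-text formal description of the stroke


def stroke_is_speech_free(unit: DepictionUnit, speech: List[Interval]) -> bool:
    """Return True if no speech interval overlaps the stroke in time."""
    return not any(unit.stroke.overlaps(s) for s in speech)


# Invented timings, loosely modelled on example (1): speech, then depiction.
speech = [Interval(0, 4200)]  # "... we had a macho man coming,"
unit = DepictionUnit(
    phrase=Interval(3900, 7000),  # preparation begins before speech ends
    stroke=Interval(4300, 6500),  # stroke falls in the gap in speech
    modalities=("cello playing",),
    stroke_form="plays cello in a highly masculine way",
)
print(stroke_is_speech_free(unit, speech))
```

Keeping the formal description (`stroke_form`) apart from any functional label mirrors the separation of formal and functional properties in the annotation; functional codes such as depiction type or level of embedding would be stored in separate fields.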
Given the nature of the data — specifically that the classes were videotaped primarily to capture the instructor’s teaching and not the students’ reaction — for the present study, we focus on the production side of transfer of embodied knowledge, examining how depictive actions are deployed and coordinated by the instructor for the purpose of communicating to the students ideas relevant to the playing of the cello pieces.
4 Multimodal iteration
Identified on all grammatical levels of embedding, the tokens of speech-embedded nonverbal depictions exhibit diverse and complex behaviors, among which a curious pattern emerges where the meaning of an embedded depiction is also communicated through sequentially adjacent speech. Consider the utterance and depiction in (1), repeated below.
(2) “We’re at this beautiful brook, this water flowing, and suddenly, we had a macho man coming, [plays cello in a highly masculine way].”
Having first uttered we had a macho man coming, the instructor stages a depiction of an exaggerated version of the student’s playing of the musical phrase in question, that is, in an overly masculine way. The instructor’s opinion is iterated through descriptive words, and then again through a nonverbal depiction, both of which are embedded syntactically as clause-level appositions. Essentially, the “same” meaning is iterated multiple times, in parallel syntactic structures, but in multiple different modalities. Tokens like this are what we identify as cases of multimodal iteration. Notably, the term same (employed in a figurative way and therefore in quotes) does not imply that the meanings of the iterations are identical. Rather, it captures the fact that each individual iteration profiles a different aspect of the same composite meaning, identifiable on a more schematic level, that the speaker wishes to communicate (see Section 5).
The significance of multimodal iteration is manifold. To begin with, it cannot be dismissed as an instantiation of word search (see Goodwin and Goodwin 1986). In (2), the “right words” actually precede the “gesture.” The instructor also shows no hesitation in the utterance; on the contrary, the preparation phase of the depiction is already evident at macho, ensuring that the stroke is timed to coincide with the temporal gap in speech (see Condon 1971; Kendon 2004: 135).
Instead, multimodal iteration highlights some of the semiotic properties peculiar to verbal and nonverbal modalities. As identified by McNeill (2005), speech is prototypically segmented and analytic, gesture global and synthetic: With categorical distinctions encoded in grammar, speech is usually precise, but therefore less flexible, in terms of meaning profiling, whereas gesture, with lower levels of conventionalization, is generally more fluid with regard to semantic foregrounding. Observing what she identifies as multimodal noun phrases, Fricke (2013: 748) also remarks that the “division of labor [between speech and gesture] is a very frequent pattern […] due to the particular medial capacity of both modes” (see also Mittelberg 2014 on “mediality effects”). Though neither absolute nor categorical, the differences between speech and gesture in terms of semiotic properties cannot be dismissed.
By iterating meaning through both speech and gesture, the speaker is able to capitalize on the respective semiotic properties of both modalities, foregrounding all relevant aspects of meaning without sacrificing the rest, in effect bypassing the formal semiotic constraints of individual modalities, thereby efficiently communicating a fuller picture to the addressee (cf. Johnston 1996 on the “spiral” manner in which signing can unfold). On a theoretical level, tokens of multimodal iteration exemplify the growth point (McNeill 2005, 2013), defined as an initial pulse of thinking for/while speaking that is a minimal mental package combining, irreducibly, both linguistic categorical and imagistic components. Through multimodal iteration, the speaker is able to communicate both of these components in full, without having to “translate” meanings across modalities.
The complexity of multimodal iteration does not stop here. Consider (3), an extended excerpt of (2) (see Supplementary Video 2).
(3) “We’re at this beautiful brook, this water flowing, and suddenly, we had a macho man coming, [plays cello in a highly masculine way], ‘hey baby, this is a wonderful brook.’”
After the first (verbal description) and second (nonverbal depiction) iterations, the instructor illustrates the same idea with a third iteration, which is a verbal depiction, embedded on the same syntactic level as the preceding iterations, of some imaginary utterance from some imaginary macho man in the imaginary world created online by the instructor (cf. Ehmer 2011 on joint imagination as an interactive process involving a complex mental space network). Through the three iterations, the instructor illustrates the problem in the student’s playing in three ways, each profiling different aspects of the meaning.
Crucially, although the third iteration is also verbal like the first, the signaling method is different (depicting instead of describing). Accordingly, multimodal iteration can be more precisely defined as cases where meaning is iterated multiple times, not just in different modalities, but in different combinations of modality and signaling method.
Consider also the following (see Supplementary Video 3).
(4) “Your color lost interest, you had, [plays cello absentmindedly, with an abrupt diminuendo], ‘oh dear, I haven’t got the melody anymore,’ [yawns].”
Here the instructor again iterates what he makes of the student’s playing, three times, and in three different combinations of modality and signaling method: a nonverbal (bodily gesture in interaction with cello) depiction, a verbal depiction, and a nonverbal (manual gesture and facial expression) depiction. Notably, none of the iterations is a verbal description.
5 Resonance across iterations: similarity and contrast
Having demonstrated how the instructor can take advantage of the communicative affordances of multimodal iteration, for the purpose of transferring embodied procedural knowledge in particular, a natural follow-up question arises: How can the iterations, seemingly effortlessly, be understood as communicating different aspects of the “same” schematic-level meaning? In examining the data in our corpus, we observed that, in addition to affording the possibility of exploiting the semiotic properties peculiar to different modalities, cases of multimodal iteration also exhibit striking parallels to mechanisms core to dialogic syntax. In this section, we explore the relevance of some of these mechanisms to the underlying workings of multimodal iteration.
Identifying structural mappings between sequentially aligned utterances across levels of abstraction in spontaneous language use, Du Bois (2014: 372) demonstrates how the dialogic resonance — defined as the “catalytic activation of affinities across utterances” — resulting from such parallelisms is utilized by the speaker to draw inferences from, and actively engage with, prior discourse. This engagement in turn facilitates the speaker in achieving their communicative goal, be it to concur, contest, or otherwise. Importantly, dialogic resonance is not limited to parallelisms in descriptive speech, nor to cross-turn productions; resonance can arise across modalities and signaling methods, and out of the utterances of a single speaker (Du Bois 2014).
Re-examining (3) in a simplified diagraph (Du Bois 2014), an analytical tool that foregrounds mappings across utterances, a similar pattern surfaces.
(5) … and suddenly, | we had a macho man coming,
                    | [plays cello in a highly masculine way],
                    | ‘hey baby, this is a wonderful brook.’
The mappings across the three sequentially and consecutively juxtaposed iterations (in descriptive speech, depictive cello playing, and depictive speech), which are embedded on the same grammatical level in the discourse (that of a verbal clause), bring the underlying parallelisms to the fore, generating dialogic resonance. Despite the lack of any explicit formal devices (such as identical speech frames) guiding the process, the resonance catalyzes and foregrounds the activation of the perceived analogical affinities across the iterations, among which is the fact that they illustrate the same issue in the student’s playing. Because of the emergent resonance, the instructor accomplishes his communicative and pedagogical goals efficiently and effectively, simply by employing the three iterations as parallel structures: Without additional explanation, he is able to make it clear that the three iterations are about different aspects of the same idea, and to invite the student to observe the relations across the iterations, thereby gaining a fuller understanding of the issue in her own playing.[6]
The example in (4) can likewise be considered this way.
(6) … you had, | [plays cello absentmindedly, with an abrupt diminuendo],
               | ‘oh dear, I haven’t got the melody anymore,’
               | [yawns].
Due to the possibility of you had serving as a quotative, it is unclear what the level of embedding is. It is sufficient, however, for cross-iteration mappings to be activated, as the iterations are clearly embedded on the same level. Paired by the structural mapping, the three iterations — a nonverbal depiction on the cello, a verbal depiction, and a nonverbal depiction utilizing primarily the hand — exhibit parallelisms, resulting in dialogic resonance that facilitates the recognition of affinities across the iterations. Once again, without any explicit formal prompt, the structural parallel alone allows the instructor to illustrate the problem in the student’s playing in three different ways, each time highlighting different aspects of the problem. The resonance likewise prompts the student to recognize the common thread across the iterations, observe the problem from three angles, and ultimately understand in what ways her playing is not satisfactory.
In the same way the notion of depicting affords an alternative vantage point from which to consider iconic language use, the framework of dialogic syntax allows us to examine our tokens on a more schematic level, leading to the discovery of other phenomena in the master classes that also hinge on parallelism. Among them are cases where individual iterations are juxtaposed in sequentiality, to mark not their similarity, but contrast. Consider the parallel iterations in (7), where the instructor demonstrates, on the cello, the quality of the clarinet, for which the piece was originally written (see Supplementary Video 4).
(7) The clarinet | doesn’t | go   | [plays cello in a “non-clarinet-like” fashion],
    it           |         | goes | [plays cello in a “clarinet-like” fashion].
This excerpt exhibits very similar features to the examples above, with two nonverbal iterations embedded on the same level of the discourse. In fact, with the explicit speech frames and the identical literal musical notes, the parallelisms are even more manifest than in the previous examples. The only difference, as marked by the negation (n’t), is that the iterations communicate not two different aspects of the same meaning, but the same aspect of two different meanings, specifically the sound qualities peculiar to the cello and the clarinet. Despite the affinity mappings between the paired utterances, the resulting resonance signals not similarity, but contrast.
Indeed, as Du Bois (2014) identifies, in addition to foregrounding similarities across dialogically juxtaposed utterances, resonance also indexes relations of contrast. The pairing of utterances with parallel structures and affinity mappings in fact puts the spotlight on any contrast between counterpart elements. In (7), aside from the formal negation, the ostensible parallelisms really highlight the difference in sound quality between the two depictions, allowing the student to perceive the contrast between the “correct” and “incorrect” renditions, not through descriptive speech, but directly in the target modality of cello playing. This pedagogical implication echoes Keevallik’s (2010, this issue) findings in dance classes, where the instructor “quotes” correct and incorrect movements to show the student where the contrast lies.
The communicative potential of parallel iterations is further exploited in (8), where the instructor references the “mosquito effect” in the student’s playing, which he is singling out for the second time, having already identified it in prior discourse (see Supplementary Video 5).
(8) Ok, careful, | the mosquito effect,
                 | [plays cello with the “mosquito effect”],
                 | [plays cello without the “mosquito effect”].
With the three multimodal iterations embedded on the same level and juxtaposed in sequentiality, the parallelisms and the resulting resonance are patent. Strikingly, while the dialogic resonance marks the similarity between the first and second iterations (that both illustrate the student’s problem), it profiles a relation of contrast between the second and third iterations (which are contrasting renditions of the same literal musical notes). Through the mappings across the parallel structures alone, the instructor is able to communicate, and the student is able to understand, that the first two iterations illustrate the problem, but that the third exemplifies the desired practice. Not only is the resonance emergent without any explicit prompt, the identification of the two polar relations indexed by the resonance also requires no guidance from any formal cues.
6 Discussion and concluding remarks
Multimodal iteration draws on a rich pool of resources, from modality-specific semiotic properties, form-meaning relations of different signaling methods, to the striking communicative potential of dialogic resonance. Though drawn only from two master classes, the tokens in our corpus showcase the plethora of ways in which the speaker taps into the affordances of multimodal iteration in carrying out the communicative task at hand, which is the pedagogical goal of transferring embodied knowledge to another person. Despite the subject matter — cello playing — being nonverbal and procedural in nature, the instructor takes advantage of the possibilities of multimodal iteration in creative ways on the fly to get his message across to the students.
Among the tokens in our corpus, multimodal iteration essentially serves the following functions:
(i) referencing the problem in the student’s playing,
    (a) as is, in the original modality-signaling method combination, i.e. cello playing;
    (b) in another modality-signaling method combination, to highlight the suboptimal aspects; and
(ii) demonstrating the desired, “correct” way of playing,
    (a) as is, in the original modality-signaling method combination, i.e. cello playing;
    (b) in another modality-signaling method combination, to single out the desired properties.
Each of the examples presented serves some or all of these functions, with which the instructor highlights multiple aspects of the same idea through different combinations of modality and signaling method, but also contrasts “correct” and “incorrect” cello playing, thereby contributing to a fuller understanding of both on the student’s part. Given the lack of explicit prompts in most cases, whether the relation between juxtaposed iterations is to be interpreted as one of similarity or contrast depends on the highly context-sensitive mechanisms of dialogic resonance. Though nuanced and often covert, multimodal iteration has significant consequences for the sequential unfolding of the discourse.
While the communicative affordances of juxtaposed elements have been explored by many, and across modalities (e.g. Arnold 2012; Keevallik 2010; Teng and Sun 2002), in the present study, we offer a concise but systematic account of how such juxtaposition is deployed to achieve communicative functions in the instructional context of cello classes, where the subject matter is nonverbal, procedural, and embodied. Through the examination of nonverbal depictions without simultaneously co-occurring speech — an oversight in the literature identified through reconceptualizing the typology of depictions in Clark’s framework — in cello master classes, we identify multimodal iteration as a strategy for transfer of embodied knowledge. With data approximating naturally occurring language use, we demonstrate how the rich potential of multimodal iteration is creatively utilized by the instructor on the fly, communicating with efficiency meanings which would otherwise be difficult to verbalize.
As the present study offers only a first glimpse into the full complexity of multimodal iteration in transfer of embodied knowledge, natural next steps include examination of other comparable contexts of instruction, consideration of the roles and functions of artifacts (e.g. musical instruments, sheet music), exploration of the interactional dynamics between the instructor and the students, and, importantly, research on the actual transfer of embodied knowledge. Because the current study focuses on how the instructor capitalizes on multimodal iteration, research on the receiving end will help complete the picture, as will further investigation into the precise ways in which individual iterations relate to the meaning whose select aspects they profile.
References
Arnold, Lynnette. 2012. Dialogic embodied action: Using gesture to organize sequence and participation in instructional interaction. Research on Language and Social Interaction 45(3). 269–296. https://doi.org/10.1080/08351813.2012.699256.
Bavelas, Janet Beavin & Nicole Chovil. 2000. Visible acts of meaning: An integrated message model of language in face-to-face dialogue. Journal of Language and Social Psychology 19(2). 163–194. https://doi.org/10.1177/0261927X00019002001.
Bressem, Jana. 2013. A linguistic perspective on the notation of form features in gestures. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Sedinha Tessendorf (eds.), Body – language – communication, vol. 1, 1079–1098. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110261318.1079.
Chafe, Wallace. 1977a. Creativity in verbalization and its implications for the nature of stored knowledge. In Roy Freedle (ed.), Discourse production and comprehension, 41–55. Norwood, NJ: Ablex.
Chafe, Wallace. 1977b. The recall and verbalization of past experience. In Peter Cole (ed.), Current issues in linguistic theory, 215–246. Bloomington: Indiana University Press.
Chovil, Nicole. 1991. Discourse‐oriented facial displays in conversation. Research on Language & Social Interaction 25(1–4). 163–194. https://doi.org/10.1080/08351819109389361.
Cienki, Alan & Cornelia Müller (eds.). 2008. Metaphor and gesture. Amsterdam: John Benjamins. https://doi.org/10.1075/gs.3.
Clark, Herbert H. 1996. Using language. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511620539.
Clark, Herbert H. 2016. Depicting as a method of communication. Psychological Review 123(3). 324–347. https://doi.org/10.1037/rev0000026.
Clark, Herbert H. 2019. Depicting in communication. In Peter Hagoort (ed.), Human language: From genes and brains to behavior, 235–247. Cambridge, MA: Massachusetts Institute of Technology Press. https://doi.org/10.7551/mitpress/10841.003.0021.
Clark, Herbert H. & Richard J. Gerrig. 1990. Quotations as demonstrations. Language 66(4). 764–805. https://doi.org/10.2307/414729.
Condon, William S. 1971. Speech and body motion synchrony of the speaker-hearer. In Paul M. Kjeldergaard, David L. Horton & James J. Jenkins (eds.), Perception of language, 150–173. Columbus: Merrill.
Cormier, Kearsy, David Quinto-Pozos, Zed Sevcikova & Adam Schembri. 2012. Lexicalisation and de-lexicalisation processes in sign languages: Comparing depicting constructions and viewpoint gestures. Language & Communication 32(4). 329–348. https://doi.org/10.1016/j.langcom.2012.09.004.
Cormier, Kearsy, Sandra Smith & Zed Sevcikova. 2016. Rethinking constructed action. Sign Language & Linguistics 18(2). 167–204. https://doi.org/10.1075/sll.18.2.01cor.
Croft, William. 2007. The origins of grammar in the verbalization of experience. Cognitive Linguistics 18(3). 339–382. https://doi.org/10.1515/COG.2007.021.
Dingemanse, Mark. 2013. Ideophones and gesture in everyday speech. Gesture 13(2). 143–165. https://doi.org/10.1075/gest.13.2.02din.
Dingemanse, Mark. 2017. On the margins of language: Ideophones, interjections and dependencies in linguistic theory. In N. J. Enfield (ed.), Dependencies in language, 195–203. Berlin: Language Science Press.
Dingemanse, Mark. 2019. ‘Ideophone’ as a comparative concept. In Kimi Akita & Prashant Pardeshi (eds.), Iconicity in Language and Literature, vol. 16, 13–33. Amsterdam: John Benjamins. https://doi.org/10.1075/ill.16.02din.
Du Bois, John W. 2014. Towards a dialogic syntax. Cognitive Linguistics 25(3). 359–410. https://doi.org/10.1515/cog-2014-0024.
Ehmer, Oliver. 2011. Imagination und Animation: Die Herstellung mentaler Räume durch animierte Rede. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110237801.
Enfield, N. J. 2009. The anatomy of meaning: Speech, gesture, and composite utterances. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511576737.
Fricke, Ellen. 2012. Grammatik multimodal: Wie Wörter und Gesten zusammenwirken. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110218893.
Fricke, Ellen. 2013. Towards a unified grammar of gesture and speech: A multimodal approach. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Sedinha Tessendorf (eds.), Body – language – communication, vol. 1, 733–754. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110261318.733.
Gärdenfors, Peter. 2017. Demonstration and Pantomime in the Evolution of Teaching. Frontiers in Psychology 8(415). 1–12. https://doi.org/10.3389/fpsyg.2017.00415.
Goodwin, Marjorie Harness & Charles Goodwin. 1986. Gesture and coparticipation in the activity of searching for a word. Semiotica 62(1–2). 51–76. https://doi.org/10.1515/semi.1986.62.1-2.51.
Gullberg, Marianne. 1998. Gesture as a communication strategy in second language discourse: A study of learners of French and Swedish (Travaux de l’Institut de Linguistique de Lund 35). Lund: Lund University Press.
Harrison, Simon. 2018. The impulse to gesture: Where language, minds, and bodies intersect. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108265065.
Hsu, Hui-Chieh, Geert Brône & Kurt Feyaerts. 2021. When gesture “takes over”: Speech-embedded nonverbal depictions in multimodal interaction. Frontiers in Psychology 11. 552533. https://doi.org/10.3389/fpsyg.2020.552533.
Johnston, Trevor. 1996. Function and medium in the forms of linguistic expression found in a sign language. In William H. Edmondson & Ronnie B. Wilbur (eds.), International review of sign linguistics, vol. 1, 57–94. Mahwah, NJ: Lawrence Erlbaum.
Keevallik, Leelo. 2010. Bodily quoting in dance correction. Research on Language and Social Interaction 43(4). 401–426. https://doi.org/10.1080/08351813.2010.518065.
Keevallik, Leelo. 2015. Coordinating the temporalities of talk and dance. In Arnulf Deppermann & Susanne Günthner (eds.), Temporality in interaction, 309–336. Amsterdam: John Benjamins. https://doi.org/10.1075/slsi.27.10kee.
Keevallik, Leelo. 2017. Linking performances: The temporality of contrastive grammar. In Ritva Laury, Marja Etelämäki & Elizabeth Couper-Kuhlen (eds.), Linking clauses and actions in social interaction. Helsinki: Finnish Literature Society.
Keevallik, Leelo. 2018. What does embodied interaction tell us about grammar? Research on Language and Social Interaction 51(1). 1–21. https://doi.org/10.1080/08351813.2018.1413887.
Keevallik, Leelo. 2020. Multimodal noun phrases. In Tsuyoshi Ono & Sandra A. Thompson (eds.), The ‘Noun Phrase’ across languages: An emergent unit in interaction, 154–177. Amsterdam: John Benjamins. https://doi.org/10.1075/tsl.128.07kee.
Kendon, Adam. 1988. How gestures can become like words. In Fernando Poyatos (ed.), Cross-cultural perspectives in non-verbal communication, 131–141. Toronto: Hogrefe.
Kendon, Adam. 2004. Gesture: Visible action as utterance. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511807572.
Kita, Sotaro. 1997. Two-dimensional semantic analysis of Japanese mimetics. Linguistics 35. 379–415. https://doi.org/10.1515/ling.1997.35.2.379.
Kok, Kasper I. & Alan Cienki. 2016. Cognitive Grammar and gesture: Points of convergence, advances and challenges. Cognitive Linguistics 27(1). 67–100. https://doi.org/10.1515/cog-2015-0087.
Ladewig, Silva H. 2020. Integrating gestures: The dimension of multimodality in cognitive grammar. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110668568.
Liddell, Scott K. 2003. Grammar, gesture, and meaning in American sign language. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511615054.
Mandel, Mark. 1977. Iconic devices in American sign language. In Lynn A. Friedman (ed.), On the other hand: New perspectives on American sign language, 57–108. New York: Academic Press.
Masterclass Media Foundation. 2007. Steven Isserlis at the International Musicians’ Seminar, Prussia Cove – Sergei Rachmaninov: Cello Sonata in g, Op. 19. DVD. Bristol: The Masterclass Media Foundation.
Masterclass Media Foundation. 2008. Steven Isserlis at the International Musicians’ Seminar, Prussia Cove – Robert Schumann: Fantasiestücke, Op. 73. DVD. Bristol: The Masterclass Media Foundation.
McNeill, David. 1992. Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.
McNeill, David. 2005. Gesture and thought. Chicago: University of Chicago Press. https://doi.org/10.7208/chicago/9780226514642.001.0001.
McNeill, David. 2013. The growth point hypothesis of language and gesture as a dynamic and integrated system. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Sedinha Tessendorf (eds.), Body – language – communication, vol. 1, 135–155. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110261318.135.
Mittelberg, Irene. 2014. Gestures and iconicity. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Jana Bressem (eds.), Body – language – communication, vol. 2, 1712–1732. Berlin: De Gruyter Mouton.
Mondada, Lorenza. 2019. Contemporary issues in conversation analysis: Embodiment and materiality, multimodality and multisensoriality in social interaction. Journal of Pragmatics 145. 47–62. https://doi.org/10.1016/j.pragma.2019.01.016.
Müller, Cornelia. 2014. Gestural modes of representation as techniques of depiction. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Jana Bressem (eds.), Body – language – communication, vol. 2, 1687–1702. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110302028.1687.
Müller, Cornelia, Silva H. Ladewig & Jana Bressem. 2013. Gestures and speech from a linguistic perspective: A new field and its history. In Cornelia Müller, Alan Cienki, Ellen Fricke, Silva Ladewig, David McNeill & Sedinha Tessendorf (eds.), Body – language – communication, vol. 1, 55–81. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110261318.55.
Peirce, Charles Sanders. 1932. The icon, index, and symbol. In Charles Hartshorne & Paul Weiss (eds.), Collected papers of Charles Sanders Peirce, vol. 2, 156–173. Cambridge, MA: Harvard University Press.
Sambre, Paul & Kurt Feyaerts. 2017. Embodied musical meaning-making and multimodal viewpoints in a trumpet master class. Journal of Pragmatics 122. 10–23. https://doi.org/10.1016/j.pragma.2017.09.004.
Streeck, Jürgen. 2008. Depicting by gestures. Gesture 8(3). 285–301. https://doi.org/10.1075/gest.8.3.02str.
Streeck, Jürgen. 2009. Gesturecraft: The manu-facture of meaning (Gesture Studies v. 2). Amsterdam: John Benjamins. https://doi.org/10.1075/gs.2.
Stukenbrock, Anja. 2012. Imagined spaces as a resource in interaction. Bulletin Suisse De Linguistique Appliquée 96. 141–161.
Taylor, Millie. 2007. British pantomime performance. Bristol: Intellect Books.
Teng, Norman Y. & Sewen Sun. 2002. Grouping, simile, and oxymoron in pictures: A design-based cognitive approach. Metaphor and Symbol 17. 295–316. https://doi.org/10.1207/s15327868ms1704_3.
Vandelanotte, Lieven. 2009. Speech and thought representation in English: A cognitive-functional approach (Topics in English Linguistics 65). Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110215373.
Wade, Elizabeth & Herbert H. Clark. 1993. Reproduction and demonstration in quotations. Journal of Memory and Language 32(6). 805–819. https://doi.org/10.1006/jmla.1993.1040.
Zima, Elisabeth & Alexander Bergs. 2017. Multimodality and construction grammar. Linguistics Vanguard 3(s1). 20161006. https://doi.org/10.1515/lingvan-2016-1006.
Zlatev, Jordan. 2005. What’s in a schema? Bodily mimesis and the grounding of language. In Beate Hampe (ed.), From perception to meaning: Image schemas in cognitive linguistics (Cognitive Linguistics Research), vol. 29, 313–342. Berlin: De Gruyter Mouton. https://doi.org/10.1515/9783110197532.4.313.
Supplementary Material
The online version of this article offers supplementary material (https://doi.org/10.1515/lingvan-2020-0086).
© 2021 Hui-Chieh Hsu et al., published by De Gruyter, Berlin/Boston
This work is licensed under the Creative Commons Attribution 4.0 International License.