Article Open Access

Self-talk and syntactic structure

  • Manfred Krifka
Published/Copyright: November 11, 2025

Abstract

This commentary addresses syntactic aspects of self-talk, as well as the relevance of literary sources and theories of the self.

1 Introduction

Martina Wiltschko’s target article, together with her previous work on the topic like Ritter and Wiltschko (2021), makes an important contribution. On the one hand, it draws self-talk into linguistic research, a topic that has been neglected in linguistics with a few exceptions such as Holmberg (2010) and Geurts (2018), in spite of being important in other fields such as psychology and philosophy (cf. the recent survey articles of Fernyhough and Borghi 2023; Latinjak et al. 2023). On the other, it hopefully will lead to a better understanding of the types of self-talk based on recognition of linguistic features in these fields. The target article also makes a more specific contribution in using data from self-talk to argue for a particular theory of syntactic representation, the interactional spine hypothesis of Wiltschko (2021).

In this commentary, I will address the following points: Section 2 discusses the issue of the syntactic representation of speaker and addressee in syntactic structure. Section 3 considers arguments from self-talk for the representation of grounding nodes versus nodes that represent commitment and judgment operators. Section 4 is a plea for using literary texts as evidence for self-talk, and Section 5 is a plea to consider the insights on self-talk from psychology and other fields.

2 Syntactic representation of participants?

The target article is not just interested in self-talk per se but uses it to derive arguments for the syntactic architecture of language, in particular the way in which speaker and addressee are represented in syntactic structure. Wiltschko distinguishes between three formats, which are all inspired by the performative analysis of Ross (1970). In particular, she identifies the “neo-performative” analyses originating in Speas and Tenny (2003), which share the assumption of syntactic nodes for speaker and addressee, and contrasts them with her own “interactional spine” analysis, in which participants are interpreted more indirectly as relevant for the interpretation of syntactic nodes called “speaker’s ground” and “hearer’s ground”. The third proposal is the “commitment” account of Krifka (2023a), which also assumes additional syntactic nodes but does not assign them to specific participants.

In this section I will argue that the phenomena that are used to argue for a syntactic representation of speaker and addressee are not conclusive. This applies to the neo-performative analyses, e.g. to Miyagawa (2022), where they are treated as functional projections. It may also apply to Wiltschko’s own account, insofar as the nodes of “speaker’s ground” and “hearer’s ground” are related to the speaker and addressee parameters of semantic interpretation.

Wiltschko motivates the presence of speaker and addressee in syntactic structure with allocutive agreement, a phenomenon that is prominent in Miyagawa’s work. See (1) for her example of allocutive agreement in Basque.

(1)
Pett-ek lan egin dik / din
Peter-erg work do.perf aux.adr.m aux.adr.f
‘Peter worked’ (uttered to a familiar male / female addressee)

In a neo-performative framework, allocutivity can be dealt with by the well-known phenomenon of agreement. This is sketched in (2), where the allocutive marker k agrees with the addressee and speaker node (assuming 2 is male and 1 and 2 stand in a familiar social relation to each other).

(2)
[Spk 1 [Adr 2 [[Pett-ek lan egin di-]k 1,2]]]

The masculine allocutive marker is in a position to agree with the addressee node and with the speaker node. While it appears a bit surprising that agreement is with both nodes, there are other cases of grammatical agreement that combine speaker and addressee, namely inclusive pronouns like yumi ‘we’ (you and I) in Tok Pisin, so this is not ruled out in principle.

However, things are not quite as straightforward when semantic interpretation is also considered. It is generally assumed that interpretation depends on certain parameters that refer to the utterance situation, which include, in typical conversation, the speaker and the addressee (cf. e.g. Lewis 1980). These parameters are relevant for the interpretation of indexical expressions, including 1st and 2nd person pronouns like I and you, but also for temporal and spatial deictics like yesterday and here. In theories of dynamic semantics, the content of the conversation up to the current point, including the discourse referents introduced so far that can be picked up by anaphoric expressions, is modelled similarly to an index of interpretation. With this, syntactic nodes for speaker and addressee become superfluous for allocutive agreement, as it can be interpreted with respect to the parameters instead. This is illustrated in (3). Note that we need parameters for speaker and addressee anyway, even for the interpretation of a more elaborate structure like (2), as the nodes for speaker and addressee, Spk and Adr, require a semantic interpretation.

(3)
⟦[[Pett-ek lan egin di-]k]⟧^{spk,adr}
= ⟦-k⟧^{spk,adr}(⟦[Pett-ek lan egin di-]⟧^{spk,adr})
= λpλw: male(adr), fam(spk,adr) [p(w)](λw[Pett worked in w])
= λw: male(adr), fam(spk,adr) [Pett worked in w]
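The derivation in (3) can be mimicked in a few lines of Python. This is only a minimal sketch under simplifying assumptions: worlds are plain strings, presupposition failure is modelled as `None`, and the class `Context`, its fields, and the individuals `ane`, `jon` and `miren` are hypothetical names introduced for illustration, not part of the analysis itself.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Context:
    """Utterance parameters (field names are illustrative assumptions)."""
    spk: str
    adr: str
    male: frozenset      # individuals that count as male
    familiar: frozenset  # (speaker, addressee) pairs in a familiar relation

Proposition = Callable[[str], bool]  # worlds are modelled as plain strings

def allocutive_k(ctx: Context):
    """Meaning of -k relative to ctx: an identity function on propositions
    that is defined only if adr is male and spk and adr are familiar."""
    def apply(p: Proposition):
        def at_world(w: str):
            if ctx.adr not in ctx.male or (ctx.spk, ctx.adr) not in ctx.familiar:
                return None  # presupposition failure: no truth value
            return p(w)
        return at_world
    return apply

# Toy model: Pett worked in world w1 but not in world w2.
pett_worked: Proposition = lambda w: w == "w1"

ok = Context("ane", "jon", frozenset({"jon"}), frozenset({("ane", "jon")}))
bad = Context("ane", "miren", frozenset({"jon"}), frozenset({("ane", "miren")}))

print(allocutive_k(ok)(pett_worked)("w1"))   # True
print(allocutive_k(ok)(pett_worked)("w2"))   # False
print(allocutive_k(bad)(pett_worked)("w1"))  # None
```

The point of the sketch is that the speaker and addressee enter only through the context argument; no syntactic Spk or Adr node is consulted.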

One objection against the analysis in (3) might be that allocutive agreement does not occur in typical embedded clauses but is a root phenomenon (cf. Haddican 2018). If the Spk and Adr nodes only occur in root clauses, this would fall out as a consequence. However, it is sufficient to assume that the semantic contribution of the allocutive morpheme is similar to vocatives of the “call” variety, which are also a root phenomenon (cf. Hill 2007; for the distinction of calls vs. addresses cf. Zwicky 1974). Hence, allocutive agreement does not constitute an incontrovertible argument for the performative hypothesis.

I would like to point out that the relation between speaker and addressee can affect other phenomena as well. One is honorific 2nd person pronouns; however, this can occur in clauses that are arbitrarily deeply embedded, and hence could not be modelled by agreement with a Spk and Adr node in the main clause. Cf. the German example (4):

(4)
Es wird gesagt, dass in dem Café eine Violine gefunden wurde, die Ihnen gehören soll.
‘People say that a violin was found in the café that supposedly belongs to you_{honorific}.’

Another case in point is register choice triggered by the social relation between the participants, as in the choice of lexical items in Balinese (Arka 2005), which is determined by the relation between the castes of speaker and addressee. Cf. (5) for an example.

(5)
a. Tiang numbas bawi-ne punika ring pasar    (high register)
b. Cang meli celenge ento di peken    (low register)
   I av.buy pig-def that at market
   ‘I bought the pig at the market’

Both honorific pronouns and participant-dependent register choice can be handled straightforwardly with a compositional interpretation with speaker and addressee as parameters of semantic interpretation. An account that assumes syntactic agreement of each lexical item with the speaker and addressee nodes may not be impossible, but it is implausible.
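To illustrate why embedding depth is harmless on the parameter-based view, here is a minimal Python sketch. It assumes a toy clause representation (nested lists), a hypothetical placeholder `ADR_DAT` for a dative 2nd person pronoun, and a hypothetical flag `formal` that stands in for the social relation between speaker and addressee. The familiar form dir is supplied for contrast and does not appear in (4).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    spk: str
    adr: str
    formal: bool  # assumed flag: spk addresses adr with honorific forms

def realize(node, ctx):
    """Spell out a toy clause tree. Strings are words, lists are (embedded)
    clauses, and ADR_DAT is a dative 2nd person pronoun placeholder."""
    if isinstance(node, list):
        # The same context is handed down to every embedded clause.
        return " ".join(realize(child, ctx) for child in node)
    if node == "ADR_DAT":
        return "Ihnen" if ctx.formal else "dir"
    return node

# Simplified rendering of (4); the pronoun sits two clauses deep.
sentence = ["Es wird gesagt,",
            ["dass in dem Café eine Violine gefunden wurde,",
             ["die", "ADR_DAT", "gehören soll."]]]

print(realize(sentence, Context("A", "B", formal=True)))
print(realize(sentence, Context("A", "B", formal=False)))
```

Because the context is threaded through the recursion, the pronoun can consult it at any depth, with no agreement relation to a node in the root clause.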

Wiltschko also mentions discourse particles that refer to attitudes of the speaker or the addressee as evidence for the syntactic representation of speaker and addressee. But such particles can also be modeled by accessing the speaker and/or addressee parameter provided by semantic interpretation directly. I conclude that the presence of allocutive agreement or the interpretation of participant-oriented particles cannot be taken as incontrovertible proof for the grammatical representation of speaker and addressee.

3 Grounding, commitment, or what?

Wiltschko compares the way in which the interactional spine hypothesis deals with the various types of self-talk with my own attempt in Krifka (2023b) to relate self-talk to the “layers” account in Krifka (2023a). In this section I would like to add some thoughts to this discussion that are not conclusive overall but are perhaps worth considering.

Wiltschko (2021) relates the different kinds of self-talk to the general syntactic architecture of the interactional spine hypothesis. The different layers are associated with distinct operators that are in turn compatible with distinct forms of self-talk. For example, expressions that require the presence of an addressee, like commands, vocatives, and particles that address features of the epistemic state of the other person, require a Ground-Adr and hence occur with you-centered self-talk. And operators that refer to interactional features, like rising intonation, do not occur in self-talk at all (a statement that is qualified later).

(6)
a. [Ground-Spk [proposition]]    I-centered self-talk
b. [Ground-Adr [Ground-Spk [proposition]]]    you-centered self-talk
c. [RespP [Ground-Adr [Ground-Spk [proposition]]]]    regular conversation
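The prediction that follows from (6) can be stated as a small lookup. This is a rough Python sketch, not Wiltschko's formalization: the dictionaries `SPINE` and `REQUIRES` and the function `licensed_in` are hypothetical names, and the requirements listed are only the ones mentioned in the preceding paragraph.

```python
# Structures from (6), listed from the outermost layer inwards.
SPINE = {
    "I-centered self-talk": ["Ground-Spk"],
    "you-centered self-talk": ["Ground-Adr", "Ground-Spk"],
    "regular conversation": ["RespP", "Ground-Adr", "Ground-Spk"],
}

# Layer that each expression type is assumed to depend on.
REQUIRES = {
    "command": "Ground-Adr",
    "vocative": "Ground-Adr",
    "rising intonation": "RespP",
}

def licensed_in(expression):
    """Return the modes of talk whose structure contains the required layer."""
    layer = REQUIRES[expression]
    return [mode for mode, layers in SPINE.items() if layer in layers]

print(licensed_in("vocative"))
# ['you-centered self-talk', 'regular conversation']
print(licensed_in("rising intonation"))
# ['regular conversation']
```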

First, I would like to discuss the notions of speaker’s ground and addressee’s ground. Wiltschko (2021: 4.3.1) argues that one should not just talk about “the” common ground, the information that the interlocutors assume to be shared, but must distinguish between the different beliefs and attitudes of the participants. From her short notes, I understand that grounding stands for the “mental representations of our thoughts about the world”, and that the complement proposition is meant to be part of these thoughts.

There is a problem with this interpretation when applied to (6). I take it that [Ground-Spk [proposition]] captures the information that the proposition is part of the beliefs of the speaker. The syntactic structure [Ground-Adr [Ground-Spk [proposition]]] then suggests that it is part of the belief of the addressee that the speaker believes the proposition. It is unclear, or at least not worked out, whether this is indeed intended.

More fundamentally, it is not clear to me what the speaker’s ground and the addressee’s ground, as distinct from the shared common ground, are supposed to refer to in the first place. Of course, when rational agents communicate, they assume of each other that they have certain beliefs, many of which are unknown to the other participant. I do not think that this is meant by speaker’s ground and addressee’s ground, as we hold this belief about each other even if we do not communicate at all, and so it cannot be a particular feature of communication. More relevantly, participants have their own beliefs about the common ground at a particular point in communication, and they are aware that these beliefs may differ from each other. Conversations are rife with misunderstandings and often need corrections and repairs when such misunderstandings surface. Such misunderstandings, and ways to correct them, form the main topic of Ginzburg’s work on metacommunication (cf. e.g. Ginzburg 2012). Many conversational devices have to be understood as tools for checking and maintaining the common ground, and for developing it in a controlled way, like deaccentuation as marking of givenness. The use of “Ground-Spk” and “Ground-Adr” probably can be understood as consistent with this notion of beliefs about the common ground, but I am not quite sure about this, as an explicit semantic theory of these notions is lacking.

I would like to point out that even the core notion of a common ground has to represent the different stances of the participants. This is evident when a speaker spk asserts that he or she believes p, because then the proposition ‘spk believes p’ will enter the common ground. But this is also relevant if a speaker asserts a proposition p, because it is important to keep a record that it was spk who asserted p, and hence that it is spk who is responsible to support p if questioned (cf. Brandom 1983), and it is also the right of spk to retract p. There are theories of common ground that model information about the individual beliefs or commitments, e.g. the commitment states in Krifka (2015), which contain propositions like “spk ⊢ p” for ‘spk is committed to (i.e. vouches for the truth of) p’, which is supposed to become shared belief of spk and adr after spk has asserted p. In this sense, one can distinguish between the commitments of the speaker and the commitments of the addressee, where both are part of a larger common ground. Such models of common ground may provide a good basis for a theory of the notions of “Ground-Spk” and “Ground-Adr”, where the first relates to the propositions that spk is committed to, and the second to the propositions that adr is committed to.
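A minimal Python sketch may make this bookkeeping concrete. It is not the formal model of Krifka (2015): propositions are reduced to strings, and the classes `Commitment` and `CommonGround` with their methods are hypothetical names. The sketch only shows how a single shared record of “agent ⊢ p” entries lets one read off a speaker's and an addressee's part of the common ground.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Commitment:
    """'agent ⊢ p': agent vouches for the truth of proposition p."""
    agent: str
    prop: str

@dataclass
class CommonGround:
    commitments: set = field(default_factory=set)

    def assert_(self, agent, prop):
        # Asserting p records who is responsible for p.
        self.commitments.add(Commitment(agent, prop))

    def retract(self, agent, prop):
        # Only the agent who committed to p has a commitment to withdraw.
        self.commitments.discard(Commitment(agent, prop))

    def ground(self, agent):
        """Propositions that the given agent is committed to."""
        return {c.prop for c in self.commitments if c.agent == agent}

cg = CommonGround()
cg.assert_("spk", "it is raining")
cg.assert_("adr", "the match is cancelled")
print(cg.ground("spk"))             # {'it is raining'}

cg.retract("adr", "it is raining")  # no effect: adr never committed to it
print(cg.ground("spk"))             # {'it is raining'}

cg.retract("spk", "it is raining")
print(cg.ground("spk"))             # set()
```

On this picture, `ground("spk")` and `ground("adr")` would correspond to the two notions of ground, both derived from one larger common ground.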

In Krifka (2023b), I tried to relate the different types of self-talk to the proposal in Krifka (2023a) of how speakers manage to get propositions accepted into the common ground. For regular assertions, I assumed the structure in (7a) (cf. Hengeveld 1989 for a related syntactic proposal). This structure is interpreted compositionally, in the following way: The Tense Phrase TP denotes the core proposition; the Judgement Phrase JudgeP denotes a judgment by a person and can host subjective epistemics like certainly and evidentials like apparently; the Commitment Phrase ComP denotes a commitment by a person and hosts modifiers that affect the strength by which the speaker vouches for the truth of the asserted proposition, like really, I swear and I guess; and the ActP is interpreted as an update of the common ground and can be modified by sentence adverbs like frankly.

(7)
a. [ActP [ComP [JudgeP [TP proposition]]]]    assertions
b. [ActP [JudgeP [TP proposition]]]    exclamations, wishes
c. [ActP [TP proposition]]    declarations, observations (miratives)
d. [ActP [ComP [TP proposition]]]    assertions, directives, commissives
e. [DiscP [ActP [ …]]]    incorporating discourse

In subsequent work, I developed this approach further. In Krifka (2024), I argued that root declarative clauses can be used in a number of different ways, e.g. for declarations, observations, exclamations and wishes, as in (7b,c). The structure (7d) is a natural assumption for assertions that are not mitigated by epistemics like certainly, evidentials like apparently or evaluatives like luckily. Directives as in You come here now and commissives like I will come to your party can be understood as involving a speaker’s commitment to the truth of the proposition as well, which can be imposed by authority in the case of directives. In Krifka (2025) I proposed that certain discourse-related functions of speech acts are interpreted within a layer of a discourse phrase, as in (7e). These functions are similar to the ones modelled by the mechanism of the table in Farkas and Bruce (2010), and the Discourse Phrase DiscP has similarities to the projection of the RespP in Wiltschko (2021).

The assumption of sentence structures as in (7a-e) has features in common with the interactional spine hypothesis in (6). In particular, different pragmatic uses of sentences are reflected in their morphosyntax above the truth-conditionally interpreted TP. These different morphosyntactic structures allow for the representation of different roles in which the addressee is involved in the speech act. Let us consider them in turn.

Standard assertions are assumed to have a commitment phrase, cf. (7a). I understand commitment, following Charles Sanders Peirce (cf. Tuzet 2006) and also Dieter Wunderlich (1976: 93), as a social notion: In an assertion, the speaker guarantees that the asserted proposition is true. This requires that there be some instance that backs up this guarantee. Peirce suggests that in ordinary conversation, this is the “esteem”, or reputation, of the speaker. Similar views are put forward by Haugh (2013), who argues that the expression of intentions in conversation comes with social obligations, and Geurts (2019), who proposes that conversations consist in undergoing social commitments. According to Geurts, such commitments for a proposition are relations between two persons to consider the proposition true. In my commentary on this article (Krifka 2019), I suggest that commitments involve not only the addressee but the whole society, if we take Peirce’s notion of reputation seriously. But of course, the addressee is most directly involved as the beneficiary of the commitment. In any case, we can assume at least two roles in commitments: a “commit-er” who vouches for a proposition, and a “commit-ee” who can take this as a guarantee that the proposition is true.

What does this mean for self-talk? The commit-er and the commit-ee cannot be completely identical. Peirce (1931: 2.252) talks about “judgments” as commitments towards oneself to hold the asserted proposition true,[1] and assumes that this relation obtains between the self and the “future” self, who engage in a dialogue.[2] It is suggestive that this split between two versions of the self can explain you-centered self-talk, as in (8). I added the version with the epistemic modifier certainly here to illustrate the contribution of a JudgeP modifier.

(8)
You (certainly) acted like a fool.^{spk,adr}
spk commits to adr that (spk is certain that) adr acted like a fool.

In (8), the speaker-self accuses the addressee-self of some past foolish behavior; here the speaker-self represents the higher moral instance. This is also the case in (9), where the moral instance expresses a commitment that should encourage the addressee. Wiltschko talks about an “inner critic” or “inner coach”, which I take to be versions of the higher moral instance that talk to the self.

(9)
I know you can do it!^{spk,adr}
spk strongly commits to adr that adr is able to do a salient task.[3]

[3] Here, know works as a strengthener of the commitment; note that it has to be stressed.

The commitment account leads us to also consider cases like (10) that do not contain an instance of the 2nd person pronoun but imply the notion of an addressee.

(10)
I (certainly) acted like a fool.^{spk,adr}
spk commits to adr that (spk is certain that) spk acted like a fool.

In contrast to (8) and (9), example (10) is a confession, that is, it is directed towards a moral instance. In case of self-talk, this can be seen as the speaker’s conscience, an aspect of the self that is different from the confessor. In this case, reference to the addressee by you is not possible (as in #I must tell/confess to you that I acted like a fool), possibly because the speaker’s conscience is not conceived as an agent like an inner critic or coach. Wiltschko would not characterize cases like (10) as you-centered self-talk. What makes me think of them as involving some notion of addressee or audience is that they allow for mitigating or strengthening expressions, like certainly and definitely, which are used to weaken or strengthen the level of commitment.

The roles of the speaker and of the addressee do not have to be set to a self and a higher moral instance in self-talk. For example, in reasoning processes, a “current” epistemic self can be dissociated from a “past” epistemic self as in (11a). In reminiscing about the past, the current self can evaluate a past cognitive state as in (11b). And – a case that Wiltschko mentions as well – a speaker can talk to an externalized image, as in (11c). I will return to the different ways in which the division of speaker and addressee can play out in Section 5 below. What is relevant for linguistic purposes is that they are expressed in forms that provide an addressee parameter, in addition to a speaker parameter.

(11)
a. [I think I packed everything. Let’s think.] No, you forgot the umbrella.
b. [Oh, and the holidays I spent in Marbella!] You really liked these times.
c. [Looking into the mirror:] You have more wrinkles again.

There are other syntactic forms that require the presence of an addressee, in particular, imperatives. Wiltschko observes that they are ill-formed in I-centered self-talk because imperatives cannot have first person subjects. This is right for English, but in fact there are languages with a larger imperative paradigm that includes first and third persons, such as Evenki (cf. Aikhenvald 2017). First person imperatives have a hortative function, and we do find this in English as well; e.g. Let’s go! appears to me to be plausible self-talk. It would be interesting to see whether languages with an inclusive/exclusive distinction can use the former and the latter, which would be evidence for you-centered talk or I-centered talk, respectively.

We now turn to sentences of the form (7b) which do not have a commitment phrase that requires an addressee role, and which I analyzed as underlying declarative sentences that express exclamations, wishes and the like. Consider the following examples:

(12)
a. I acted like a fool!^{spk,(adr)}
   spk expresses an emotional attitude to the proposition that spk acted like a fool
b. You acted like a fool!^{spk,adr}
   spk expresses an emotional attitude to the proposition that adr acted like a fool

(12a) does not require an addressee for interpretation, and we can assume that it is simply dropped in self-talk, resulting in an instance of I-centered self-talk. However, if a 2nd person pronoun occurs in the sentence, the adr parameter must be present. I have the impression that (12b) can be used as self-talk as well, where the speaker (assuming the role of spk) expresses an attitude of disappointment about an externalized version of himself or herself (in the role of adr).
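The reasoning so far about when the addressee parameter is needed can be summarized in a short Python sketch. It is a deliberate simplification: the table `LAYERS` and the function `needs_addressee` are hypothetical names, the optional "(adr)" of (12a) is simply treated as absent, and the only two triggers considered are a commitment phrase and a 2nd person pronoun.

```python
# Layers of the structures (7a-d), listed from the outermost layer inwards.
LAYERS = {
    "7a": ["ActP", "ComP", "JudgeP", "TP"],
    "7b": ["ActP", "JudgeP", "TP"],
    "7c": ["ActP", "TP"],
    "7d": ["ActP", "ComP", "TP"],
}

def needs_addressee(structure, has_second_person):
    """The adr parameter is needed if a commitment phrase brings in a
    commit-ee, or if a 2nd person pronoun has to be interpreted."""
    return "ComP" in LAYERS[structure] or has_second_person

examples = [
    ("(8) You acted like a fool.", "7a", True),
    ("(10) I acted like a fool.", "7a", False),
    ("(12a) I acted like a fool!", "7b", False),
    ("(12b) You acted like a fool!", "7b", True),
]

for text, structure, second_person in examples:
    params = "spk,adr" if needs_addressee(structure, second_person) else "spk"
    print(f"{text:32} {params}")
```

Only (12a) comes out with a bare speaker parameter, which matches its classification as I-centered self-talk.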

Holmberg (2010) claims that you-centered self-talk does not occur with verbs of cognition, citing cases like #You can’t believe your luck or #You can’t take this anymore. But note that these examples are to be interpreted as idiomatic exclamations when felicitous, as in I can’t believe my luck! and I can’t take it anymore! The experiencer of an exclamation must be the speaker, which explains this difference. Wiltschko points out that sentences like It looks like you can’t believe your luck are fine. Note that this is not an exclamation but an epistemically qualified assertion, hence a ComP, which allows for you-centered self-talk, as we have seen above.

There is yet another way of interpreting the addressee parameter in self-talk, namely in imagining an addressee. This can be done in the form of imagined conversations that play out in the speaker’s head. For example, the speaker may come home, discovering that the co-inhabitant has left the kitchen a mess before leaving on a vacation. It is conceivable that the speaker performs self-talk like (13) in the absence of any addressee, acting out an imaginary situation in which the addressee is present (cf. Geurts 2018 for imagined talk). This can be done for the purpose of emotional relief, as the emotions can be directed towards an addressee, even if this addressee is only imagined.

(13)
Where are you? Why did you leave the kitchen like that? I hate you!

The syntactic structure (7c) has neither a commitment phrase nor a judge phrase. I argued that such structures underlie declarations like The buffet is open (uttered by a host), which change the social world and hence would not occur in self-talk. But, with a different semantic operator, such structures also underlie observational sentences that verbalize an aspect of the situation, without expressing a commitment or a judgement. For example, with (14a) a speaker can transfer a feature of the non-linguistic situation into the domain of language.

(14)
a. There is a fly on your shoulder.^{spk,adr}
   spk verbalizes a feature of the situation, that there is a fly on adr’s shoulder
b. This fly is buzzing again.^{spk,(adr)}
   spk is verbalizing an aspect of the situation, that a given fly is buzzing
   (presupposing that it was buzzing before)

Such observational sentences can be used in regular conversation as in (14a), but they also occur in self-talk as in (14b). This appears to be always I-centered self-talk, as there is no reason to split the self in different roles when verbalizing an observation. It is my impression that attention-getting phrases that are otherwise typical for observational sentences, as in Look, there’s a fly on your shoulder, are infelicitous in self-talk. The verbalization of aspects of the situation is arguably an important function of self-talk in general: It has been argued that the combinatorial system of language is a powerful tool to relate domain-specific cognitive systems to each other, thus providing a unique advantage to human cognition that goes beyond purely perceptional and other non-linguistic abilities (cf. e.g. Shusterman and Spelke 2005).

I return to a form that requires an addressee role because it contains a commitment phrase, (7d). Beyond their use for assertions without any judgement modifiers, such forms can also be used for commitments in which the speaker makes a promise as in (15a) or utters a command as in (15b), which involve propositions about the future. Commitment to the truth is to be understood here to “see to it that the proposition is true”, where the speaker has to be in a position to make the proposition true.

(15)
a. I will never get drunk again.^{spk,adr}
   spk commits to adr that spk will never get drunk again.
b. You will never get drunk again.^{spk,adr}
   spk commits to adr that adr will never get drunk again.

These examples can also be used in self-talk. In (15a) the speaker commits to the self as a higher moral instance to never get drunk again. This is similar to the confession (10). In (15b) the speaker is the higher moral instance and commits to self to never get drunk again, which is similar to (8).

The listing of patterns in (7) also includes the discourse phrase in (7e), which is similar to the RespP in the interactional spine structure. As this layer contains expressions that regulate the progression of conversation, we should not expect that expressions that are interpreted there occur in self-talk. But Wiltschko showed that Austrian ma and English oh!, in spite of being particles in RespP, do occur in self-talk. The logic of her argument would require, I think, that they only occur in you-centered self-talk, but Wiltschko also gives examples with I-centered self-talk as in her (59a), Oh, I’m such an idiot. However, these particles may also have expressive meanings, like surprise or lament, which are compatible with I-centered self-talk.

Independent of such examples, I can imagine that discourse-oriented expressions can occur in deliberating thoughts within oneself. I can imagine sequences of self-talk like the following:

(16)
When will T. be home? At 7, no… at 6. She doesn’t have sports today.

In the last clause, I can easily imagine nämlich in German (sie hat heute nämlich keinen Sport), which, according to Wiltschko, should be excluded in self-talk. In this dialogical mode of self-talk we probably should expect all features of regular conversation, with the likely exception of those related to turn-taking.

I’d like to close this section with the topic of vocatives, which are a recognized feature of you-centered self-talk. In her target article, Wiltschko refers to Zwicky (1974) and treats vocatives as devices to get or maintain the attention of the addressee. Actually, Zwicky distinguishes two kinds of vocatives, “calls” to get attention, and “addresses” to express a kind of relationship between speaker and addressee. Typically, vocatives in self-talk are of the latter kind, which includes epithets like idiot and my friend, whereas attention-getting vocatives belong to the genuine discourse devices. But even here one could imagine vocatives with the calling contour in self-talk, for example when alerting one’s own attention, as in Manfred, wake up!

4 Self-talk in literature

Getting evidence for self-talk is tricky (at least for the kind that is not considered pathological). Researchers may report on their own self-talk habits, but our memories may fail us, and introspective data is easily contaminated by the theories we hold about the phenomena to be investigated. Objective evidence may come from the observation of persons who are encouraged to engage in overt self-talk when solving tasks or when engaged in exercises (Gibson and Foster 2007), or who are asked to report their self-talk whenever they are alerted by an acoustic signal (Brinthaupt and Morin 2023).

I am impressed by the method of Goddard et al. (2022), who ask subjects to fill in words in speaking and thinking bubbles in cartoons; this variation of the storyboard method allows for the production of self-talk that is tightly constrained by the context, and can be compared to regular conversation. It should be noted that this method presupposes familiarity with a particular literary form, and asks the participant to engage in literary production. This opens up existing literature as a potential source of evidence for self-talk. Wiltschko mentions literature as evidence for self-talk as well but cautions that we should not take this as naturally occurring data. However, it appears to be as natural as the data from an experiment that asks participants to fill in text in thought bubbles of comic strips. It has the disadvantage, though, that it cannot be as controlled as production data in an experiment, and it cannot be repeated.

There is one interesting use of literary works that should be mentioned: If the work of literature is translated, we can study comparative self-talk across languages. The study of expression equivalents in translations of literary texts has turned out to be a valuable tool in the investigation of grammatical features, such as tenses and aspects of verbs or the marking of definiteness with nominal expressions (cf. Le Bruyn et al. 2022). This is due to the fact that literary translations are often of a particularly high quality, done by persons that know the range of possibilities in their respective languages well. Hence translations of text that include interior monologues can be used to investigate commonalities and differences of self-talk between different languages and cultures.

It is important to be aware of different literary traditions to capture self-talk, and also of the fact that self-talk might be different in different cultures and across different times. In the Western tradition, soliloquies can be traced back to Greek tragedies, where characters talk to the audience or to the chorus, revealing their thoughts and inner motivations. In the monologues of the Elizabethan era, protagonists like Hamlet reveal their inner thoughts, with the audience as “bystanders”. To illustrate, in the following soliloquy in Christopher Marlowe’s The Jew of Malta, the main character Barabas refers to himself in the 2nd person and by vocatives, and also utters commands to himself.[4]

(17)
Thus hast thou gotten, by thy policy,
No simple place, no small authority.
I now am governor of Malta. True,
But Malta hates me, and in hating me
My life’s in danger, and what boots it thee,
Poor Barabas, to be the governor,
Whenas thy life shall be at their command?
No, Barabas, this must be look’d into;
And since by wrong thou gott’st authority,
Maintain it bravely by firm policy;
At least unprofitably lose it not. (V, 2)

Clearly, this was not designed to represent the raw thought events, just as dialogue in verse was not meant to record the way people actually spoke. Rather, it was stylized to create an aesthetic experience for the audience. With the advent of realistic theatre in the 19th century, the way characters talked to each other became closer to actual conversation. But for the very reason that drove the development of realism, soliloquies became rarer. In newer, anti-naturalistic developments, as in the theatre of Bert Brecht, actors could talk through the “fourth wall” to the public again, but did this in a stylized way. This is most pronounced in musicals, where characters break into song when they want to convey their inner feelings.

While drama is a problematic genre for investigating self-talk, epic forms became highly relevant, in the form of narratives that probe into the minds of their characters from a first-person perspective. In European culture, there was a clear development from the external description of events and characters to their representation from inside, in forms like free indirect discourse that mix the external perspective of the author and the inner experience of the character (cf. Doron 1991). The most interesting forms for linguistic purposes are the interior monologue and its most extreme variant, sometimes called stream of consciousness, which attempts to describe the thoughts of the character from within.

The most famous of these works is James Joyce’s Ulysses, which is driven by the ambition to give a precise and very detailed portrait of a society at a particular point in time, and of the thoughts of three very different characters.[5] This work provides evidence not only about what people did and how they talked in Dublin in 1904, but also about how they thought. It also makes the point that the three main characters not only think different things, but think quite differently. In a rare study that uses linguistic techniques, Gast et al. (2023) find that Leopold Bloom’s thoughts differ stylistically more from his actual speech than is the case for Stephen Dedalus and Molly Bloom. Most relevant for the current topic, they find that the young intellectual Stephen refers to himself more often than the other characters, both with 1st and 2nd person pronouns (in addition to the generalizing use of you). Stephen typically refers to himself by you when he reflects on past events, which creates a distance between the current, reflecting self and the past self, as in the following example:

(18)
Houses of decay, mine, his and all. You told the Clongowes gentry you
had an uncle a judge and an uncle a general in the army. Come out of
them, Stephen. Beauty is not there. Nor in the stagnant bay of Marsh’s
library where you read the fading prophecies of Joachim Abbas.

In Krifka (2023b), I analyzed the novella Leutnant Gustl by the Austrian writer Arthur Schnitzler, which was published in 1900 and which consists of a long interior monologue of its hero, a brash young officer in Vienna. It was considered so offensive that the author was stripped of his military status as a lieutenant of the Austro-Hungarian army. In the rambling thought processes of Gustl, one can clearly identify passages of I-centered self-talk and you-centered self-talk, where the latter uses the 2nd person pronoun, vocatives, and commands to oneself. I argued that this can be described in terms of the later distinction by Sigmund Freud between Überich (Super-Ego) and Ich (Ego) in Freud (1923), as in these passages the speaker berates himself for failing to meet the standards of society. Interestingly, this aspect seems to be missing in Stephen Dedalus’s use of you in his interior monologues.

To illustrate the value of literary texts with one example, I would like to point out a feature of the other novella by Schnitzler that is completely narrated in the form of an interior monologue, Fräulein Else (published in 1924). Goddard et al. (2022) tested the form of self-talk when the speaker is looking at himself or herself in a mirror, finding a clear tendency for you-centered self-talk. In Fräulein Else, the only passage in you-centered self-talk is when the protagonist looks at herself in the mirror.

I consider the different versions of interior monologue highly interesting sources for learning more about self-talk, or at least about how we think about self-talk. Engaging with this sort of evidence would require close cooperation with literary studies, where interior monologues and other ways of depicting thoughts have been of central concern. In the next section, I will point out that we also need to engage with the theories of the self that have been developed in psychology.

5 Self-talk and theories of the self

As mentioned in the introduction of this short comment, people have reasoned about self-talk for a long time; astonishingly, linguistics is a latecomer to this discussion. There is a wealth of insights and opinions about what happens when we talk to ourselves, or when an inner voice talks to us, from philosophy and theology to psychology and the brain sciences. Linguistic work on this intriguing phenomenon should be informed by these discussions, as they may contain valuable insights. I would like to close by discussing one of them.

I have mentioned Freud’s distinction between Ego and Super-Ego, which yields new insights into Leutnant Gustl. One slightly earlier theory that might be particularly relevant for the study of self-talk goes back to William James, who introduced the distinction between the “I” and the “Me” (James 1892: 159):

“Whatever I may be thinking of, I am always at the same time more or less aware of myself, of my personal existence. At the same time it is I who am aware; so that the total self of me, being as it were duplex, partly known and partly knower, partly object and partly subject, must have two aspects discriminated in it, of which for shortness we may call one the Me and the other the I.”

James goes on to distinguish between different aspects of the “Me”, namely the empirical, the material, the social, and the spiritual. Of the “I”, he says (James 1892: 175):

“The I, or ‘pure ego’, is […] that which at any given moment is conscious, whereas the Me is only one of the things which it is conscious of. In other words, it is the Thinker.”

We have discussed a number of different modes of self-talk above. The distinctions that James makes can be fruitfully applied to them, as illustrated in the following. In (19), there is no differentiation within the self. In (19a), the speaker just verbalizes something that is going on in the situation, and in (19b), the speaker expresses an emotional attitude towards a past event or a proposition taken as a fact.

(19)
a.
There’s a fly buzzing again.
b.
I acted like a fool!

We have seen in examples (8) to (12) that in self-talk that distinguishes between a speaker role and an addressee role, the relation between the speaker and the addressee can be quite different. I illustrate this with the examples in (20).

(20)
a.
You acted like a fool. (Reflecting I addressing social Me.)
b.
You forgot the umbrella. (Reflecting I addressing empirical Me.)
c.
[Mirror] You look good! (Reflecting I addressing material Me.)

It is unclear how to analyze, in William James’ typology, the “confessing” interpretation of assertions as in (10) and (21), in which the speaker (I) is addressing a higher authority.

(21)
I (must say, I) acted like a fool.

I was first inclined to see this as an instance of the social Me addressing the reflecting I, thus as the inverse of (20a). However, it seems that it is always the reflecting I that takes the agentive role of the speaker, so this analysis cannot be right. Rather, the reflecting I addresses a moral authority that is different from any version of the Me, which would make (21) an instance of imagined communication (cf. Geurts 2018), or even of real conversation. This is different from (20a), where the reflecting I takes on the role of the moral authority that is judging the social Me.

In conclusion, I would like to stress that linguistics can profit tremendously from engaging with existing theories of self-talk. The inverse is true as well, of course: linguistic facts and arguments have the potential to guide our engagement with the different forms of self-talk. Wiltschko’s work is a very important step in this direction.


Corresponding author: Manfred Krifka, Leibniz-Zentrum Allgemeine Sprachwissenschaft Berlin (ZAS), Pariser Straße 1, 10719 Berlin, Germany
This paper is inspired by work done in the ERC Advanced Grant 787929 SPAGAD, “Speech Acts in Grammar and Discourse”, which ran from 2019 to 2024. I dedicate this small piece to the memory of John “Haj” Ross (1938–2025), whom I had the pleasure of meeting a number of times in Austin and in Berlin, and whose ideas have shaped linguistic theory beyond the performative hypothesis that is being discussed here.

References

Aikhenvald, Alexandra. 2017. Imperatives and commands: A cross-linguistic view. In Alexandra Aikhenvald & Robert M. W. Dixon (eds.), Commands, 1–45. Oxford: Oxford University Press. https://doi.org/10.1093/oso/9780198803225.003.0001.

Arka, I. Wayan. 2005. Speech levels, social predicates and pragmatic structure in Balinese. Pragmatics 15. 169–203. https://doi.org/10.1075/prag.15.2-3.02ark.

Brandom, Robert. 1983. Asserting. Noûs 17. 637–650. https://doi.org/10.2307/2215086.

Brinthaupt, Thomas M. & Alain Morin. 2023. Self-talk: Research challenges and opportunities. Frontiers in Psychology 14. 1210960. https://doi.org/10.3389/fpsyg.2023.1210960.

Doron, Edith. 1991. Point of view as a factor of content. SALT 1. 51–65. https://doi.org/10.3765/salt.v1i0.2997.

Farkas, Donka & Kim Bruce. 2010. On reacting to assertions and polar questions. Journal of Semantics 27. 81–118. https://doi.org/10.1093/jos/ffp010.

Fernyhough, Charles & Anna M. Borghi. 2023. Inner speech as language process and cognitive tool. Trends in Cognitive Sciences 27. 1180–1193. https://doi.org/10.1016/j.tics.2023.08.014.

Freud, Sigmund. 1923. Das Ich und das Es. Leipzig: Internationaler Psychoanalytischer Verlag. https://doi.org/10.1097/00005053-192404000-00085.

Gast, Volker, Christian Wehmeier & Dirk Vanderbeke. 2023. A register-based study of interior monologue in James Joyce’s Ulysses. Literature 3. 42–65. https://doi.org/10.3390/literature3010004.

Geurts, Bart. 2018. Making sense of self talk. Review of Philosophy and Psychology 9. 271–285. https://doi.org/10.1007/s13164-017-0375-y.

Geurts, Bart. 2019. Communication as commitment sharing: Speech acts, implicatures, common ground. Theoretical Linguistics 45. 1–30. https://doi.org/10.1515/tl-2019-0001.

Gibson, Alan St. Clair & Carl Foster. 2007. The role of self-talk in the awareness of physiological state and physical performance. Sports Medicine 37. 1029–1044. https://doi.org/10.2165/00007256-200737120-00003.

Ginzburg, Jonathan. 2012. The interactive stance. Meaning for conversation. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199697922.001.0001.

Goddard, Quinn, Elizabeth Ritter & Martina Wiltschko. 2022. Who am I talking to when I’m talking to myself? A cross-linguistic study. Proceedings of the Annual Conference of the Canadian Linguistic Association. https://cla-acl.ca/actes/actes-2022-proceedings.html.

Haddican, Bill. 2018. The syntax of Basque allocutive clitics. Glossa 3(1). 1–31. https://doi.org/10.5334/gjgl.471.

Haugh, Michael. 2013. Speaker meaning and accountability in interaction. Journal of Pragmatics 48. 41–56. https://doi.org/10.1016/j.pragma.2012.11.009.

Hengeveld, Kees. 1989. Layers and operators in Functional Grammar. Journal of Linguistics 25. 127–157. https://doi.org/10.1017/s0022226700012123.

Hill, Virginia. 2007. Vocatives and the pragmatics–syntax interface. Lingua 117. 2077–2105. https://doi.org/10.1016/j.lingua.2007.01.002.

Holmberg, Anders. 2010. How to refer to yourself when talking to yourself. Newcastle Working Papers in Linguistics 16. 57–65.

James, William. 1892. Psychology: Briefer course. New York: Henry Holt & Company. https://doi.org/10.1037/11630-000.

Krifka, Manfred. 2015. Bias in commitment space semantics: Declarative questions, negated questions, and question tags. SALT 25. 328–345. https://doi.org/10.3765/salt.v25i0.3078.

Krifka, Manfred. 2019. Commitments and beyond. Theoretical Linguistics 45. 73–91. https://doi.org/10.1515/tl-2019-0006.

Krifka, Manfred. 2023a. Layers of assertive clauses: Propositions, judgments, commitments, acts. In Jutta Hartmann & Angelika Wöllstein (eds.), Propositionale Argumente im Sprachvergleich: Theorie und Empirie, 115–181. Tübingen: Narr.

Krifka, Manfred. 2023b. Linguistik des Selbstgesprächs, mit Evidenz aus Schnitzlers “Leutnant Gustl”. Grazer Linguistische Studien 94. 343–365.

Krifka, Manfred. 2024. Performative updates and the modeling of speech acts. Synthese 203. 31. https://doi.org/10.1007/s11229-023-04359-0.

Krifka, Manfred. 2025. Propositional discourse referents and anaphora in dialogue. Sinn und Bedeutung 29. 807–824.

Latinjak, Alexander T., Alain Morin, Thomas M. Brinthaupt, James Hardy, Antonis Hatzigeorgiadis, Philip C. Kendall, Christopher Neck, Emily J. Oliver, Małgorzata M. Puchalska-Wasyl, Alla V. Tovares & Adam Winsler. 2023. Self-talk: An interdisciplinary review and transdisciplinary model. Review of General Psychology 27. 355–386. https://doi.org/10.1177/10892680231170263.

Le Bruyn, Bert, Martín Fuchs, Martijn van der Klis, Jianan Liu, Chou Mo, Jos Tellings & Henriëtte de Swart. 2022. Parallel corpus research and target language representativeness: The contrastive, typological, and translation mining traditions. Languages 7(3). 176. https://doi.org/10.3390/languages7030176.

Lewis, David. 1980. Index, context, and content. In Stig Kanger & Sven Öhman (eds.), Philosophy and grammar, 79–100. Dordrecht: Reidel. https://doi.org/10.1007/978-94-009-9012-8_6.

Miyagawa, Shigeru. 2022. Syntax in the treetops. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/14421.001.0001.

Peirce, Charles Sanders. 1931. Collected Papers of Charles Sanders Peirce. Cambridge, MA: Harvard University Press.

Ritter, Elizabeth & Martina Wiltschko. 2021. Grammar constrains the way we talk to ourselves. Proceedings of the 2021 CLA Conference. https://cla-acl.ca/actes/actes-2021-proceedings.html.

Ross, John Robert. 1970. On declarative sentences. In Roderick Jacobs & Peter Rosenbaum (eds.), Readings in English transformational grammar, 222–272. Waltham, MA: Ginn.

Shusterman, Anna & Elizabeth Spelke. 2005. Language and the development of spatial reasoning. In Peter Carruthers, Stephen Laurence & Stephen Stich (eds.), The innate mind. Structures and contents, 89–106. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780195179675.003.0006.

Speas, Margaret & Carol Tenny. 2003. Configurational properties of point of view roles. In Anna-Maria Di Sciullo (ed.), Asymmetry in grammar, 315–344. Amsterdam: John Benjamins. https://doi.org/10.1075/la.57.15spe.

Tuzet, Giovanni. 2006. Responsible for truth? Peirce on judgement and assertion. Cognitio 7. 317–336.

Wiltschko, Martina. 2021. The grammar of interactional language. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108693707.

Wunderlich, Dieter. 1976. Studien zur Sprechakttheorie. Frankfurt/M.: Suhrkamp.

Zwicky, Arnold. 1974. Hey, Whatsyourname. CLS 10. 787–901. https://doi.org/10.1017/S002222670000400X.

Published Online: 2025-11-11
Published in Print: 2025-10-27

© 2025 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
