Article Open Access

A related-event approach to event integration in Japanese complex predicates: iconicity, frequency, or efficiency?

  • Yiting Chen
Published/Copyright: July 29, 2024

Abstract

Event integration – the conflation of multiple events into a unitary event – plays a vital role in language and cognition. However, the conditions under which event integration occurs in linguistic representation and the differences in how linguistic forms encode complex events remain unclear. This corpus study examines two types of Japanese complex predicates – compound verbs [V1-V2]V and complex predicates consisting of a deverbal compound noun and the light verb suru ‘do’ [[V1-V2]N suru]V – using an original “related-event approach”. Findings indicate that [[V1-V2]N suru]V can be established based on coextensiveness alone, whereas [V1-V2]V typically requires direct or shared causality (“the inevitable co-occurrence constraint”). The related-event approach examines related events of linguistic concepts, such as causes and purposes of an event, identified through “complex sentences” from ultra-large-scale web corpora. This study demonstrates that such an approach is effective in clarifying causal relationships between verbs. Furthermore, this paper contributes to the “iconicity versus frequency” debate by showing that conceptually more accessible events (causality plus coextensiveness) tend to be represented in a simpler form than less accessible events (coextensiveness only), due to “efficiency”. The frequency of usage is a result of the nature of concepts rather than the driving force of coding asymmetries.

1 Introduction

Humans perceive discrete events in continuous streams of observed information (Clewett et al. 2020; Radvansky and Zacks 2017; Zacks et al. 2007). Some events tend to be fused as unified events and expressed by more concise linguistic forms than others, as illustrated in Talmy (2000). For instance, the events “throwing” and “putting in” can be fused into a unified event and expressed as (1a), whereas events like “picking up” and “putting in” cannot be expressed in a single clause, as shown in (1b).

(1)
a. I threw the ball into the box.
b. *I picked up the ball into the box.
   Intended: ‘I picked up the ball and put it into the box’.

Based on cross-linguistic manifestations of correspondence between concise forms and specific types of events, Talmy argues that the underlying conceptual structure of language involves a certain kind of event complex that he calls a “macro-event”. A macro-event can be conceptualized as composed of two simpler events and the relation between them, but it can also be conceptualized as a unitary event and thus lend itself to representation by a single clause (Talmy 2000: 213).

Since events are basic units of both experience and language, identifying what events can undergo such conceptual integration is an important task in both cognitive science (Barbey and Patterson 2011; Cutting 1981, 2014; Fausey and Boroditsky 2011; Fausey et al. 2010; Kurby and Zacks 2008; Radvansky and Copeland 2000; Radvansky and Zacks 2011, 2017; Zacks and Swallow 2007; Zacks and Tversky 2001; Zacks et al. 2007, 2009) and linguistics (Bohnemeyer and Pederson 2010; Bohnemeyer et al. 2007; Flecken et al. 2014; Gamerschlag 2002; Givón 2001; Haspelmath 2016; Kaufmann 1995; Kaufmann and Wunderlich 1998; Matsumoto 1996; Slobin et al. 2014; Talmy 2000; von Stutterheim et al. 2012). We can understand human cognition better by capturing the regularities of event integration in language.

Despite the large literature on event integration, several critical questions still need to be answered. Talmy (2000: 226) points out that “the precise factors that permit conceptual integration of an event complex for linguistic expression” remain unclear. Another significant theoretical inquiry involves identifying the difference between the events represented by different forms when a language offers multiple ways to express complex events. This is particularly relevant when considering “iconicity”, which assumes an iconic relation between a form and its meaning. These issues will be explored in greater detail in Sections 1.1 to 1.3.

1.1 Previous hypotheses for the conditions of event integration

Several hypotheses have been proposed regarding the conditions for event integration. In the study of philosophy, Lemmon (1967) suggested that events may be identified with space-time zones: if events e1 and e2 are identical, they must take place over the same period of time. However, as Davidson (1969) countered, just because two events happened at the same time and place does not mean they constitute the same event. For instance, an increase in the temperature of a metal ball in a given period and its rotation by 35° in the same period can hardly be considered the same event. Instead, Davidson (1969: 306) argued that “events are identical if and only if they have exactly the same causes and effects”.

In contrast to this emphasis on causality, several hypotheses in linguistics claim that in addition to causality, the concept of “coextensiveness” (multiple events occur at the same time) is also a critical factor in event integration. One of these hypotheses is a constraint called “coherence” (Gamerschlag 2002; Kaufmann 1995; Kaufmann and Wunderlich 1998; Wunderlich 1997).

(2)
COHERENCE
Subevents encoded by the predicates of a decomposed SF [Semantic Form, a highly restricted partial semantic representation] structure must be contemporaneously or causally connected. (Kaufmann and Wunderlich 1998: 6)

A similar constraint is the lexicalization constraint proposed by Matsumoto (1996), shown in (3).

(3)
(i) Determinative Causation Condition: one of the component events must be the only crucial cause of the other; or
(ii) Coextensiveness Condition: the main component event must be temporally coextensive with (1) the subordinate component event itself, or (2) its result or effect, or (3) an intention to execute or actualize it.
(Matsumoto 1996: 269)

These constraints suggest that event integration is possible when either causality or coextensiveness is satisfied. This paper refers to these hypotheses collectively as the “Causality or Coextensiveness Hypothesis”. Goldberg’s (2010) more radical claim that neither causality nor coextensiveness is necessary is discussed in Section 4.2.

When two events are causally connected, many languages express them in a more concise linguistic form, such as resultative constructions found in various languages (e.g., drink oneself senseless) and verb–verb “compound verbs” (sequence of two verbs forming a single word) in Japanese (e.g., toke-otiru (melt-fall) ‘melt down’), Chinese (e.g., dǎ-huài (hit-be.broken) ‘hit and break’), Korean (e.g., ccill-e-cwukita (stab-cp-kill) ‘stab to death’), Hindi (khoj nikāl-nā (search take.out-inf) ‘find, discover (after effort)’), etc. The cross-linguistic abundance of such expressions manifests the centrality of causality in the event integration of language.

Cognitive science research has also shown that causality is crucial for event integration. Events occurring at different times are unlikely to be considered part of the same event unless causally related (Radvansky and Zacks 2011). By contrast, causally related pieces of information are more likely to be regarded as the same event (Zacks et al. 2009), and causally related information is more likely to be remembered than non-causally related information (Myers et al. 1987). In cognitive neuroscience research, it is argued that the repeated experience of two causally related events strengthens the connection between them, leading to automatic pattern completion when one component is perceived (Barbey and Patterson 2011). Thus, even if two causally related events are not perceived simultaneously, they are experienced as a single unified event due to their strong, non-accidental co-occurrence.

As shown above, studies in linguistics and cognitive science have shown that causality plays an important role in event integration. The question is whether coextensiveness alone is sufficient for event integration. Since previous hypotheses on event integration in language are based on introspection, proving that two events are genuinely related as these hypotheses suggest is challenging. This study tests the Causality or Coextensiveness Hypothesis using corpus data, specifically investigating Japanese “complex predicates” (predicates consisting of multiple grammatical elements) that satisfy coextensiveness (henceforth “coextensive complex predicates”). If no causal relationship is found between the two events represented by the verbs in a coextensive complex predicate, then coextensiveness alone is sufficient for their integration into a complex event (supporting the Causality or Coextensiveness Hypothesis). Otherwise, causality is necessary for event integration.

1.2 Differences in event integration between different forms

In Japanese, multiple linguistic means are used to express complex events. These include verb–verb compound verbs and “deverbal complex predicates”, which consist of a deverbal compound noun and the light verb suru ‘do’, as exemplified in Table 1.

Table 1:

Examples of two different complex predicates to express complex events in Japanese.

Expressions    Examples
Compound verbs: [V1-V2]V
    tobi-oriru (jump-go.down) ‘jump down’
    keri-kowasu (kick-break) ‘break something by kicking it’
    hasiri-mawaru (run-go.around) ‘run around’
    naki-sakebu (cry-shout) ‘cry and shout’
Deverbal complex predicates: [[V1-V2]N suru]V
    tere-warai suru (be.embarrassed-laugh do) ‘be embarrassed and laugh’
    moti-nige suru (hold-escape do) ‘make off with something’
    kogoe-zini suru (freeze-die do) ‘freeze to death’
    tati-gui suru (stand-eat do) ‘eat while standing’

In (4a), korogari-otiru is an intransitive compound verb, whereas sagasi-mawaru in (4b) is a transitive compound verb. Similarly, tere-warai suru in (5a) is an instance of an intransitive deverbal complex predicate, and tati-gui suru in (5b) is a transitive deverbal complex predicate.

(4)
a. booru ga saka o korogari-otiru.
   ball nom hill acc roll-fall
   ‘A ball rolls down the hill’.
b. tori ga esa o sagasi-mawat-te-iru.
   bird nom food acc search-go.around-te-asp
   ‘A bird is scavenging for food’.
(5)
a. gakusei wa tere-warai si-ta.
   student top feel.embarrassed-laugh do-pst
   ‘The student felt embarrassed and laughed’.
b. watasi wa susi o tati-gui si-ta.
   I top sushi acc stand-eat do-pst
   ‘I ate sushi standing up’.

While some [V1-V2]V and [[V1-V2]N suru]V are interchangeable, many are not and can take only one of the two forms (korogari-otiru (roll-fall) versus *korogari-oti suru (roll-fall do), *tati-kuu (stand-eat) versus tati-gui suru (stand-eat do)). Despite recent studies by Suzuki (2018) and Li (2019) focusing on these two expression types, their differences remain unclear. This paper argues that a key difference is the role of causality.

In compound verbs ([V1-V2]V), V1 takes the renyookei (infinitive) form to combine with V2 directly. This word formation pattern is ubiquitous in Japanese. According to Lieber (1992), the least productive compounds in English are those containing verbs. By contrast, compounds involving verbs are highly productive and widespread in Japanese (Kageyama 2009: 512). For instance, the “web-based database of Japanese compound verbs” (http://csd.ninjal.ac.jp/comp/index.php), developed by the National Institute for Japanese Language and Linguistics, lists 3,757 verb–verb compound verbs.

Deverbal complex predicates [[V1-V2]N suru]V are formed by combining two verbs, V1 and V2, in the renyookei form to create a compound noun, which is then combined with the light verb suru. Suzuki (2018) provides a dictionary-based list of 1,141 Japanese deverbal compound nouns [V1-V2]N (see also Yumoto 2016). Among them, 197 words can be combined with suru to function as verbs (i.e., to inflect for tense). A complex predicate [[V1-V2]N suru]V consists of three elements (V1, V2, and suru). This greater formal complexity and articulation cost sets it apart from compound verbs (direct combinations of two verbs), suggesting that V1s and V2s in the form of [[V1-V2]N suru]V will have a different relationship from those in [V1-V2]V.

It is important to examine how the encoding of complex events varies among different linguistic forms, particularly to determine whether the differences in formal complexity arise from variations in the concepts being represented or from their frequency.

1.3 Iconicity, frequency, and efficiency

There is an ongoing “iconicity versus frequency” debate about whether iconicity or frequency motivates “coding asymmetries” – oppositions where one meaning is conveyed in a simpler form and another in a more complex form (Croft 2008; Haiman 2008; Haspelmath 2008a, 2008b). Iconicity refers to the phenomenon where the structure of a language reflects the structure of experience in some way (Croft 2002; Haiman 1980a, 1983, 1985; Jakobson 1965, 1977; Peirce 1974 [1931]; Smith 2002; Van Langendonck 1995, 2007; Waugh 1992). Haspelmath (2008a) challenged the applicability of iconicity in explaining grammatical coding asymmetry, providing numerous examples that contradict the prediction of iconicity.[1] As shown in (6), less marked concepts could be expressed by more marked forms (Haspelmath 2008a: 8).

(6)
            less marked / unmarked                         (more) marked
number      plural: Welsh plu ‘feathers’                   singular: plu-en ‘feather’
case        object case: Godoberi mak’i ‘child’            subject case: mak’i-di (ergative)
person      second p. imperative: Latin canta-Ø ‘sing!’    third p. imperative: canta-to ‘let her sing’
gender      female: English widow-Ø                        male: widow-er
causation   causative: German öffnen                       noncausative: sich öffnen

Based on the principles of economy proposed by Horn (1921) and Zipf (1935), Haspelmath (2008a) argued that since frequently used expressions are more predictable, this high predictability leads to shortness of coding for reasons of economy. He provided examples of adjectives and abstract nouns, object-marking, possessive constructions, causative constructions, and so on, which are better explained by the Zipfian effect of frequency than by iconicity (cf. Devylder 2018 for an explanation of possessive constructions based on iconicity).

A challenge to the frequency-based explanation is the potential non-causal correlation between frequency and coding asymmetry, which could be a co-occurrence caused by other factors. Haspelmath (2021) argues against a singular factor X that causes both frequency and coding shortness (Figure 1), suggesting instead that frequency causes coding asymmetry.

Figure 1: 
Factor X as a confounder (Haspelmath 2021: 607).

This paper argues that the nature of concepts is the ultimate factor X, leading to both frequency and shortness of coding, via “accessibility” and “efficiency” (Levshina 2018, 2022; Levshina and Moran 2021). Efficiency, crucial for successful communication, is evident in various linguistic phenomena (Du Bois 1985; Grice 1975; Haiman 1983; Zipf 1949). Recent studies by Hawkins (2014), Gibson et al. (2019), and Levshina (2018, 2022) collectively demonstrate that many phenomena can be explained by the unifying principle of efficiency. According to Levshina (2022), efficiency, a product of biological evolution, is an inherent property of living organisms. It means minimizing the cost-to-benefit ratio. Language users save effort in processing and vocalization, and language structure may reflect this tendency (Levshina and Moran 2021).

One principle proposed by Levshina (2022) to achieve communication efficiency is the principle of the negative correlation between accessibility and cost.

(7)
The principle of negative correlation between accessibility and costs:
To be an efficient communicator, language users should spend less effort and time on highly accessible information, and more effort on less accessible information.
(see Levshina 2022: 22)

This idea is also seen in Chafe (1987) and Givón (1995), among others. Haspelmath’s (2021) notion of “coding efficiency” (high-cost forms are used only for meanings with low predictability) is a similar concept.

According to Levshina (2022: 18), accessibility refers to the ease with which a mental representation or form can be activated in or retrieved from memory (see also Ariel 1990). The accessibility of a referent depends on various factors, such as the concept it represents and its frequency. Regarding concept-driven accessibility, as noted in Section 1.1, causally related information is more memorable. Therefore, events that are both causal and coextensive are more easily remembered and accessed than those that only exhibit coextensiveness. Expressions that are longer and require more articulation effort and time are typically used for conveying less accessible information. Conversely, shorter forms are generally used for highly accessible concepts. These observations lead to the following hypothesis of this study regarding the coding asymmetry in the context of causality.

(8)
The hypothesis of causal coding asymmetry:
Except for single words, events with a causal link tend to be expressed more concisely than events without such a link.

Haspelmath (2021) argues that predictability, which can be considered a type of accessibility in this study, and coding efficiency are factors contributing to coding asymmetry. Importantly, he attributes this predictability or accessibility to frequency. However, this study will demonstrate that it is not frequency-driven accessibility that accounts for the asymmetry in encoding, but rather concept-driven accessibility.

1.4 The aim and the structure of this study

In sum, by examining [V1-V2]V and [[V1-V2]N suru]V, this study aims to answer the following research questions:

  1. What are the conditions placed on “event integration”, the conflation of multiple events into a unitary event, in language?

  2. What are the differences in event integration between different forms?

  3. What drives the coding asymmetry?

To address these questions, this study employs an original method called the “related-event approach”. This approach sheds light on the nature of verbs by examining their related events, such as causes, results, means, and purposes. To clarify the precise factors that permit event integration, events related to V1 and V2 that constitute [V1-V2]V and [[V1-V2]N suru]V were collected by searching for expressions, such as “complex sentences” (see Minami 1974), containing information about related events using an ultra-large-scale Japanese web corpus. Specifically, this paper analyzes whether causality is required in addition to coextensiveness in the events represented by V1 and V2 that make up a complex predicate. The findings show that V1s and V2s in [V1-V2]V typically require both coextensiveness and either direct causality or “shared causality” (indirect causal relation through a common cause or purpose, see Section 2.1). By contrast, many V1s and V2s in [[V1-V2]N suru]V require only coextensiveness.

Furthermore, this paper contributes to the iconicity versus frequency debate by showing that, due to efficiency, conceptually more accessible events (causality plus coextensiveness) tend to be represented in a formally simpler way than less accessible events (coextensiveness only), and that the frequency of usage is a result rather than a driving force.

The paper is organized as follows: Section 2 describes the related-event approach, the data collection method, the procedure to determine the causal relationship, and a Python script developed to facilitate the determination of causality. Section 3 presents the results of this corpus study, highlighting a (negative) correlation between conceptual type and formal complexity, which cannot be explained by frequency. Additionally, this section reveals that frequency is not the underlying cause of coding asymmetry, as evidenced by statistical analysis. Section 4 discusses the posed questions and summarizes the interplay among conceptual type, formal complexity, and frequency, from an efficiency perspective. The paper concludes in Section 5, outlining future potential applications of the related-event approach.

2 Methods

This section outlines the methodology, beginning with an overview of the related-event approach with its application to identifying causal relationships and the corpora used in this research. It then provides a detailed description of the analytical procedure for examining causal relationships in Japanese complex predicates. The procedure is as follows:

  1. Select coextensive complex predicates from sources such as the web-based database of Japanese compound verbs.

  2. Collect the related events of V1 and V2 that constitute these complex predicates from the web corpus.

  3. Determine the causal relationship between V1 and V2 based on the collected related events.

2.1 The related-event approach and its application to identifying causality

To see the relationship between the events represented by V1 and V2, this study focuses on the events related to these verbs. Related events are the various matters in the background of a linguistic concept, which can be considered a form of “encyclopedic knowledge” (Bolinger 1965; Evans 2009; Haiman 1980b; Langacker 1987) linked to that concept. Related events of verbs include the means of causation, the purpose and reason of an agentive action, the cause of a non-agentive action, the manner of a motion,[2] the result and presupposition of an action, and co-occurring events (see Chen 2013; Chen and Matsumoto 2018). Here, related events are the “typical” means, results, manner, etc., which are not entailed by the verb (similar concepts can be seen in Boas’s (2003) “prototypical outcomes” and Washio’s (1997) “conventionally expected results”). For instance, while striking and kicking are possible means of the causative verb break, stroking and licking are unlikely. Note that striking and kicking are not entailed in the meaning of break.

Previous studies, such as Generative Lexicon (Pustejovsky 1995) and Frame Semantics (Fillmore 1982, 1985; Fillmore and Baker 2010), utilize encyclopedic knowledge of related events. Yet, this event-related knowledge has been used in an incomplete form, as a black box, without clarification of its content. For instance, FrameNet (http://framenet.icsi.berkeley.edu; see Fillmore et al. 2003), an English lexical database from the International Computer Science Institute, uses “frames” (schematic knowledge structures containing encyclopedic knowledge necessary to understand a concept) to describe word meanings. In FrameNet, the verb run evokes the frame Self_motion, and its lexical entry includes core frame elements like Self_mover, Path, and Goal, as well as related events such as Purpose and Manner, considered (non-core) peripheral frame elements (Ruppenhofer et al. 2016: 24). However, FrameNet does not specifically clarify the kinds of information contained in the related events, such as the specific results and purposes connected to a verb. The related-event approach proposed in this study complements Frame Semantics by elucidating the contents of these related events, thereby enriching our understanding of frame elements.

Recent research shows that concepts in our brains do not exist in isolation but are connected to related concepts. The theory of situated (grounded) cognition in cognitive science argues that concepts are stored in the context in which they exist (Barsalou 2003; Simmons et al. 2008; Yeh and Barsalou 2006). Furthermore, a study combining neuroscience (specifically, fMRI) and neural networks demonstrates that our brains encode not only the concepts themselves but also the relationships between them (Zhang et al. 2020).

The related-event approach builds on this theoretical foundation, which can be used to analyze various linguistic phenomena based on concepts related to a linguistic concept. By identifying the related events of verbs, it is possible to determine the relationship between events and whether two events can be integrated into a complex event. For instance, in English reflexive resultatives (see Levin and Rappaport Hovav 2004), the sentence Mary ran herself breathless is acceptable, whereas *Mary ran herself full is not. These variations in acceptability are inadequately explained by previous semantic structures that only consider the event denoted by the verb. By linking related events to semantic structures, we can more accurately predict the possible results associated with a verb.

Identifying a direct causal link between two events, such as “running” and “becoming breathless”, is relatively straightforward, even with introspection-based analysis. However, uncovering the connection becomes more complex when two events are indirectly causally related (shared causality). For instance, at first glance, the events of a person “shouting” and “going around” may appear causally unrelated. However, a deeper examination of the purposes associated with each event reveals a common purpose: “to tell”, as illustrated in Figure 2. Two events in a means–purpose relationship are also considered to be causal; as Talmy (2000: 509) states, agentive situations also involve causal chains. The related-event approach offers a more effective way to uncover such “hidden” connections between events.

Figure 2: 
The shared causality (the common purpose “to tell”) between “shouting” and “going around” (only partial information about related events is shown). Since most verb meanings are polysemous, the mere presence of the same related event does not necessarily mean that they share it; two verbs share a certain related event if and only if they are used in the same frame (in this case, the Telling frame).

Here, shared causality is defined as common cause type and common purpose type, as illustrated in Figure 3.

Figure 3: 
The shared causality types of Japanese complex predicates.

When two events, A and B, are positively correlated and neither event causes the other, it is well known that A and B share a common cause (Hitchcock 1998; Reichenbach 1956). Similar to causes, purposes are known to be closely related to event integration. Our actions are inherently purpose-oriented. As long as events share the same purpose, they are recognized as a single event.

[Events] are directed toward a goal; the goal of a wedding is to formalize a union, and the goal of breakfast is to sate one’s hunger. (Zacks et al. 2007: 273)

Conversely, when the purpose of an action changes, event boundaries tend to arise.

Viewers tend to identify event boundaries at points of change in the stimulus, ranging from physical changes, such as changes in the movements of the actors, to conceptual changes, such as changes in goals or causes. (Kurby and Zacks 2008: 72)

Thus, purpose also plays a major role in event integration; multiple events with the same purpose are easily integrated as part of the same event.

Given that shared causality is likely to be limited to common cause and common purpose types, at least in Japanese, this study focuses on these two types. In other words, shared causality here does not include common result and common means types. When two events produce the same result (common result), it is rare for them to occur simultaneously. Involuntary events lack practical reasons to occur at the same time, and voluntary simultaneous events producing the same result fall under the common purpose type, as described above. For instance, “freezing” and “drowning” are involuntary events that may lead to the same result, “dying”. However, without a common cause or a direct causal relationship, freezing and drowning occurring simultaneously is coincidental. Similarly, instances in which a single action (common means) accomplishes two purposes simultaneously (killing two birds with one stone) are quite limited.

2.2 Corpora

To identify the causal relationship between V1 and V2 in Japanese complex predicates, this study collected data from the Japanese Web 2011 corpus (henceforth jaTenTen11, see Srdanović et al. 2013), a Japanese web text corpus comprising more than 8 billion words. The CQL (Corpus Query Language) search function of the web interface Sketch Engine (http://www.sketchengine.eu) was used to search jaTenTen11.

To determine whether the frequency effect can account for the coding asymmetry, the frequencies of both [V1-V2]V and [[V1-V2]N suru]V are examined based on data drawn from the Balanced Corpus of Contemporary Written Japanese (henceforth BCCWJ, http://ccd.ninjal.ac.jp/bccwj/en/index.html) using the Chunagon web interface program. Unlike jaTenTen11, BCCWJ is a balanced corpus (over 100 million words) that can ensure the accuracy of frequency-based studies.

2.3 Selection of research targets

To identify the causal relationship between V1 and V2 in [V1-V2]V, I analyzed 214 coextensive [V1-V2]V selected from the web-based database of Japanese compound verbs. The selection process involved the following steps:

  1. Exclude examples involving rendaku (where the first syllable of V2 changes from unvoiced to voiced), because it is assumed that compound verbs involving rendaku are likely to be derived from verbal compound nouns (see Chen and Matsumoto 2018). As a result, 30 compound verbs, such as kogoe-zinu (freeze-die), were excluded.

  2. Categorize compound verbs not derived from verbal compound nouns as coextensive complex predicates only if they can be rephrased as “V1 nagara V2” (V1 while V2) and if these paraphrased versions are present in jaTenTen11.[3]

Note that V1 nagara V2 must be used in the same sense as [V1-V2]V. The meaning identification is based on Frame Semantics. Specifically, V1 and V2 must be used in the same frame as in the compound predicate and have similar frame element relationships. For instance, in the case of humi-ireru (stamp-put.in) ‘step in’, although its paraphrase humi nagara ireru (stamp while put.in) ‘step on (the gas pedal) and put it in (gear)’ is found in jaTenTen11, they evoke two different frames: the Arriving frame and the Operate_vehicle frame. Thus, humi-ireru was excluded from the analysis (frame identification is based on Ruppenhofer et al. 2016).
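The selection steps above can be sketched as a simple filter. This is only a minimal illustration, not the procedure actually used in the study: the `Candidate` record and its three flags are hypothetical annotations that would have to be supplied by manual inspection (rendaku, corpus attestation of the nagara paraphrase, and frame identity). For kogoe-zinu only the rendaku flag is taken from the text and for humi-ireru the attested-but-different-frame status is; all other flag values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A compound-verb candidate; all flags are hypothetical manual annotations."""
    form: str               # romanized compound, e.g. "humi-ireru"
    has_rendaku: bool       # V2's initial consonant is voiced in the compound
    nagara_attested: bool   # "V1 nagara V2" is found in the corpus
    same_frame: bool        # the paraphrase evokes the same frame as the compound


def is_coextensive(c: Candidate) -> bool:
    """Apply step 1 (rendaku exclusion), step 2 (nagara paraphrase), and the frame check."""
    if c.has_rendaku:       # step 1: likely derived from a compound noun
        return False
    return c.nagara_attested and c.same_frame


# Toy input; flag values other than those described in the text are placeholders.
candidates = [
    Candidate("hasiri-mawaru", False, True, True),   # placeholder flags
    Candidate("kogoe-zinu", True, True, True),       # excluded: rendaku
    Candidate("humi-ireru", False, True, False),     # excluded: different frames
]
selected = [c.form for c in candidates if is_coextensive(c)]
print(selected)
```

The frame check is kept as a separate flag because, as noted above, attestation of the paraphrase alone is not sufficient.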

To determine the causal relationships between V1 and V2 in [[V1-V2]N suru]V, I analyzed 29 coextensive complex predicates out of the 197 deverbal complex predicates in the list compiled by Suzuki (2018). The identification of coextensive [[V1-V2]N suru]V followed the methodology used for compound verbs, relying on the occurrence of the paraphrase “V1 nagara V2” in jaTenTen11 to confirm their status as coextensive complex predicates and to ensure that V1 and V2 are used in the same sense (frame) as in [[V1-V2]N suru]V.[4]

2.4 Collection of related events

For the selected coextensive complex predicates, I collected the related events of V1 and V2 to examine their causal relationship. I collected causes for intransitive verbs representing non-agentive actions (cause verbs) and purposes for verbs representing agentive actions (purpose verbs). Note that these two types are not mutually exclusive; some verbs fall into both categories, such as sakebu ‘shout’ and tobu ‘fly’.

To collect events related to V1 and V2, I searched jaTenTen11 using the keywords presented in Tables 2 and 3 (V is the verb to be investigated).[5]

Table 2:

Keywords for collecting causes.[a]

| Keywords | Examples |
| --- | --- |
| V to V | kogoeru to sinu (freeze then die) ‘die from freezing’ |
| N ni V | itami ni nai-ta (pain dat cry-pst) ‘cried from pain’ |
| N de V | syokku de nai-ta (shock in cry-pst) ‘cried in shock’ |

[a] Of the keywords in Tables 2 and 3, “N ni V” and “N de V” were searched for V appearing immediately after the adjacent element. However, because of the relatively small number of “V tame ni V” and “V to V”, V is set to co-occur within five words of its adjacent element.

Table 3:

Keywords for collecting purposes.

Keywords Examples
V tame ni V sagasu tame ni mawaru (search purp go.around) ‘go around to search’

For instance, by using the expression “V tame ni hasiru” and observing what events are presented in the slot of V, we can gather the purposes of hasiru ‘run’, such as “to help someone” (based on the expression tasukeru tame ni hasiru ‘run to help’).
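The slot-filling idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hits are romanized, whitespace-tokenized strings rather than lemmatized corpus tokens, the function name `collect_purposes` and the sample hits are hypothetical, and the code is not the Sketch Engine query used in the study. The `window` parameter mirrors the five-word span mentioned in the note to Table 2.

```python
from collections import Counter

def collect_purposes(hits, target, window=5):
    """Count verbs filling the V slot of 'V tame ni <target>'.

    `hits` are romanized, space-separated strings (an assumption);
    `window` is the number of tokens after 'tame ni' that are searched.
    """
    counts = Counter()
    for hit in hits:
        tokens = hit.split()
        for i in range(1, len(tokens) - 2):
            if tokens[i] == "tame" and tokens[i + 1] == "ni":
                # Search for the target verb within `window` tokens after 'tame ni'.
                following = tokens[i + 2 : i + 2 + window]
                if target in following:
                    counts[tokens[i - 1]] += 1  # the purpose verb precedes 'tame'
    return counts

# Hypothetical concordance hits, for illustration only.
hits = [
    "tasukeru tame ni hasiru",
    "kare o tasukeru tame ni mati o hasiru",
    "siraberu tame ni hasiru",
    "siraberu tame ni mawaru",
]
purposes = collect_purposes(hits, "hasiru")
print(purposes.most_common())
```

Each counted item is only a candidate purpose; in the procedure described here, the example sentences still have to be read to confirm the relation.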

In total, the Japanese coextensive complex predicates are made up of 186 unique single verbs, which collectively yielded 1,193,808 instances of related events. Owing to a limitation of Sketch Engine, the number of instances per verb is capped at 10,000. On average, each verb has 6,418.32 instances of related events.

2.5 Determination of causality and the Related Event Checker

To determine the causal relationships between V1s and V2s, I first examined whether each pair had a direct causal relationship, based on the causes collected by “V_cause to V_result” and the purposes collected by “V_purpose tame ni V_means”. If no direct causality was found, I searched for shared causality between V1 and V2, based on the causes collected by “N_cause ni V_result” and “N_cause de V_result”,[6] and/or the purposes collected by “V_purpose tame ni V_means”, depending on the types of V1 and V2. If at least one instance indicated a causal relationship, I identified the combination V1-V2 as causal.[7]
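The two-step decision just described can be sketched as follows. This is a simplified illustration, not the author's script: the function name `classify_pair`, the data structures, and the tiny lookup tables are assumptions, and the output is only a candidate label that would still need the manual frame check.

```python
def classify_pair(v1, v2, direct, causes, purposes):
    """Return a candidate causal label for the pair V1-V2.

    direct:   set of (cause_or_purpose_verb, result_or_means_verb) pairs
              attested in 'V to V' or 'V tame ni V' hits
    causes:   dict mapping a verb to cause nouns ('N ni V', 'N de V')
    purposes: dict mapping a verb to purpose verbs ('V tame ni V')
    """
    # Step 1: direct causality, in either direction.
    if (v1, v2) in direct or (v2, v1) in direct:
        return "direct"
    # Step 2: shared causality through a common cause or a common purpose.
    common_causes = causes.get(v1, set()) & causes.get(v2, set())
    common_purposes = purposes.get(v1, set()) & purposes.get(v2, set())
    if common_causes or common_purposes:
        return "shared"
    return "non-causal candidate"

# Toy lookup tables built from examples cited in this article.
direct = {("korogaru", "otiru")}                      # korogaru to otiru
causes = {"nayamu": {"renai"}, "kurusimu": {"renai"}}
purposes = {"hasiru": {"siraberu"}, "mawaru": {"siraberu"}}

print(classify_pair("korogaru", "otiru", direct, causes, purposes))
print(classify_pair("nayamu", "kurusimu", direct, causes, purposes))
print(classify_pair("hasiru", "mawaru", direct, causes, purposes))
print(classify_pair("warau", "aruku", direct, causes, purposes))
```

Checking direct causality first keeps the cheaper lookup ahead of the set intersections, which matches the order of the manual procedure.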

Similar to the selection of complex predicates, identifying causal relations requires that V1 and V2 be used in the same frame as in the compound predicate and have similar frame element relationships. For instance, in the case of korogari-otiru (roll-fall), rolling is one of the causes of falling based on the expression korogaru to otiru (roll then fall) ‘roll and fall’, where the two verbs evoke the same Motion_directional frame as the compound predicate and have the same frame element relationship. Thus, korogari-otiru has a direct causal relationship (cause–result).[8]

Conversely, kakusi-motu ‘secretly hold’ cannot be classified as a direct causation type based on the expression kakusu tame ni motu (hide purp hold) in (9). This is because the objects of V1 and V2 differ in (9) (V1 kakusu pertains to the belly, V2 motu to the bag), unlike in the compound kakusi-motu, where V1 and V2 must have the same object. From a frame-semantic perspective, both kakusu tame ni motu and kakusi-motu are used in the Hiding_objects frame, but their frame element Hidden_object differs in its interpretation in V2 motu ‘hold’. The assessment of causality in this study was based on such detailed evaluations.

(9)
ookina onaka o kakusu tame ni mot-te-i-ta baggu
big belly obj hide purp hold-te-asp-pst bag
‘the bag I had to hide my big belly’

Other cases that are not considered to be related events are also excluded at the introspection stage. These include annotation errors in the corpus, instances where the verb under investigation is in passive form or part of a compound verb, and incomprehensible translated Japanese. While a wide variety of related events must be excluded, the criteria for exclusion are simple: V1 and V2 must be used in the same frame as in the complex predicate and have similar frame element relationships.

Direct causality is relatively easy to determine manually, as a single search can reveal a possible causal relationship between V1 and V2 (although it is necessary to check the example sentences to actually identify a direct causal relationship). However, due to the vast amount of data, it is challenging to manually check for shared causality. For instance, in tobi-mawaru (fly-go.around), there are already 805 instances of purpose for tobu and 969 for mawaru, let alone instances of cause.

To reduce effort and make this approach feasible, I developed a Python script called “Related Event Checker” (REC). This script uses data on related events to help examine potential causal links between two input verbs, V1 and V2 (see Figure 4). Depending on the types of V1 and V2, REC shows the possible semantic relationships to the user. In cases where the user seeks to explore a direct causal link, REC provides example sentences that depict this relation, enabling the user to verify it. Similarly, if searching for a shared causal relationship, the user can specify a common purpose or cause to access relevant example sentences that help confirm the shared causal relationship between V1 and V2. REC, along with the associated dataset of related events, is available for download (see Data availability statement), providing users with a convenient means to investigate the causal relationships between selected verbs.

Figure 4: 
The process flowchart of REC.

REC is especially important for confirming the absence of a causal relationship between V1 and V2. In such cases, each instance of events related to V1 and V2 needs to be checked, a process that is highly time-consuming and impractical to do manually. Therefore, using a finite but significant amount of data, this study developed a semi-automatic method that makes it practically possible to show that a given relationship between two verbs is unlikely to exist (i.e., approximation of “evidence of absence”).

3 Results

3.1 Results for compound verbs [V1-V2]V

Figure 5 and Table 4 present the results for causal relationships between V1s and V2s in coextensive [V1-V2]V. The data indicate that the majority of coextensive [V1-V2]V (85.98 %) are causally related. For detailed results, see the list of Japanese complex predicates (available for download). The list includes the target complex predicates, their forms, the verbs constituting them, the token frequency of each complex predicate, their causal types, and examples from the corpus that illustrate their causal relationships.

Figure 5: 
Pie chart of the causal relationships in coextensive [V1-V2]V (numbers represent percentages of type frequency).

Table 4:

Types of coextensive [V1-V2]V.

| Type | Definition | Examples |
| --- | --- | --- |
| Cause type (9 words[a]) | V1 is the cause of V2 | korogari-otiru (roll-fall), koroge-otiru (roll.over-fall), manabi-sodatu (learn-grow), obie-hurueru (be.frightened-tremble), etc. |
| Means type (30 words) | V1 is the means of V2 | nozoki-miru (peek-see), toori-nukeru (go.through-pass), tobi-uturu (fly-move), kogi-susumu (row-advance), etc. |
| Purpose type (21 words) | V1 is the purpose of V2 | uri-aruku (sell-walk), hiroi-aruku (pick.up-walk), uri-mawaru (sell-go.around), sagasi-mawaru (search-go.around), etc. |
| Common cause type (51 words) | V1 and V2 have a common cause | mai-tiru (dance-disperse), naki-kanasimu (cry-grieve), naki-sakebu (cry-shout), odoroki-awateru (be.surprised-panic), etc. |
| Common purpose type (73 words) | V1 and V2 have a common purpose | ii-arasou (say-fight), sakebi-mawaru (shout-go.around), utai-odoru (sing-dance), yaki-ageru (burn-fry), etc. |
| Non-causal type (30 words) | V1 and V2 have no causal relationship | iki-wakareru (live-part), naki-kurasu (cry-live), asobi-kurasu (play-live), nagame-kurasu (view-live), nageki-kurasu (sigh-live), mati-kurasu (wait-live), etc. |
| Total | | 214 words |

[a] The number of words represents the type frequency of verb constructions used in this sense.

The first three types involve a direct causal relationship. The first type of coextensive [V1-V2]V is the cause type (e.g., korogari-otiru (roll-fall) ‘roll and fall’), where V1 (korogaru ‘roll’) expresses the cause of V2 (otiru ‘fall’), as (10) shows.

(10)
korogari-otiru (V1: korogaru, V2: otiru)
korogaru to otiru
roll then fall
‘roll and fall’

The cause type is characterized by V1 (korogaru ‘roll’) causing the result represented by V2 (otiru ‘fall’), with the event of V1 continuing after the occurrence of V2, resulting in both events happening simultaneously.

In the means type, V1 is a means to V2 (11a), whereas in the purpose type, V1 is the purpose of V2 (11b).

(11)
a.
kogi-susumu (V1: kogu, V2: susumu)
susumu tame ni kogu
advance purp row
‘row to advance’
b.
uri-aruku (V1: uru, V2: aruku)
uru tame ni aruku
sell purp walk
‘walk to sell’

In the means type, exemplified by kogi-susumu (row-advance) ‘row forward’, V2 represents a change in state or location, and V1 represents the means to achieve that change. V1 is not an instantaneous action but a continuous action; V1 and V2 occur simultaneously.

In the purpose type, an unergative intransitive motion verb (V2) is compounded with a verb representing both a purpose and an accompanying activity (V1), as in uri-aruku (sell-walk) ‘peddle’.

The fourth type is the common cause type, where V1 and V2 are linked by a shared cause. The fifth type is the common purpose type, characterized by V1 and V2 having a shared purpose. These two types belong to shared causality.

In the common cause type (12), the events represented by V1 and V2 occur simultaneously due to a common cause. As in nayami-kurusimu (be.troubled-suffer) ‘worry and suffer’, V1 and V2 tend to express events that are semantically close.

(12)
nayami-kurusimu (V1: nayamu, V2: kurusimu)
renai ni {nayamu/kurusimu}
love dat {be.troubled/suffer}
‘be troubled by love / suffer from love’

In the common purpose type (13), the events represented by V1 and V2 are likely to occur coextensively to achieve the goal efficiently. As in hasiri-mawaru (run-go.around) ‘run around’, V1 and V2 are agentive verbs.

(13)
hasiri-mawaru (V1: hasiru, V2: mawaru)
siraberu tame ni {hasiru/mawaru}
check purp {run/go.around}
‘run to check / go around to check’

Coextensive [V1-V2]V that do not express direct or shared causal relations are limited (30 of 214 words).
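As a quick arithmetic check, the shares of each group can be recomputed from the type counts in Table 4. The snippet below is only a worked calculation; the variable names are illustrative.

```python
# Type counts of coextensive [V1-V2]V, copied from Table 4.
counts = {
    "cause": 9,
    "means": 30,
    "purpose": 21,
    "common cause": 51,
    "common purpose": 73,
    "non-causal": 30,
}

total = sum(counts.values())
direct = counts["cause"] + counts["means"] + counts["purpose"]
shared = counts["common cause"] + counts["common purpose"]
causal = direct + shared

def share(n):
    """Percentage of the total, rounded to two decimals."""
    return round(100 * n / total, 2)

print(total, causal)              # 214 184
print(share(causal))              # 85.98
print(share(direct), share(shared), share(counts["non-causal"]))
# 28.04 57.94 14.02
```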

3.2 Results of deverbal complex predicates [[V1-V2]N suru]V

Figure 6 and Table 5 present the results for coextensive [[V1-V2]N suru]V. While 14 of them (48.28 %) involve a causal relationship, 15 (51.72 %), such as tati-gui suru (stand-eat do) and moti-nige suru (hold-escape do), have no direct or shared causal relationship (see the list of Japanese complex predicates).

Figure 6: 
Pie chart of the causal relationships in coextensive [[V1-V2]N suru]V.

Table 5:

Types of coextensive [[V1-V2]N suru]V.

| Type | Definition | Examples |
| --- | --- | --- |
| Means type (2 words) | V1 is the means of V2 | nozoki-mi suru (peek-see do), tati-aruki suru (stand-walk do) |
| Common cause type (3 words) | V1 and V2 have a common cause | tere-warai suru (be.embarrassed-laugh do), kogoe-zini suru (freeze-die do), naki-neiri suru (cry-fall.asleep do) |
| Common purpose type (9 words) | V1 and V2 have a common purpose | tati-yomi suru (stand-read do), huri-arai suru (shake-wash do), ii-arasoi suru (say-quarrel do), moti-kaeri suru (hold-return do), etc. |
| Non-causal type (15 words) | V1 and V2 have no causal relationship | daki-ne suru (hold-sleep do), tati-gui suru (stand-eat do), moti-nige suru (hold-escape do), mawasi-nomi suru (turn-drink do), kakusi-dori suru (hide-shoot do), etc. |
| Total | | 29 words |

The common cause and common purpose types of [[V1-V2]N suru]V are respectively shown in (14) and (15).

(14)
tere-warai suru (V1: tereru, V2: warau)
komento ni {tereru/warau}
comment dat {be.embarrassed/laugh}
‘be embarrassed by the comment / laugh at the comment’
(15)
tati-yomi suru (V1: tatu, V2: yomu)
kakunin suru tame ni {tatu/yomu}
check do purp {stand/read}
‘stand to check / read something to check’

In the non-causal type, there is no causal relationship between V1 and V2. For instance, in daki-ne suru (hold-sleep do) ‘sleep while holding something’, daku ‘hold’ is not the purpose of neru ‘sleep’, nor vice versa. Although daku is identified as one of the causes of neru (daku to neru) in jaTenTen11, examining the example sentences reveals that its meaning is ‘fall asleep when held in someone’s arms’, with different subjects for daku (a parent holds) and neru (a baby sleeps). This differs from daki-ne suru ‘sleep while holding something’, where the subjects of daku and neru are the same.

Regarding shared causality, although REC shows 72 purposes of daku and neru that are identical, thorough examination shows that there is no common purpose that can be used in the same sense as daki-ne suru (the common cause type is not examined because daku is an agentive verb). All instances classified as non-causal type were confirmed through exhaustive examination.

3.3 Results on the coding asymmetry

Building on the classifications in Sections 3.1 and 3.2, we can divide [V1-V2]V and [[V1-V2]N suru]V into two conceptual types: causality plus coextensiveness (coextensive complex predicates with direct or shared causality) versus coextensiveness only (non-causal coextensive complex predicates). This distinction is crucial when considering the coding asymmetry.

Table 6 and Figure 7 show the results for the type frequencies of [V1-V2]V and [[V1-V2]N suru]V. Haspelmath’s frequency hypothesis predicts that simpler forms have a higher frequency. This hypothesis holds if we do not distinguish between types of events: the simpler form [V1-V2]V is more frequent than [[V1-V2]N suru]V. However, when dividing events into two types based on the conceptual relationship between V1 and V2, we observe that events with higher conceptual accessibility (causality plus coextensiveness) are represented by the simpler form [V1-V2]V significantly more often than those with coextensiveness only (Fisher’s exact test: p < 0.001, odds ratio = 6.49758 (95 % CI = 2.6–16.2)). The frequency effect cannot explain this result.

Table 6:

Type frequency of [V1-V2]V and [[V1-V2]N suru]V (numbers in parentheses are expected frequencies).[a]

| Meaning | [V1-V2]V | [[V1-V2]N suru]V | Total |
| --- | --- | --- | --- |
| causality & coextensiveness | 184 (174.37) | 14 (23.63) | 198 |
| coextensiveness only | 30 (39.63) | 15 (5.37) | 45 |
| Total | 214 | 29 | 243 |

[a] As a reviewer points out, the bottom right cell shows the most statistically significant deviation from the expected number (residual = 4.16), indicating a stronger association with the data (the same is true for Table 7). Still, this is consistent with the claim of this study that conceptually more accessible events tend to be represented in a formally simpler way than less accessible events.
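The expected frequencies, the residual mentioned in the table note, and the exact test for Table 6 can be reproduced with the standard library alone. The sketch below is an illustration, not the analysis script used in the study. It assumes the usual two-sided convention of summing all tables that are no more probable than the observed one. It prints the simple cross-product odds ratio (about 6.57), which differs slightly from the reported 6.49758; the reported value is presumably a conditional maximum-likelihood estimate of the kind produced by routines such as R's `fisher.test`.

```python
from math import comb, sqrt

# Observed type frequencies from Table 6.
table = [[184, 14],   # causality & coextensiveness
         [30, 15]]    # coextensiveness only

row_tot = [sum(r) for r in table]
col_tot = [table[0][j] + table[1][j] for j in range(2)]
n = sum(row_tot)

# Expected frequency of each cell under independence.
expected = [[row_tot[i] * col_tot[j] / n for j in range(2)] for i in range(2)]

# Pearson residual of the bottom right cell.
residual = (table[1][1] - expected[1][1]) / sqrt(expected[1][1])

def pmf(a):
    """Hypergeometric probability of `a` in the top left cell, margins fixed."""
    return comb(col_tot[0], a) * comb(col_tot[1], row_tot[0] - a) / comb(n, row_tot[0])

lo = max(0, row_tot[0] - col_tot[1])
hi = min(row_tot[0], col_tot[0])
p_obs = pmf(table[0][0])
# Two-sided p-value: all tables at most as probable as the observed one.
p_value = sum(pmf(a) for a in range(lo, hi + 1) if pmf(a) <= p_obs * (1 + 1e-7))

# Sample (cross-product) odds ratio, not the conditional estimate.
sample_or = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])

print([[round(e, 2) for e in r] for r in expected])
print(round(residual, 2), round(sample_or, 2), p_value < 0.001)
```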

Figure 7: 
Bar chart illustrating the type frequency of [V1-V2]V and [[V1-V2]N suru]V.

The token frequencies for each type are shown in Table 7 and Figure 8. Similarly, events with higher conceptual accessibility (causality plus coextensiveness) are expressed in the simpler form [V1-V2]V significantly more often than events with only coextensiveness (Fisher’s exact test: p < 0.001, odds ratio = 7.883283 (95 % CI = 6.2–9.9)).

Table 7:

Token frequency of [V1-V2]V and [[V1-V2]N suru]V.

| Meaning | [V1-V2]V | [[V1-V2]N suru]V | Total |
| --- | --- | --- | --- |
| causality & coextensiveness | 14327 (14232.48) | 279 (373.52) | 14606 |
| coextensiveness only | 762 (856.52) | 117 (22.48) | 879 |
| Total | 15089 | 396 | 15485 |

Figure 8: Bar chart illustrating the token frequency of [V1-V2]V and [[V1-V2]N suru]V.

Next, I examined the token frequency for each verb. The results were divided into four groups based on levels of conceptual accessibility and formal complexity, excluding the top and bottom 10 % as outliers. The mean values for these groups are listed in Table 8. As shown in Figure 9, there is an interaction between conceptual accessibility and formal complexity.
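Excluding the top and bottom 10 % before averaging amounts to a trimmed mean. The text does not specify exactly how the cut-off was applied, so the sketch below makes an assumption: values are sorted and the same whole number of items is dropped from each end. The function name and the sample frequencies are hypothetical.

```python
def trimmed_mean(values, proportion=0.10):
    """Mean after dropping `proportion` of the items from each end.

    Assumption: the number of dropped items is rounded down to a whole number.
    """
    ordered = sorted(values)
    k = int(len(ordered) * proportion)       # items removed from each end
    kept = ordered[k : len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical token frequencies for one group of complex predicates.
freqs = [0, 2, 5, 8, 10, 10, 12, 20, 129, 166]

print(sum(freqs) / len(freqs))   # plain mean: 36.2
print(trimmed_mean(freqs))       # trimmed mean: 24.5
```

The gap between the two values shows how strongly a few very frequent items can pull an untrimmed group mean.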

Table 8:

The mean of token frequency of individual [V1-V2]V and [[V1-V2]N suru]V.

| Meaning | [V1-V2]V | [[V1-V2]N suru]V |
| --- | --- | --- |
| causality & coextensiveness | Group A mean = 30.98 | Group B mean = 16.33 |
| coextensiveness only | Group C mean = 14.96 | Group D mean = 6.23 |

Figure 9: Interaction plot of the individual mean of [V1-V2]V and [[V1-V2]N suru]V.

To confirm the effect of frequency, let us first compare the token frequency of each [V1-V2]V (groups A and C in Table 8) with the token frequency of each [[V1-V2]N suru]V (groups B and D in Table 8). Token frequencies were log-transformed, and complex predicates with zero frequency were excluded to approximate a normal distribution.[9] The results show that [V1-V2]V is significantly more frequent than [[V1-V2]N suru]V (Welch’s t-test: t = 2.922, df = 35.618, p = 0.006004, Cohen’s d = 0.4821557).[10] Additionally, the Point-Biserial Correlation Coefficient (useful for data consisting of a binary variable and a continuous variable) indicates a weak negative correlation (r_pb = −0.156) between form (0 for less complex [V1-V2]V, 1 for more complex [[V1-V2]N suru]V) and log-transformed frequency, which is statistically significant (p = 0.040). Figure 10 is a boxplot of the log-transformed token frequency of different forms, with colored dots and triangles representing individual tokens.
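The three statistics used in this comparison can be written out with the standard library. The sketch below is illustrative only: the frequencies are hypothetical, Cohen's d is computed with a pooled standard deviation (the text does not say which variant was used), and p-values are omitted because they require the t distribution.

```python
from math import log, sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def cohens_d(a, b):
    """Standardized mean difference with a pooled standard deviation."""
    pooled = sqrt(((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b))
                  / (len(a) + len(b) - 2))
    return (mean(a) - mean(b)) / pooled

def point_biserial(binary, values):
    """Pearson correlation between a 0/1 variable and a continuous one."""
    mb, mv = mean(binary), mean(values)
    cov = sum((x - mb) * (y - mv) for x, y in zip(binary, values))
    return cov / sqrt(sum((x - mb) ** 2 for x in binary)
                      * sum((y - mv) ** 2 for y in values))

# Hypothetical non-zero token frequencies, log-transformed as in the text.
simple_form = [log(f) for f in (129, 166, 10, 10, 45)]   # [V1-V2]V
complex_form = [log(f) for f in (12, 3, 7)]              # [[V1-V2]N suru]V

t, df = welch_t(simple_form, complex_form)
d = cohens_d(simple_form, complex_form)
form = [0] * len(simple_form) + [1] * len(complex_form)  # 0 = simpler form
r_pb = point_biserial(form, simple_form + complex_form)
print(round(t, 3), round(df, 3), round(d, 3), round(r_pb, 3))
```

With form coded 0 for the simpler construction, a negative r_pb means that the more complex form goes with lower log frequency, which is the direction reported above.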

Figure 10: 
Box plot of the token frequency of [V1-V2]V and [[V1-V2]N suru]V.

Next, let us compare the token frequencies of [V1-V2]V and [[V1-V2]N suru]V satisfying causality plus coextensiveness (Groups A and B in Table 8) with those satisfying only coextensiveness (Groups C and D in Table 8). As before, the token frequencies were log-transformed and zero-frequency words were excluded. The results show that instances in the causality plus coextensiveness group have a significantly higher token frequency than those in the coextensiveness-only group (Welch’s t-test: t = 2.768, df = 73.684, p = 0.007128, Cohen’s d = 0.4150071). The Point-Biserial Correlation Coefficient between the concept (1 for direct/shared causal relationship, 0 for non-causal) and frequency shows a weak but statistically significant positive correlation (r_pb = 0.163, p = 0.031). Figure 11 is a boxplot of the token frequency of different types.

Figure 11: 
Box plot of the token frequency of causality plus coextensiveness type and of coextensiveness type.

As described above, there is a correlation not only between formal complexity and frequency but also between conceptual type and frequency.

3.4 A causal analysis

To elucidate the ultimate cause of coding asymmetry, merely understanding correlations is insufficient. When X correlates with Y, various causal relationships might exist, such as direct causation from X to Y, reverse causation from Y to X, both being influenced by a common factor Z, or X impacting Y via an intermediary Z. Therefore, it is crucial to explore multiple potential causal links. This exploration is facilitated by a statistical technique known as “causal inference”, as outlined by Imbens and Rubin (2015), Pearl (2009), and Spirtes et al. (2000), among others. Causal inference involves statistically estimating the effects of events to identify causal connections from observational/experimental data. It begins by hypothesizing causal relationships and constructing causal models to assess their effects. “Causal discovery”, as discussed by Glymour et al. (2019), refers to the statistical estimation of causal models when the nature of causal relationships is unknown. Increasingly, linguistic studies are adopting causal analysis methods, as evidenced by the works of Baayen et al. (2016), Dellert (2019, 2023), Levshina (2021), and Roberts et al. (2020). This study seeks to contribute to this growing field.

Adopting causal discovery, this study investigates the statistical causal relationships between three variables: conceptual type, formal complexity, and frequency of coextensive complex predicates. Formal complexity and conceptual type are qualitative variables, represented as complex predicate types and causal types, respectively, while frequency is quantitative, based on token frequency. Table 9 displays the first five instances of these variables, where conceptual type and formal complexity are shown alongside frequency data. The order is based on the list of Japanese complex predicates, which means these are the data of korogari-otiru (roll-fall), koroge-otiru (roll.over-fall), manabi-sodatu (learn-grow), nobi-hirogaru (stretch-spread), and nure-hikaru (get.wet-shine).

Table 9:

A fragment of the data for causal discovery.

Conceptual type Formal complexity Frequency
1 0 129
1 0 166
1 0 0
1 0 10
1 0 10

This study applies the Fast Causal Inference (FCI) algorithm (Spirtes et al. 2000) using the causal-learn package in Python (Zheng et al. 2024) to create a Partial Ancestral Graph (PAG) that identifies potential causal relationships. FCI, which uses a set of conditional independence tests to determine the presence and orientation of edges, is particularly useful in situations where there may be hidden confounding. I used the kernel-based conditional independence test (KCIT, see Zhang et al. 2012), which is flexible in handling complex, nonparametric data distributions. To keep the number of retained candidate links low, the tests were conducted with a relatively stringent significance level of 0.01.

The FCI algorithm operates in a structured manner. Initially, it constructs an undirected complete graph, often referred to as the “skeleton”. This skeleton is essentially a network where all variables are interconnected through undirected edges, denoted as X ◦−◦ Y. Once the skeleton is established, the algorithm proceeds to refine this network. It uses conditional independence tests to identify “v-structures” – a specific configuration in a graph indicative of a causal relationship. A v-structure is typically identified as a collider pattern, symbolized as X → Z ← Y, indicating that both X and Y causally influence Z. Concurrently, edges that are deemed superfluous (where conditional independence is established) are removed. The final phase involves further orientation of the v-structures. This is guided by orientation rules in Zhang (2008). Any indeterminate edges are reevaluated and oriented where possible, or removed if they cannot be conclusively determined. The result is a directed graph representing inferred causal relationships, with the orientations of edges providing insights into the directionality of these relationships.
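The edge-removal step can be illustrated with a small standard-library sketch. It is not the FCI implementation in causal-learn and not the analysis reported here: it swaps the kernel-based test for a simpler Fisher-z test on partial correlations, handles only conditioning sets of size zero or one (enough for three variables), and performs no orientation. The variable names and the balanced toy data, in which Z is a common cause of X and Y, are assumptions for illustration.

```python
from itertools import combinations, product
from math import atanh, erfc, sqrt

def corr(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_corr(data, i, j, cond):
    """Correlation of i and j, given at most one conditioning variable."""
    r_ij = corr(data[i], data[j])
    if not cond:
        return r_ij
    (k,) = cond
    r_ik, r_jk = corr(data[i], data[k]), corr(data[j], data[k])
    return (r_ij - r_ik * r_jk) / sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))

def independent(data, i, j, cond, alpha=0.01):
    """Fisher-z test: True if independence is NOT rejected at level alpha."""
    n = len(data[i])
    r = max(-0.999999, min(0.999999, partial_corr(data, i, j, cond)))
    z = abs(atanh(r)) * sqrt(n - len(cond) - 3)
    p = erfc(z / sqrt(2))  # two-sided p-value under the standard normal
    return p > alpha

def skeleton(data, alpha=0.01):
    """Start from the complete graph and drop edges judged independent."""
    names = sorted(data)
    edges = {frozenset(pair) for pair in combinations(names, 2)}
    for i, j in combinations(names, 2):
        others = [k for k in names if k not in (i, j)]
        for cond in [()] + [(k,) for k in others]:
            if independent(data, i, j, cond, alpha):
                edges.discard(frozenset((i, j)))
                break
    return edges

# Balanced toy design (n = 200): Z influences both X and Y.
rows = list(product((-1, 1), repeat=3)) * 25   # tuples (z, e1, e2)
data = {
    "Z": [z for z, _, _ in rows],
    "X": [z + e1 for z, e1, _ in rows],
    "Y": [z + e2 for z, _, e2 in rows],
}
edges = skeleton(data)
print(sorted(sorted(e) for e in edges))  # X-Z and Y-Z remain; X-Y is removed
```

In this toy case X and Y are correlated overall, but the correlation vanishes once Z is conditioned on, so the X–Y edge is dropped while the two edges involving Z survive.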

In a PAG, edges between nodes represent causal relationships. An arrowed edge indicates directional relationships from the tail variable to the head variable (e.g., X → Y). A bidirectional edge suggests a common cause between the two variables (e.g., X ↔ Y). An undirected edge represents the absence of a causal effect (e.g., X – Y). Additionally, an edge with one or two circles represents uncertainty (e.g., X ◦−◦ Y), indicating that the relationship could be either an arrowhead or a tail.

The generated PAG (Figure 12) illustrates various potential relationships between conceptual type and formal complexity. These include direct causal links (Concept → Form or Concept ← Form), a non-causal link (Concept – Form), or a common cause (Concept ↔ Form). Importantly, it also indicates frequency is not the direct cause of conceptual type or formal complexity (a similar figure was obtained when the token frequencies were log-transformed and words with zero frequency were excluded). Section 4.3 will explore specific causal directions using additional criteria.

Figure 12: 
The PAG of conceptual type (X1), formal complexity (X2), and frequency (X3).

4 Discussion

4.1 The constraint on event integration

Considering the first research question, which conditions enable event integration, the findings of this study suggest that causality is a crucial factor in encoding event integration. In [V1-V2]V, where two verbs are combined directly, two events cannot be integrated as complex events simply by occurring simultaneously; direct or shared causality is required in addition to coextensiveness in many cases. Conversely, in the more complex form [[V1-V2]N suru]V, many predicates can be established without causality.

There are a small number of [V1-V2]V for which there is no causal relationship. At first blush, these examples appear to conflict with the principle of negative correlation between accessibility and costs. However, a closer examination of these non-causal coextensive [V1-V2]V reveals a distinctive type in which V1 or V2 represents the default activity of an organism, such as ikiru ‘live’, kurasu ‘live’, or mureru ‘swarm’. Examples include iki-wakareru (live-part) ‘get separated while alive’, nagame-kurasu (view-live) ‘spend one’s days gazing’, and mure-tobu (swarm-fly) ‘fly in flocks’ (11 words in total). The co-occurrence of such a default activity with the action represented by the other verb is not accidental but inevitable. This inevitable co-occurrence of the two events allows them to be stored as a single unit, making them more accessible.

As demonstrated above, except for the default activity type, compound verbs basically cannot exist solely based on coextensiveness; they require a causal relationship. The present study supports the claim of Chen and Matsumoto (2018) that the two conditions, causality and coextensiveness, are not independent but are related, as follows:

(16)
The inevitable co-occurrence constraint:
When two events (E1 and E2) have a direct or shared causal relationship and co-occur in close temporal proximity, E1 and E2 can be integrated as a single complex event and established as a lexical compound verb in Japanese.

While this constraint for Japanese lexical compound verbs was based on introspection, this corpus-based study substantiates it. This constraint can account for instances where coextensiveness is satisfied but a compound verb cannot be established, as in *warai-aruku (laugh-walk). It also explains why events with shared causal relationships can be established as compound verbs, as in ikari-kanasimu (get.angry-grieve) and aruki-mawaru (walk-go.around).

The constraint described above is a necessary but not a sufficient condition. Thus, for instance, there are no compound verbs such as *kiki-hasiru (listen-run) even if the two verbs have a common cause or purpose, as shown in (17).

(17)
tanosimu tame ni {kiku/hasiru}
enjoy purp {listen/run}
‘listen for fun / run for fun’

4.2 Differences in event integration in different forms

Regarding how various linguistic forms differ in event integration, the results show that morphologically simpler [V1-V2]V compound verbs often express more accessible concepts (causality plus coextensiveness) due to efficiency. Conversely, morphologically more complex [[V1-V2]N suru]V deverbal complex predicates tend to express less accessible concepts (coextensiveness only).

Both [V1-V2]V and [[V1-V2]N suru]V are combinations of multiple elements, but is there a difference in the encoding of event integration between single and multiple elements? Previous studies have suggested constraints on event integration concerning single verbs. For instance, Croft (1991: 160) argues that “individual lexical items appear to denote only causally linked events”. The causal order hypothesis in Croft (2012) further elaborates this claim.

(18)
Causal Order Hypothesis:

A simple verb in an argument structure construction construes the relationships among participants in the event it denotes as forming a directed, acyclic, and nonbranching causal chain. (Croft 2012: 221)

Since this claim is for simple verbs, it is unclear whether compound verbs also fall under the scope of this hypothesis. This study shows that causality is also essential for compound verbs. The question arises as to whether temporal closeness is necessary, which is not included in Croft’s constraints. Consider the example of the Japanese compound verb tataki-kowasu (hit-break). If Taro hits a laptop and breaks it on the spot, it can be said without any problem that “Taro ga pasokon o tataki-kowasi-ta” (Taro nom laptop acc hit-break-pst) ‘Taro hit and broke a laptop’. However, if Taro hits a laptop, and it appears fine initially but breaks a month later, it is difficult to say “Taro ga pasokon o tataki-kowasi-ta”. Instead, we use indirect expressions such as (19) to denote temporally distant causal relationships. Thus, temporal closeness appears to be an essential element in event integration in Japanese compound verbs.

(19)
Taro ga pasokon o tatai-ta sei de pasokon ga koware-ta
Taro nom laptop acc hit-pst cause by laptop nom be.broken-pst
‘The laptop broke because Taro hit it.’

Another constraint for single (simple) verbs is the conventional frame constraint proposed by Goldberg (2010).

(20)
Conventional Frame constraint:

For a situation to be labeled by a verb, the situation or experience may be hypothetical or historical and need not be directly experienced, but it is necessary that the situation or experience evoke a cultural unit that is familiar and relevant to those who use the word. (Goldberg 2010: 50)

Contrary to the results of the present study, Goldberg (2010) argued that a verb can designate subevents that are not causally related if the represented events can constitute a coherent semantic frame (what the verb means or evokes). For instance, in verbs such as return, since returning from a place presupposes that the place is one you have traveled to before, the subsequent return is not caused by the previous move (Goldberg 2010: 40).

Goldberg argues that many single verbs expressing complex events can be established without causality. However, this fact seems to be limited to single verbs. The results of this study show that when two verbs are combined as a compound verb to express a complex event, inevitable co-occurrence based on causality is a crucial factor. This difference arises because the two cases answer different questions. The case of a single verb concerns what dynamic concept can be labeled as a word (verb). The case of a complex predicate concerns the conditions under which two dynamic events, each already expressed by an existing unit of material for synthesis (e.g., the single verbs korogaru ‘roll’ and otiru ‘fall’ that constitute the compound verb korogari-otiru), can be integrated into a complex event.

Of course, as mentioned earlier, some compound verbs are not causally related, but these are limited (14.02 %).[11] Therefore, we should assume that the overall tendency of compound verbs is to require inevitable co-occurrence based on causality. In a cognitive linguistic view, as in prototype theory (Geeraerts 1989; Rosch 1973; Taylor 1989), although counterexamples must undoubtedly be considered, more emphasis should be placed on the overall trend than on a universal statement (see also Stefanowitsch 2020: 68–76).

4.3 The cause of coding asymmetry in Japanese complex predicates

Regarding the third research question, “What drives the coding asymmetry?” the result shown in Figure 12 contradicts the prediction based on frequency. While frequency correlates with both conceptual type and formal complexity, these correlations do not imply causation. What then is the ultimate cause of coding asymmetry?

Let us further analyze the results using background knowledge in terms of motivational directionality (for the importance of incorporating background knowledge into causal analysis, see Spirtes et al. 2000: 93). First, regarding the motivational directionality between conceptual type and frequency, it is evident that increases in frequency do not alter conceptual type because conceptual type is inherently based on the nature of the concepts represented by V1 and V2. However, it is reasonable to assume that certain types of concepts are used more frequently because they are more accessible or familiar to us (Concept → Accessibility → Frequency). For instance, Schwartz et al.’s (2013) large-scale language and personality survey of Facebook users found that language variations are significantly influenced by personality types, genders, and ages (e.g., extroverts tended to use social-oriented words such as party and love you, whereas introverts frequently used words associated with solitary pursuits, such as computer and reading).

Next, to explore the motivational directionality between frequency and formal complexity, this study further examined 98 pairs of V1 and V2 capable of forming both [V1-V2]V and [[V1-V2]N suru]V, such as moti-hakobu (hold-carry) and moti-hakobi suru (hold-carry do), based on Suzuki (2018). Data were collected on (a) the sum of the respective token frequencies of V1 and V2 in the BCCWJ and (b) the proportion of the shorter construction ([V1-V2]V) for these two verbs with this meaning (see the downloadable list of pairwise complex predicates). The frequency-based explanation would predict that these two variables are positively correlated. However, Spearman’s rank correlation coefficient, which is suitable for nonparametric data, indicated a very weak and not statistically significant negative correlation between these two variables (ρ = −0.0612, p = 0.549). An FCI analysis with the settings described in Section 3.4 also revealed no direct causal relationship between form and frequency. This finding is consistent with the results presented in Section 3.4.
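The rank correlation used in this comparison can be sketched as follows. This is a minimal illustration in plain Python, not the script used in the study: the function names are hypothetical, and the six data points are invented toy values rather than the 98 pairs from the downloadable list, so the output does not reproduce the reported ρ = −0.0612. Tied values receive the mean of their rank positions, and ρ is computed as Pearson’s correlation on the ranks.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy values (NOT the study's data):
# (a) summed token frequency of V1 and V2, (b) share of the shorter [V1-V2]V form.
summed_frequency = [120, 4500, 890, 15000, 60, 2300]
share_of_short_form = [0.90, 0.40, 0.75, 0.55, 0.95, 0.60]

rho = spearman_rho(summed_frequency, share_of_short_form)
print(f"Spearman's rho = {rho:.4f}")
```

Under the frequency-based explanation, ρ for the real data would be expected to be clearly positive; the significance test (the p-value) is omitted from this sketch.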

Considering indirect causal relationships, previous research has shown that high frequency motivates accessibility (e.g., word frequency effect, see Brysbaert et al. 2018). This frequency-driven accessibility, supported by coding efficiency, can lead to a negative correlation with formal complexity.

Putting these observations together with the results obtained in Section 3, the interrelationship diagram shown in Figure 13 illustrates a potential scenario.

Figure 13: The interrelationship between conceptual type, formal complexity, and frequency.

In this scenario, accessibility correlates with formal complexity due to coding efficiency. Other correlations exhibit more complex relationships. The weak correlation between concept and frequency is caused by conceptual type via concept-driven accessibility. The weak correlation between formal complexity and frequency can be established through two distinct mechanisms. First, it can arise from frequency, through frequency-driven accessibility and coding efficiency. Second, it occurs when the conceptual type motivates concept-driven accessibility, which in turn motivates frequency and correlates with form via coding efficiency.

Importantly, the coding asymmetry, i.e., the correlation between concept and form, is caused by the conceptual type through concept-driven accessibility and coding efficiency, not by frequency. Although frequency-driven accessibility may result in more concise linguistic forms, it is not the underlying cause of coding asymmetry, as it cannot influence the conceptual type. Thus, when examining correlations, motivational directionality must be considered to avoid mistaking mere correlations for causal relationships.
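The directionality argument can be made concrete with a small directed graph. The sketch below is an illustration only: the node names and the exact set of edges are assumptions read off the verbal description of the scenario, not a reproduction of the figure, and the graph encodes direction of motivation without the sign or strength of each relation.

```python
# Assumed edges, following the verbal description of the scenario:
# concept -> concept-driven accessibility -> {frequency, form};
# frequency -> frequency-driven accessibility -> form.
EDGES = {
    "conceptual_type": ["concept_driven_accessibility"],
    "concept_driven_accessibility": ["frequency", "formal_complexity"],
    "frequency": ["frequency_driven_accessibility"],
    "frequency_driven_accessibility": ["formal_complexity"],
    "formal_complexity": [],
}


def directed_paths(graph, source, target, path=None):
    """Enumerate all directed paths from source to target (depth-first)."""
    path = (path or []) + [source]
    if source == target:
        return [path]
    found = []
    for successor in graph.get(source, []):
        if successor not in path:  # guard against cycles
            found.extend(directed_paths(graph, successor, target, path))
    return found


# Paths by which conceptual type can shape form.
concept_to_form = directed_paths(EDGES, "conceptual_type", "formal_complexity")
# Paths on which frequency plays no role at all.
paths_avoiding_frequency = [p for p in concept_to_form if "frequency" not in p]
# Paths by which frequency could alter conceptual type.
frequency_to_concept = directed_paths(EDGES, "frequency", "conceptual_type")

for p in concept_to_form:
    print(" -> ".join(p))
print("frequency can reach conceptual type:", bool(frequency_to_concept))
```

In this assumed graph, conceptual type reaches formal complexity along a path that bypasses frequency, whereas no directed path leads from frequency back to conceptual type. That is the structural reason the concept–form correlation cannot be attributed to frequency.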

This study shows that form-meaning correlations can emerge independently of iconicity, yet this does not rule out their emergence through iconicity. Coding efficiency and iconicity may operate separately (see Chen 2020, 2023 for examples of iconic linguistic phenomena not accounted for by coding efficiency). Iconicity is also an effective communication method: using iconic structures can be efficient if they are easier to produce and process (Slonimska et al. 2020). Unlike the frequency hypothesis, both iconicity and coding efficiency share the idea that meaning shapes form, indicating that the connection between form and meaning is not arbitrary.

5 Conclusions

Just as a black hole, which cannot be seen directly, is silhouetted against the surrounding glow of the event horizon (The Event Horizon Telescope Collaboration et al. 2019), a word’s meaning can also be anchored indirectly to its related events. By clarifying the relationships between events rather than the events represented by the verbs, this study showed that the condition for event integration of compound verbs is an inevitable co-occurrence based on causality. Direct or shared causality leads to a non-accidental, inevitable co-occurrence of two events, which often undergo event integration in language. Additionally, the study showed that different forms vary in event integration due to efficiency. Conceptually more accessible events tend to be represented as formally simpler than less accessible events. It also suggested that single and compound verbs may have different event integration constraints.

This study also sheds light on the complex interaction between conceptual type, formal complexity, and frequency by considering motivational directionality. Using the universal concepts of causality and coextensiveness, which can be tested objectively by the related events of the verb, this study supports the principle of negative correlation between accessibility and costs. For complex predicates in Japanese, there is a negative correlation between conceptual accessibility and formal complexity (costs), motivated by efficiency. I also demonstrated that frequency is not the cause of conceptual type or formal complexity, as evidenced by the causal analysis.

Nonetheless, some limitations should be noted. First, although [[V1-V2]N suru]V can be formed based on coextensiveness, some combinations of V1 and V2 satisfy the condition of coextensiveness yet cannot form the construction (e.g., *aruki-yomi suru (walk-read do) intended: ‘to read while walking’). Therefore, it is necessary to investigate the other factors enabling the event integration of [[V1-V2]N suru]V. Second, future studies should consider typological differences among languages to explore event integration differences between forms. For instance, Chinese resultative compound verbs (Li 1990) can express a causative event with an intransitive V2, as in dǎ-huài (hit-be.broken). Such compounds can denote a causative event without the agent’s intention, such as xǐ-pò (wash-be.torn) ‘wash something and it gets torn as a result’, where the result is unintentional. By contrast, Japanese compound verbs cannot denote an unintentional and unforeseeable causative event (*arai-yaburu (wash-tear) intended: ‘wash something and make it torn’) because V2 has to be a transitive verb to agree with the subject of V1 (i.e., the principle of subject-sharing, see Matsumoto 1998). This suggests that typological differences may cause discrepancies in event integration. Third, to confirm whether the principle of negative correlation between accessibility and costs is truly general, it is necessary to investigate, using the related-event approach, whether this principle is also found in other languages and linguistic phenomena. Nevertheless, the findings of this paper challenge Haspelmath’s claim that frequency can explain universal coding asymmetries, as this explanation does not hold for the Japanese complex predicates examined here.

The related-event approach proposed in this study, based on events related to linguistic concepts collected in an ultra-large-scale web corpus, ensures reproducibility and objectivity, covering examples difficult to discover through introspection. This approach offers a comprehensive understanding of related events, enabling linguistic analysis that cannot be achieved by conventional methods. Therefore, the related-event approach not only serves as a methodology but also establishes a new research field of linguistic analysis based on “high-resolution event relationships”.


Corresponding author: Yiting Chen, Tokyo University of Agriculture and Technology, Tokyo, Japan, E-mail:

Award Identifier / Grant number: 21K12979

Award Identifier / Grant number: 23H00629

Acknowledgments

Work related to this paper was presented at the 16th International Cognitive Linguistics Conference (ICLC 16) in Düsseldorf. I appreciate the insightful comments from the audience. I am grateful to the anonymous reviewers and the editors for their constructive suggestions on an earlier version of this paper. I would also like to thank Prof. Yo Matsumoto and Prof. Ryoko Uno for their informative feedback on this manuscript. This work was supported by JSPS KAKENHI Grant Numbers 24K03888 and 23H00629, as well as the NINJAL project “Evidence-based Theoretical and Typological Linguistics”.

  1. Data availability: The script used in this paper (Related Event Checker), the dataset of related events, the list of Japanese complex predicates (including examples of causal relation and token frequency), and the list of pairwise complex predicates are available via the Open Science Framework and can be retrieved from http://osf.io/7dwa4/.

References

Ariel, Mira. 1990. Accessing noun-phrase antecedents. London: Routledge.

Baayen, Harald R., Petar Milin & Michael Ramscar. 2016. Frequency in lexical processing. Aphasiology 30(11). 1174–1220. https://doi.org/10.1080/02687038.2016.1147767.

Barbey, Aron K. & Richard Patterson. 2011. Architecture of explanatory inference in the human prefrontal cortex. Frontiers in Psychology 2. 162. https://doi.org/10.3389/fpsyg.2011.00162.

Barsalou, Lawrence. 2003. Situated simulation in the human conceptual system. Language and Cognitive Processes 18(5-6). 513–562. https://doi.org/10.1080/769813547.

Boas, Hans C. 2003. A constructional approach to resultatives. Stanford: CSLI Publications.

Bohnemeyer, Jürgen, Nicholas J. Enfield, James Essegbey, Iraide Ibarretxe-Antuñano, Sotaro Kita, Friederike Lüpke & Felix K. Ameka. 2007. Principles of event segmentation in language: The case of motion events. Language 83(3). 495–532. https://doi.org/10.1353/lan.2007.0116.

Bohnemeyer, Jürgen & Eric Pederson. 2010. Event representation in language and cognition. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511782039.

Bolinger, Dwight. 1965. The atomization of meaning. Language 41(4). 555–573. https://doi.org/10.2307/411524.

Brezina, Vaclav. 2020. Classical monofactorial (parametric and non-parametric) tests. In Magali Paquot & Stefan Th. Gries (eds.), A practical handbook of corpus linguistics, 473–503. Cham: Springer. https://doi.org/10.1007/978-3-030-46216-1_20.

Brysbaert, Marc & Kevin Diependaele. 2013. Dealing with zero word frequencies: A review of the existing rules of thumb and a suggestion for an evidence-based choice. Behavior Research Methods 45(2). 422–430. https://doi.org/10.3758/s13428-012-0270-5.

Brysbaert, Marc, Paweł Mandera & Emmanuel Keuleers. 2018. The word frequency effect in word processing: An updated review. Current Directions in Psychological Science 27(1). 45–50. https://doi.org/10.1177/0963721417727521.

Chafe, Wallace. 1987. Cognitive constraints on information flow. In Russell Tomlin (ed.), Coherence and grounding in discourse, 21–51. Amsterdam: Benjamins. https://doi.org/10.1075/tsl.11.03cha.

Chen, Yiting. 2013. A frame-semantic approach to verb–verb compound verbs in Japanese: A case study of V-toru. In Matthew Faytak, Kelsey Neely, Matthew Goss, Erin Donnelly, Nicholas Baier, Jevon Heath & John Merrill (eds.), Proceedings of the Thirty-Ninth Annual Meeting of the Berkeley Linguistics Society, 16–30. Berkeley, CA: Berkeley Linguistics Society. https://doi.org/10.3765/bls.v39i1.3867.

Chen, Yiting. 2020. Macro-events in verb–verb compounds from the perspective of baseline and elaboration: Iconicity in typology and grammaticalization. Cognitive Semantics 6(1). 1–28. https://doi.org/10.1163/23526416-00601001.

Chen, Yiting. 2023. Baseline and elaboration in word formation. In Fuyin Thomas Li (ed.), Handbook of cognitive semantics, vol. 3, 54–94. Leiden: Brill.

Chen, Yiting & Yo Matsumoto. 2018. Goiteki-hukugoo-doosi no imi to taikei: Konsutorakusyon-keitairon to hureemu-imiron [The semantics and organization of Japanese lexical compound verbs: Construction Morphology and Frame Semantics]. Tokyo: Hituzi Syobo.

Clewett, David, Camille Gasser & Lila Davachi. 2020. Pupil-linked arousal signals track the temporal organization of events in memory. Nature Communications 11(1). 4007. https://doi.org/10.1038/s41467-020-17851-9.

Croft, William. 1991. Syntactic categories and grammatical relations: The cognitive organization of information. Chicago: University of Chicago Press.

Croft, William. 2002. Typology and universals, 2nd edn. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511840579.

Croft, William. 2008. On iconicity of distance. Cognitive Linguistics 19(1). 49–57. https://doi.org/10.1515/cog.2008.003.

Croft, William. 2012. Verbs: Aspect and causal structure. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199248582.001.0001.

Cutting, James E. 1981. Six tenets for event perception. Cognition 10(1-3). 71–78. https://doi.org/10.1016/0010-0277(81)90027-5.

Cutting, James E. 2014. Event segmentation and seven types of narrative discontinuity in popular movies. Acta Psychologica 149. 69–77. https://doi.org/10.1016/j.actpsy.2014.03.003.

Davidson, Donald. 1969. The individuation of events. In Nicholas Rescher (ed.), Essays in honor of Carl G. Hempel, 216–234. Dordrecht: Springer. https://doi.org/10.1007/978-94-017-1466-2_11.

Dellert, Johannes. 2019. Information-theoretic causal inference of lexical flow. Berlin: Language Science Press.

Dellert, Johannes. 2023. Causal inference of diachronic semantic maps from cross-linguistic synchronic polysemy data. Frontiers in Communication 8. 1288196. https://doi.org/10.3389/fcomm.2023.1288196.

Devylder, Simon. 2018. Diagrammatic iconicity explains asymmetries in Paamese possessive constructions. Cognitive Linguistics 29(2). 313–348. https://doi.org/10.1515/cog-2017-0058.

Du Bois, John W. 1985. Competing motivations. In John Haiman (ed.), Iconicity in syntax, 343–365. Amsterdam: John Benjamins. https://doi.org/10.1075/tsl.6.17dub.

Evans, Vyvyan. 2009. How words mean: Lexical concepts, cognitive models, and meaning construction. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199234660.001.0001.

Fausey, Caitlin M. & Lera Boroditsky. 2011. Who dunnit? Cross-linguistic differences in eye-witness memory. Psychonomic Bulletin & Review 18(1). 150–157. https://doi.org/10.3758/s13423-010-0021-5.

Fausey, Caitlin M., Bria L. Long, Aya Inamori & Lera Boroditsky. 2010. Constructing agency: The role of language. Frontiers in Psychology 1. 162. https://doi.org/10.3389/fpsyg.2010.00162.

Fillmore, Charles J. 1982. Frame semantics. In Linguistics Society of Korea (ed.), Linguistics in the Morning Calm, 111–137. Seoul: Hanshin.

Fillmore, Charles J. 1985. Frames and semantics of understanding. Quaderni di Semantica 6. 222–254.

Fillmore, Charles J. & Colin Baker. 2010. A frames approach to semantic analysis. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of linguistic analysis, 313–340. Oxford: Oxford University Press. https://doi.org/10.1093/oxfordhb/9780199544004.013.0013.

Fillmore, Charles J., Christopher R. Johnson & Miriam R. L. Petruck. 2003. Background to FrameNet. International Journal of Lexicography 16(3). 235–250. https://doi.org/10.1093/ijl/16.3.235.

Flecken, Monique, Christiane Von Stutterheim & Mary Carroll. 2014. Grammatical aspect influences motion event perception: Findings from a cross-linguistic non-verbal recognition task. Language and Cognition 6(1). 45–78. https://doi.org/10.1017/langcog.2013.2.

Gamerschlag, Thomas. 2002. Complex predicate formation and argument structure of Japanese VV compounds. Japanese/Korean Linguistics 10. 532–544.

Geeraerts, Dirk. 1989. Prospects and problems of prototype theory. Linguistics 27. 587–612. https://doi.org/10.1515/ling.1989.27.4.587.

Gibson, Edward, Richard Futrell, Steven Piantadosi, Isabelle Dautriche, Kyle Mahowald, Leon Bergen & Roger Levy. 2019. How efficiency shapes human language. Trends in Cognitive Sciences 23(5). 389–407. https://doi.org/10.1016/j.tics.2019.09.005.

Givón, Talmy. 1995. Markedness as meta-iconicity: Distributional and cognitive correlates of syntactic structure. In Talmy Givón (ed.), Functionalism and grammar, 25–69. Amsterdam: John Benjamins. https://doi.org/10.1075/z.74.03mar.

Givón, Talmy. 2001. Syntax: An introduction, vol. 2. Amsterdam: Benjamins. https://doi.org/10.1075/z.syn2.

Glymour, Clark, Kun Zhang & Peter Spirtes. 2019. Review of causal discovery methods based on graphical models. Frontiers in Genetics 10. 524. https://doi.org/10.3389/fgene.2019.00524.

Goldberg, Adele E. 2010. Verbs, constructions and semantic frames. In Malka Rappaport Hovav, Edit Doron & Ivy Sichel (eds.), Syntax, lexical semantics, and event structure, 39–58. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199544325.003.0003.

Grice, H. Paul. 1975. Logic and conversation. In Peter Cole & Jerry L. Morgan (eds.), Syntax and semantics, vol 3: Speech acts, 41–58. New York: Academic Press. https://doi.org/10.1163/9789004368811_003.

Haiman, John. 1980a. The iconicity of grammar: Isomorphism and motivation. Language 56(3). 515–540. https://doi.org/10.2307/414448.

Haiman, John. 1980b. Dictionaries and encyclopedias. Lingua 50(4). 329–357. https://doi.org/10.1016/0024-3841(80)90089-3.

Haiman, John. 1983. Iconic and economic motivation. Language 59(4). 781–819. https://doi.org/10.2307/413373.

Haiman, John. 1985. Natural syntax: Iconicity and erosion. Cambridge: Cambridge University Press.

Haiman, John. 2008. In defence of iconicity. Cognitive Linguistics 19(1). 35–48. https://doi.org/10.1515/cog.2008.002.

Haspelmath, Martin. 2008a. Frequency vs. iconicity in explaining grammatical asymmetries. Cognitive Linguistics 19(1). 1–33. https://doi.org/10.1515/cog.2008.001.

Haspelmath, Martin. 2008b. Reply to Haiman and Croft. Cognitive Linguistics 19(1). 59–66. https://doi.org/10.1515/cog.2008.004.

Haspelmath, Martin. 2016. The serial verb construction: Comparative concept and cross-linguistic generalizations. Language and Linguistics 17(3). 291–319. https://doi.org/10.1177/2397002215626895.

Haspelmath, Martin. 2021. Explaining grammatical coding asymmetries: Form–frequency correspondences and predictability. Journal of Linguistics 57(3). 605–633. https://doi.org/10.1017/s0022226720000535.

Hawkins, John. 2014. Cross-linguistic variation and efficiency. Oxford: Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199664993.001.0001.

Hitchcock, Christopher. 1998. The common cause principle in historical linguistics. Philosophy of Science 65(3). 425–447. https://doi.org/10.1086/392655.

Horn, Wilhelm. 1921. Sprachkörper und sprachfunktion. Berlin: Mayer & Müller.

Imbens, Guido W. & Donald B. Rubin. 2015. Causal inference for statistics, social, and biomedical sciences: An introduction. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139025751.

Jakobson, Roman. 1965. Quest for the essence of language. Diogenes 13(51). 21–37. https://doi.org/10.1177/039219216501305103.

Jakobson, Roman. 1977. A few remarks on Peirce, pathfinder in the science of language. MLN 92(5). 1026–1032. https://doi.org/10.2307/2906890.

Kageyama, Taro. 2009. Isolate: Japanese. In Rochelle Lieber & Pavol Stekauer (eds.), The Oxford handbook of compounding, 512–526. Oxford: Oxford University Press.

Kaufmann, Ingrid. 1995. What is an (im-)possible verb? Restrictions on semantic form and their consequences for argument structure. Folia Linguistica 29(1-2). 67–104. https://doi.org/10.1515/flin.1995.29.1-2.67.

Kaufmann, Ingrid & Dieter Wunderlich. 1998. Cross-linguistic patterns of resultatives. Ms: University of Düsseldorf.

Kurby, Christopher A. & Jeffrey M. Zacks. 2008. Segmentation in the perception and memory of events. Trends in Cognitive Sciences 12(2). 72–79. https://doi.org/10.1016/j.tics.2007.11.004.

Langacker, Ronald W. 1987. Foundations of cognitive grammar: Theoretical prerequisites. Stanford: Stanford University Press.

Lemmon, Edward J. 1967. Comments on D. Davidson’s “The logical form of action sentences”. In Nicholas Rescher (ed.), The logic of decision and action, 96–103. Pittsburgh: University of Pittsburgh Press.

Levin, Beth & Malka Rappaport Hovav. 2004. The semantic determinants of argument expression: A view from the English resultative construction. In Jacqueline Guéron & Jacqueline Lecarme (eds.), The syntax of time, 477–494. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/6598.003.0020.

Levinson, Stephen C. 2000. Presumptive meanings: The theory of generalized conversational implicature. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/5526.001.0001.

Levshina, Natalia. 2018. Towards a theory of communicative efficiency in human languages. Leipzig: Leipzig University habilitation thesis.

Levshina, Natalia. 2021. Cross-linguistic trade-offs and causal relationships between cues to grammatical subject and object, and the problem of efficiency-related explanations. Frontiers in Psychology 12. 648200. https://doi.org/10.3389/fpsyg.2021.648200.

Levshina, Natalia. 2022. Communicative efficiency: Language structure and use. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108887809.

Levshina, Natalia & Steven Moran. 2021. Efficiency in human languages: Corpus evidence for universal principles. Linguistics Vanguard 7(s3). 20200081. https://doi.org/10.1515/lingvan-2020-0081.

Li, Hui. 2019. Gendai-nihongo ni okeru hukugoo-doosi to “V1+V2” gata hukugoo-doomeisi tono imi-keisei no sai ni tuite: Seisansei o tegakari tosite [The difference in the semantics of formation between compound verbs and verbal compound nouns in Japanese]. Tokyo University Linguistic Papers 41. 181–203. https://doi.org/10.15083/00078586.

Li, Yafei. 1990. On V-V compounds in Chinese. Natural Language & Linguistic Theory 8(2). 177–207. https://doi.org/10.1007/bf00208523.

Lieber, Rochelle. 1992. Compounding in English. Rivista di linguistica 4(1). 79–96.

Matsumoto, Yo. 1996. Complex predicates in Japanese: A syntactic and semantic study of the notion ‘word’. Stanford, CA & Tokyo: Kurosio Publishers & CSLI.

Matsumoto, Yo. 1998. Nihongo no goiteki-hukugoo-dooshi ni okeru dooshi no kumiawase [Combinatory possibilities in Japanese V-V lexical compounds]. Gengo Kenkyu 114. 37–83. https://doi.org/10.11435/gengo1939.1998.114_37.

Minami, Fujio. 1974. Gendai nihongo no koozoo [The structure of modern Japanese]. Tokyo: Kurosio Publishers.

Myers, Jerome L., Makiko Shinjo & Susan A. Duffy. 1987. Degree of causal relatedness and memory. Journal of Memory and Language 26(4). 453–465. https://doi.org/10.1016/0749-596x(87)90101-x.

Pearl, Judea. 2009. Causality: Models, reasoning, and inference, 2nd edn. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511803161.

Peirce, Charles Sanders. 1974 [1931]. The icon, index, and symbol. In Charles Hartshorne & Paul Weiss (eds.), Collected papers of Charles Sanders Peirce, 156–173. Cambridge, MA: Harvard University Press.

Pustejovsky, James. 1995. The generative lexicon. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/3225.001.0001.

Radvansky, Gabriel A. & David E. Copeland. 2000. Functionality and spatial relations in memory and language. Memory & Cognition 28(6). 987–992. https://doi.org/10.3758/bf03209346.

Radvansky, Gabriel A. & Jeffrey M. Zacks. 2011. Event perception. Wiley Interdisciplinary Reviews: Cognitive Science 2(6). 608–620. https://doi.org/10.1002/wcs.133.

Radvansky, Gabriel A. & Jeffrey M. Zacks. 2017. Event boundaries in memory and cognition. Current Opinion in Behavioral Sciences 17. 133–140. https://doi.org/10.1016/j.cobeha.2017.08.006.

Reichenbach, Hans. 1956. The direction of time. Los Angeles: University of California Press.

Roberts, Seán G., Anton Killin, Angarika Deb, Catherine Sheard, Simon J. Greenhill, Kaius Sinnemäki, José Segovia-Martín, Jonas Nölle, Aleksandrs Berdicevskis, Archie Humphreys-Balkwill, Hannah Little, Christopher Opie, Guillaume Jacques, Lindell Bromham, Peeter Tinits, Robert M. Ross, Sean Lee, Emily Gasser, Jasmine Calladine, Matthew Spike, Stephen Francis Mann, Olena Shcherbakova, Ruth Singer, Shuya Zhang, Antonio Benítez-Burraco, Christian Kliesch, Ewan Thomas-Colquhoun, Hedvig Skirgård, Monica Tamariz, Sam Passmore, Thomas Pellard & Fiona Jordan. 2020. CHIELD: The causal hypotheses in evolutionary linguistics database. Journal of Language Evolution 5(2). 101–120. https://doi.org/10.1093/jole/lzaa001.

Rosch, Eleanor. 1973. Natural categories. Cognitive Psychology 4. 328–350. https://doi.org/10.1016/0010-0285(73)90017-0.

Ruppenhofer, Josef, Michael Ellsworth, Myriam Schwarzer-Petruck, Christopher R. Johnson, Collin F. Baker & Jan Scheffczyk. 2016. FrameNet II: Extended theory and practice. Berkeley, CA: International Computer Science Institute.

Schwartz, H. Andrew, Johannes C. Eichstaedt, Margaret L. Kern, Lukasz Dziurzynski, Stephanie M. Ramones, Megha Agrawal, Achal Shah, Michal Kosinski, David Stillwell, Martin E. P. Seligman & Lyle H. Ungar. 2013. Personality, gender, and age in the language of social media: The open-vocabulary approach. PLoS One 8(9). e73791. https://doi.org/10.1371/journal.pone.0073791.

Simmons, W. Kyle, Stephan B. Hamann, Carla L. Harenski, Xiaopin P. Hu & Lawrence W. Barsalou. 2008. fMRI evidence for word association and situated simulation in conceptual processing. Journal of Physiology Paris 102(1-3). 106–119. https://doi.org/10.1016/j.jphysparis.2008.03.014.

Slobin, Dan I., Iraide Ibarretxe-Antuñano, Anetta Kopecka & Asifa Majid. 2014. Manners of human gait: A crosslinguistic event-naming study. Cognitive Linguistics 25(4). 701–741. https://doi.org/10.1515/cog-2014-0061.

Slonimska, Anita, Asli Özyürek & Olga Capirci. 2020. The role of iconicity and simultaneity for efficient communication: The case of Italian Sign Language (LIS). Cognition 200. 104246. https://doi.org/10.1016/j.cognition.2020.104246.

Smith, Michael B. 2002. The polysemy of German es, iconicity, and the notion of conceptual distance. Cognitive Linguistics 13(1). 67–112. https://doi.org/10.1515/cogl.2002.011.

Spirtes, Peter, Clark Glymour & Richard Scheines. 2000. Causation, prediction, and search, 2nd edn. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/1754.001.0001.

Srdanović, Irena, Vit Suchomel, Toshinobu Ogiso & Adam Kilgarriff. 2013. Japanese language lexical and grammatical profiling using the web corpus jpTenTen. In Proceeding of the 3rd Japanese corpus linguistics workshop, 229–238.

Stefanowitsch, Anatol. 2020. Corpus linguistics: A guide to the methodology. Berlin: Language Science Press.

Suzuki, Tomomi. 2018. Sahen-doosi o keisei-suru V1+V2 gata hukugoo-meisi: Taiou-suru hukugoo-doosi no umu ni motoduku tigai no kanten kara [Which “verb1+verb2” type of compound noun can form suru-verb? Determining factor being whether it has a corresponding verb or not]. Journal for Japanese Studies 8. 37–49.

Talmy, Leonard. 2000. Toward a cognitive semantics. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/6847.001.0001.

Taylor, John R. 1989. Linguistic categorization: Prototypes in linguistic theory. Oxford: Clarendon Press.

The Event Horizon Telescope Collaboration. 2019. First M87 Event Horizon Telescope results. I. The shadow of the supermassive black hole. The Astrophysical Journal Letters 875(L1). 17. https://doi.org/10.3847/2041-8213/ab0ec7.

Van Langendonck, Willy. 1995. Categories of word order iconicity. In Marge E. Landsberg (ed.), Syntactic iconicity and linguistic freezes: The human dimension, 79–90. Berlin, New York: De Gruyter Mouton. https://doi.org/10.1515/9783110882926.79.

Van Langendonck, Willy. 2007. Iconicity. In Dirk Geeraerts & Hubert Cuyckens (eds.), The Oxford handbook of cognitive linguistics, 394–418. Oxford: Oxford University Press.

von Stutterheim, Christiane, Martin Andermann, Mary Carroll, Monique Flecken & Barbara Schmiedtová. 2012. How grammaticized concepts shape event conceptualization in language production: Insights from linguistic analysis, eye tracking data, and memory performance. Linguistics 50(4). 833–867. https://doi.org/10.1515/ling-2012-0026.

Washio, Ryuichi. 1997. Resultatives, compositionality and language variation. Journal of East Asian Linguistics 6. 1–49. https://doi.org/10.1023/A:1008257704110.

Waugh, Linda. 1992. Let’s take the con out of iconicity: Constraints on iconicity in the lexicon. American Journal of Semiotics 9(1). 7–47. https://doi.org/10.5840/ajs19929132.

Wunderlich, Dieter. 1997. Cause and the structure of verbs. Linguistic Inquiry 28(1). 27–68.

Yeh, Wenchi & Lawrence W. Barsalou. 2006. The situated nature of concepts. The American Journal of Psychology 119(3). 349–384. https://doi.org/10.2307/20445349.

Yumoto, Yoko. 2016. Conversion and deverbal compound nouns. In Taro Kageyama & Hideki Kishimoto (eds.), Handbook of Japanese lexicon and word formation, 311–346. Berlin, Boston: De Gruyter Mouton. https://doi.org/10.1515/9781614512097-013.

Zacks, Jeffrey M., Nicole K. Speer & Jeremy R. Reynolds. 2009. Segmentation in reading and film comprehension. Journal of Experimental Psychology: General 138(2). 307–327. https://doi.org/10.1037/a0015305.

Zacks, Jeffrey M., Nicole K. Speer, Khena M. Swallow, Todd S. Braver & Jeremy R. Reynolds. 2007. Event perception: A mind-brain perspective. Psychological Bulletin 133(2). 273–293. https://doi.org/10.1037/0033-2909.133.2.273.

Zacks, Jeffrey M. & Khena M. Swallow. 2007. Event segmentation. Current Directions in Psychological Science 16(2). 80–84. https://doi.org/10.1111/j.1467-8721.2007.00480.x.

Zacks, Jeffrey M. & Barbara Tversky. 2001. Event structure in perception and conception. Psychological Bulletin 127(1). 3–21. https://doi.org/10.1037/0033-2909.127.1.3.

Zhang, Jiji. 2008. On the completeness of orientation rules for causal discovery in the presence of latent confounders and selection bias. Artificial Intelligence 172(16–17). 1873–1896. https://doi.org/10.1016/j.artint.2008.08.001.

Zhang, Kun, Jonas Peters, Dominik Janzing & Bernhard Schoelkopf. 2012. Kernel-based conditional independence test and application in causal discovery. arXiv preprint arXiv:1202.3775.

Zhang, Yizhen, Kuan Han, Robert Worth & Zhongming Liu. 2020. Connecting concepts in the brain by mapping cortical representations of semantic relations. Nature Communications 11(1). 1877. https://doi.org/10.1038/s41467-020-15804-w.

Zheng, Yujia, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes & Kun Zhang. 2024. Causal-learn: Causal discovery in Python. Journal of Machine Learning Research 25. 1–7.

Zipf, George. 1935. The psycho-biology of language. Boston: Houghton Mifflin.

Zipf, George. 1949. Human behavior and the principle of least effort. Cambridge, MA: Addison-Wesley.


Supplementary Material

This article contains supplementary material (https://doi.org/10.1515/cog-2023-0041).


Received: 2023-03-24
Accepted: 2024-07-12
Published Online: 2024-07-29
Published in Print: 2024-08-27

© 2024 the author(s), published by De Gruyter, Berlin/Boston

This work is licensed under the Creative Commons Attribution 4.0 International License.
