Baseless derivation: the behavioural reality of derivational paradigms

Maria Copot; Olivier Bonami

doi:10.1515/cog-2023-0018

Article Open Access

Published/Copyright: February 26, 2024

From the journal Cognitive Linguistics Volume 35 Issue 2

Abstract

Standard accounts of derivational morphology assume that it is incremental: some words are formed on the basis of others, and each derivational family has a base from which all of the other words are derived. The importance of the base has been questioned by paradigmatic approaches to morphology, which posit that word systems are about multidirectional relationships between words and paradigm cells, in which no word has a privileged status. This paper seeks to test which of these two views makes more accurate predictions about speakers’ cognitive representations of derivational families. We perform an acceptability judgement experiment in which speakers are asked to evaluate the acceptability of a pseudoword conditional on another pseudoword in the same derivational family. We find that speakers are aware of implicative relationships between words in the same family, and that they opportunistically exploit probabilistic relationships between surface words, regardless of whether the base form is the predictor, the target of prediction, or not at all involved in the task.

Keywords: morphology; derivation; Word and Paradigm; experimental linguistics

1 Introduction

Usage-based approaches to language (e.g. Bybee 1985, 1995; Croft 2000; Langacker 1987, 1988, 2000) see the usage event as the root of speakers’ linguistic knowledge. It is from linguistic information embedded in the broader context of the utterance, seen as an inherently communicative act, that speakers extract linguistic representations. These representations evolve over time as the speaker has more experience with the language, and both the strength of the representation and its probabilistic association with contexts and features will guide the speaker’s comprehension and production.

While usage-based approaches are intended as a framework for all levels and aspects of linguistic representation, morphology has received comparatively little attention, as highlighted by Audring (2022), with derivational^[1] morphology in particular proving to be a weak link in the landscape of theories that identify themselves as descendants of the usage-based tradition. Exceptions to this trend include constructional approaches to morphology, starting with early work by Koenig (1994, 1999) and Riehemann (1998), and reaching wider currency as Construction Morphology (Booij 2010, 2013) and Relational Morphology (Jackendoff and Audring 2020). While sharing the philosophical orientation of usage-based approaches by seeking explanatory mechanisms that are ultimately based in language usage, work in these frameworks is largely based on limited empirical evidence, coming primarily from specially selected examples which are seen as illuminating dynamics at play in the system at large. Despite Construction Morphology and Relational Morphology both falling under the broader framework of Cognitive Linguistics, and therefore being intended to be rooted in cognitive explanations and to have cognitive plausibility, it is remarkable that the assumptions and predictions that these theories make about the nature of the lexicon have not yet been tested experimentally.

In this paper, we begin to fill this gap by using experimental evidence to test two theoretical claims that together serve to uniquely position constructional approaches to morphology in the landscape of morphological theory: the special role of a base form in derivational word families, and the directed nature of morphological relationships. Section 2 surveys the landscape of theoretical morphology on these two issues, outlining how constructional approaches occupy a hybrid space on the two key issues mentioned above. First, the default assumption on lexical organisation is that morphological relationships are directed, with the base holding a special role. This is a position shared with theories grounded in both the Item and Arrangement and Item and Process traditions (Hockett 1954), including modern incarnations such as Distributed Morphology (Halle and Marantz 1993), as well as a vast array of literature on word formation in the lexeme-based tradition initiated by Aronoff (1976). Second, paradigmatic bidirectional links between non-base forms (Wurzel 1984) may exist in situations that call for them, but are different in nature from ordinary formations. In this way, constructional approaches contrast with paradigm-based theories (e.g. Blevins 2016; Bochner 1993; Bonami and Strnadová 2019; Namer and Hathout 2020), which fully embrace the notion of a dense network of paradigmatic relations, and do not recognize a special role for bases. Interestingly, this last class of theories developed largely independently of the cognitive linguistic community, although they are both usage-based and fully adherent to the cognitive commitment (Lakoff 1990).

Empirical predictions of different families of theories are tested with a modified acceptability judgement task which places emphasis on the relationship between two words in a sentence, outlined in Section 3. The results of this task are presented in Section 4 and discussed in Section 5: speakers appear to keep track of the likelihood of different morphological patterns based on their frequency in the input and conditional on the phonological shape of the stem, suggesting the need for paradigmatic links in morphological theories of word formation. Moreover, speakers do not appear to accord special status to the putative base of word families, beyond that afforded to it by its informativity.

2 Background

This section outlines three theoretical positions on the mental representation of the derivational lexicon. Subsets of proponents of all three have made claims about the cognitive reality of each. The goal of the rest of the paper will be to take an empirical experimental approach to testing the different hypotheses.

2.1 The rooted tree view of morphology

The rooted tree view approaches derivational morphology with the assumption that every lexeme is either underived or relates to a single other lexeme, its base, that is both conceptually prior and formally simpler. As Stump (2019) highlights, this leads to the view that derivational families are structured as rooted trees. As illustrated in Figure 1 for the family of derive, in such a view, relationships between lexemes are strictly hierarchical (some lexemes are higher up than others in the tree structure), and asymmetric (derived lexemes depend on their formal base and nothing else in the family). This view of derivational morphology was first formalised by Aronoff (1976), but is explicitly or implicitly assumed by the overwhelming majority of work in word formation, or work that uses word formation as a tool to answer other questions about language. Within Cognitive Linguistics, the two main theories staking claims about the structure of the derivational lexicon, Relational Morphology and Construction Morphology, assume that the rooted tree view is the default situation in derivational relationships and accounts for the majority of morphological relationships.

Figure 1:

A rooted tree representation of part of the derivational family of derive. The lexemes shown here are a subset of those documented in the Oxford English Dictionary.

Despite its popularity, the pure rooted tree model of word formation has a number of theoretical and empirical shortcomings, the most salient of which are outlined below.

There are numerous cases where a derived word has multiple potential bases. Rederivation is one such case: it could be the result of adding a nominalising -ion suffix to rederive, or of adding a reiterative prefix re- to derivation. Both routes make equally accurate predictions about the form and meaning of the derived word. Ideally, one might like to posit both rederive and derivation as joint formal bases of the derived word;^[2] however, this is not permitted by the rooted tree architecture. Decisions about which lexeme should be chosen as formal base in these situations are often somewhat arbitrary, and the constraint to only allow a single incoming edge per word cannot capture the multiple sources of support that the coinage is likely to have received.

Even more flagrantly, there are cases where a derived lexeme’s properties have clearly been taken from different lexemes in its derivational family (Hathout and Namer 2014b): take the triplet language, linguist, linguistic. Formally, linguistic is derived from linguist (not from language, cf. *languagic). Other denominal adjectives in X(t)ic and their formal bases (gene ∼ genetic, acid ∼ acidic, artist ∼ artistic) show that the pattern is for the derived word in X(t)ic to mean ‘related to X’. Yet linguistic theory is not a theory about linguists, but about language. So while the word’s form is derived from linguist, its meaning is derived from language. The rooted tree’s property of allowing only one incoming edge per node does not allow for a representation of this situation.

A different but related issue stems from the monodirectional nature of the relations. There is extensive historical evidence for many instances of derivation by backformation.^[3] For instance, many English verbs have been backformed from latinate nouns, such as resurrect ∼ resurrection, project ∼ projection and insert ∼ insertion. Aside from the later attestation date compared to the noun, what gives away the backformed origin of the verb is its phonological form. These verb ∼ noun pairs both share the Latin supine stem, identifiable by the final -t. Non-backformed latinate nouns (satisfy ∼ satisfaction, evoke ∼ evocation) instead use allomorphic stems, reflecting the different stems associated with verbs and deverbal nouns in Latin morphology. This shows that the first set of verbs was formed based on the noun, rather than the other way around.

The monodirectional nature of derivational trees leads to a further set of complications for situations where establishing directionality between two items is unmotivated or impossible – this is known as cross-formation (Becker 1993). Conversion pairs are a prime example: march_V and march_N do not feature affixes that signal which one might have come from the other, and their lexical semantics are in a relationship of mutual predictability. It would be desirable to note the existence of a bidirectional, non-hierarchical^[4] relationship between the two. Some phenomena are simultaneously troublesome for the requirements of directionality and hierarchy: English has many pairs of derived words with the pattern Xism ∼ Xist, such as sexism ∼ sexist from the formal base sex, or buddhism ∼ buddhist from the formal base Buddha. The two derived words are much more closely linked to each other than they are to their base in both form and meaning: buddhism is a set of beliefs and behaviours historically based on the teachings of the Buddha, but the concept denoted by the derived word is now rather far from that denoted by its base, encompassing specific beliefs and attitudes as well as having associated symbolism and behaviours that the Buddha is not responsible for and might not even endorse or be familiar with (e.g. the Gandharan Buddhism of the Kushan empire, a synthesis of buddhist teachings and Greco-Roman, Iranian and Indian elements). A buddhist is someone who espouses buddhism, and buddhism is the total of ideas and practices espoused by buddhists: not only are the two concepts much closer to each other than they are to their base, but the two also share the stem buddh-, further entrenching the contrast between the pair and their base (*buddhaism ∼ *buddhaist, where the absence of the vowel is not due to phonotactic constraints: the string aist/aism exists in dadaism ∼ dadaist).

A particularly striking example of the same problem comes from cases of cross-formation where there is no base: optimist ∼ optimism closely resemble each other in form and meaning, and lack a base (*optime) that would enable them to fit into a rooted tree structure.

2.2 Paradigmatic links in derivational morphology

These phenomena put into question the usefulness and accuracy of the rooted tree view. However, their well-known and extensively documented existence has not been met with a desire to reshape morphological theories of word formation: a large part of the morphological community sees them as marginal phenomena that should not drastically change the way we conceive of word formation.

The seminal study of van Marle (1984) was influential in suggesting that these phenomena should not be ignored by theorists, though he still claimed that they were fundamentally different in quality from garden-variety derivation, encoding this in the terminology he chose to discuss paradigmatic relationships: they give rise to secondary or analogical coinings. This view is currently held by Construction Morphology (Booij 2010), which makes a distinction between first-order and second-order relationships (classical base-centric monodirectional derivation versus paradigmatic relationships), and by Relational Morphology (Jackendoff and Audring 2020), which makes the same distinction under the name of daughter versus sister schemas, reflecting a view of the world that is akin to the rooted tree with extra links between leaf nodes. While this tradition essentially accepts the rooted tree view, it allows for mechanisms of connectivity between leaves in order to account for the problematic phenomena outlined above.

Nevertheless, a subset of morphologists has argued for decades that the existence of said problematic phenomena motivates a reconceptualization of derivational morphology as the general study of relatedness between lexemes (Bauer 1997; Becker 1993; Bochner 1993; Bonami and Strnadová 2019; Hathout and Namer 2022; Robins 1959; Štekauer 2014). This amounts to extending to derivation the central tenets of the Word and Paradigm (WP) approach to inflectional morphology (Anderson 1992; Blevins 2016; Hockett 1954; Matthews 1972; Robins 1959; Stump 2001). Matthews (1991, p. 216) cogently summarises these as holding that words are “not wholes composed of simple parts, but are themselves the parts within a complex whole,” namely a paradigm. The paradigmatic approach focuses on systematic relationships between words, and holds these relationships to be bidirectional and non-hierarchical, making them suitable for formalising previously problematic phenomena.

In this paper, we adopt the formalisation of the derivational paradigm proposed by Bonami and Strnadová (2019). Their process for inducing paradigmatic structure is rooted in patterns of language use: they begin by finding pairs of words that are morphologically related, that is, have a relationship of both form and meaning. Then relationships are filtered for systematicity: only recurrent morphological relationships may form the basis of a paradigm. Groups of words that are morphologically related form a morphological family. Multiple morphological families are then aligned on the basis of the semantic contrasts they instantiate (see Bybee (2001) for evidence on the primacy of meaning in lexical relatedness and Štekauer (2014) on its role in paradigmatic organization), leading to a multidimensional paradigmatic structure. The output of this process is illustrated in Figure 2.

Figure 2:

A visualisation of three partial derivational paradigms of English, presented horizontally, with parallelism across families presented vertically.

This conception of derivational paradigms is an instantiation of a broader trend of seeing morphology specifically, but also language in general, as a complex adaptive system (Blevins et al. 2016). The idea that language can be fruitfully modelled as a complex adaptive system or as a dense dynamic network, a rich structure in which constructions enter relationships of multiple kinds with each other based on patterns of usage, is a position that converges with that generally held by cognitive linguistics when applied to domains other than morphology. To give a few examples, Diessel (2019) outlines a general program for a nested model of grammar. In a chapter that offers an overview of concepts and methodologies for usage-based linguistics, Gries and Ellis (2015) highlight that much recent work at the intersection of cognition and language is based on associative learning and discuss tools for approaching corpus studies through this lens. The book by Ellis et al. (2016) discusses this in the context of language acquisition and how humans develop flexible representations of their language over time by exploiting relationships of usage between constructions. Sommerer and Smirnova (2020) discuss the same dynamic network approach for modelling diachronic language change.

In a morphological system with these characteristics, the lexicon is a densely connected graph organised along two dimensions: the morphological family and the paradigmatic cell. Within a single morphological family, each lexeme is in a bidirectional relationship with every other member, acknowledging that influences of form and meaning operate in a much more holistic fashion.

Section 2.1 discussed the phenomena that motivated a change of framework for word formation. The next question is whether this shift accurately captures essential aspects of lexical derivation that were disregarded by a rooted tree methodology. A key difference in prediction between the two concerns the way in which information flows. In a rooted tree model, information may only flow from the root outwards, never in the reverse direction, and never between nodes which are not in a base → derivative relationship. In a fully paradigmatic model, information may flow between any two connected nodes, in any direction.
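
The contrast in information flow can be made concrete with a small sketch. It assumes a toy family taken from the Xism ∼ Xist discussion in Section 2.1 (Buddha, buddhism, buddhist); the function names and the encoding of links as ordered pairs are illustrative assumptions, not part of either theory's formalism.

```python
from itertools import permutations

# Toy family from Section 2.1; treating "Buddha" as the formal base.
FAMILY = ["Buddha", "buddhism", "buddhist"]
BASE = "Buddha"

def rooted_tree_edges(family, base):
    """Directed edges base -> derivative only (hypothetical encoding)."""
    return {(base, word) for word in family if word != base}

def paradigmatic_edges(family):
    """Every ordered pair of distinct members: bidirectional, non-hierarchical."""
    return set(permutations(family, 2))

def can_flow(edges, source, target):
    """Information flows directly from source to target iff that edge exists."""
    return (source, target) in edges

tree = rooted_tree_edges(FAMILY, BASE)
paradigm = paradigmatic_edges(FAMILY)

for source, target in permutations(FAMILY, 2):
    print(f"{source} -> {target}: "
          f"tree={can_flow(tree, source, target)}, "
          f"paradigm={can_flow(paradigm, source, target)}")
```

Under this encoding, the rooted tree licenses only the two flows leaving the base, whereas the paradigmatic graph licenses all six, including buddhism → buddhist and any flow towards the base.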

Much recent literature on inflectional morphology focuses on the implicative structure of paradigms (Ackerman and Malouf 2013; Ackerman et al. 2009; Stump and Finkel 2013; Wurzel 1989). Paradigmatic relationships are implicative in nature since they constrain possible relations of form and meaning between paradigmatically related words in a way that can be captured in terms of (probabilistic) conditional statements. Using English inflection as an example, if we know the past form of a verb has the shape Xed, then we can be certain that its present form has the shape X. On the other hand, if we know the present of a verb to be X, then it is highly probable that the past has the shape Xed, but it is not certain. Under a non-paradigmatic view, it is surprising that the past tense is a better predictor of the present (thought to be the unmarked/neutral/base form of the English verbal system) than the present is of the past tense – information should flow from one cell only (the formal base) towards the rest. Under a paradigmatic view, this is expected: a form in any cell is supposed to provide at least some information about the form in any other cell (Ackerman and Malouf 2013) and combining information from more cells will improve prediction towards other cells in the system (Bonami and Beniamine 2016).
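
The asymmetry in the English example can be illustrated with two conditional probabilities. This is a minimal sketch on a hand-picked toy list of verbs in orthographic form, chosen so that spelling mirrors the Xed pattern; the list, the function names and the resulting values are assumptions for illustration and say nothing about actual proportions in English.

```python
# Toy (present, past) pairs in orthographic form; illustrative only.
VERBS = [
    ("walk", "walked"), ("jump", "jumped"), ("talk", "talked"),
    ("play", "played"), ("sing", "sang"), ("go", "went"),
    ("bring", "brought"),
]

def p_present_given_past_ed(verbs):
    """P(present == X | past == X + 'ed'), over pasts ending in 'ed'."""
    in_scope = [(pres, past) for pres, past in verbs if past.endswith("ed")]
    hits = [(pres, past) for pres, past in in_scope if pres == past[:-2]]
    return len(hits) / len(in_scope)

def p_past_ed_given_present(verbs):
    """P(past == X + 'ed' | present == X), over all presents."""
    hits = [(pres, past) for pres, past in verbs if past == pres + "ed"]
    return len(hits) / len(verbs)

print(p_present_given_past_ed(VERBS))   # past -> present direction
print(p_past_ed_given_present(VERBS))   # present -> past direction
```

On this toy list the past → present prediction is categorical, while the present → past prediction is only probable, which is the direction of asymmetry described above.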

Bonami and Strnadová (2019) extend this methodology to derivational morphology. Table 1 documents predictability relations between French verbs and corresponding action nouns and agent nouns; importantly, the verb is the formal base of the triplet in the overwhelming majority of cases in French. The table shows that the form of the action noun is just as easy to predict from the verb as it is from the agent noun, and the action noun is a better predictor of the form of the agent noun than the verb is. This is unexpected under a rooted tree model: the verb should be the best predictor of the other two forms, and it should be hard to predict the verb. The authors additionally show that using multiple predictors is always an improvement over using a single one: this means that the action and agent nouns carry information about the forms of other members of the paradigm that is not available based on the verb alone. It is noteworthy that the easiest prediction in this dataset involved predicting the verb from the agent and action nouns – under a rooted tree model, this would be expected to be the hardest prediction. This work shows that, from an information-theoretical perspective, the word forms of derivational families are characterised by implicative relationships of predictability from each form to any other, as would be expected under a paradigmatic perspective.

Table 1:

Implicative entropy for the French (Verb, Action Noun, Agent Noun) paradigmatic system. Implicative entropy H(X → Y) is the conditional entropy of a formal alternation between the shape of words filling two paradigm cells X and Y given the shape of the word filling the predictor cell X (Bonami and Beniamine 2016). Hence it is a measure of average predictability, with lower numbers indicating higher predictability.

H(row → col)	verb	action noun	agent noun
verb	–	1.115	0.709
action noun	0.101	–	0.269
agent noun	0.264	1.114	–
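
The conditional entropy underlying Table 1 can be sketched as follows. The sketch assumes that each lexeme has already been reduced to a pair (predictor class, alternation pattern); how predictor shapes are grouped into classes is not shown, and the labels and counts below are abstract toy values, not data from Bonami and Strnadová (2019).

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(observations):
    """H(pattern | predictor class) in bits.

    `observations` is a list of (predictor_class, pattern) pairs,
    one per lexeme (a hypothetical input format).
    """
    by_class = defaultdict(Counter)
    for predictor_class, pattern in observations:
        by_class[predictor_class][pattern] += 1

    total = len(observations)
    entropy = 0.0
    for patterns in by_class.values():
        class_size = sum(patterns.values())
        class_weight = class_size / total        # P(class)
        for count in patterns.values():
            p = count / class_size               # P(pattern | class)
            entropy -= class_weight * p * math.log2(p)
    return entropy

# Toy data: class A splits evenly between two patterns,
# class B always shows the same pattern.
TOY = [("A", "p1")] * 2 + [("A", "p2")] * 2 + [("B", "p3")] * 4

print(conditional_entropy(TOY))
```

A class whose members all follow one pattern contributes nothing to the total, so lower values correspond to higher average predictability, as in the caption of Table 1.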

2.3 Evaluating the claim for monodirectedness in derivational families

Many theories of morphology assume directionality in the relationships that link together the lexicon. Theories that fall under the categories Hockett (1954) terms item and arrangement (IA) and item and process (IP) require a root to which material is added or to which a process is applied, making all relationships in this framework inherently directional. Word-based approaches in the tradition of Aronoff (1976) substitute a concrete word for the abstract root, but do so while keeping the directionality assumptions inherent in IA and IP. The recent increase in the popularity of word and paradigm theories has afforded morphologists the possibility of questioning the concept of the base, and of the directionality of relationships.

The theoretical literature has begun to grapple with this matter in both inflection and derivation, as discussed in the previous few sections. However, evidence bearing on this matter is often reliant on specific data points and their relationships of form and meaning, rather than on more direct evidence about how humans store and process language. Within morphology, behaviourally-oriented work on the status of the base in inflection is almost as scarce as work on derivation. A notable exception is to be found in the work of Adam Albright. Albright (2002) proposes explicitly that the base of a paradigm has special primitive status (the Single Base Hypothesis), and that this should be borne out by behavioural studies. Jun and Albright (2017) then tested this prediction explicitly by asking speakers of Korean to judge the acceptability of the base form of a pseudolexeme, using a non-base as conditional information. They calculate two sets of predictability scores between the base and non-base form using the Minimal Generalisation Learner (MGL; Albright 2002; Albright and Hayes 2003). The MGL is a method for assigning likelihoods to form relationships between two cells, outputting a score for form F1 of word family WF in cell C1, conditional on knowing that WF has form F2 in cell C2. The method relies on the type frequency of morphological patterns, which is derived in a way that takes into account the phonology of the word forms. Jun and Albright (2017) attempt to predict participants’ behavioural responses to non-base ∼ base pairs of words in Korean with forward scores (predicting the base form from the non-base) and backward scores (predicting the non-base form from the base), named after whether the direction of prediction in the experimental task matched the input-output pair presented to the MGL.

They find that backward predictability scores explained more variability in participant judgements than forward scores: despite participants predicting the base form from a non-base form, scores that assumed participants were using the base form as a starting point provided a better fit for the data, which they take as evidence in favour of the Single Base Hypothesis. However, evidence showing that speakers are aware of paradigmatic implicative relationships between inflected word forms also exists: in a behavioural study with the goal of investigating speakers’ awareness of and reliance upon paradigmatic relationships, Copot and Bonami (2024) perform an experiment on the French verbal system that parallels Jun and Albright’s in methodology, and find that the cell deemed to be the base appears to have no inherently special status in the system, once the frequency of forms within the cell is accounted for.

There is no parallel explicit claim for the processing of derivational relationships, but the assumption can still be found in countless publications that assume the formally simpler of two morphologically related words must somehow be more basic, and thus in a privileged position with regard to informativity. Consider for example the literature on whether morphologically complex words are parsed as wholes or segmented into morphemes. In addition to being historically IA/IP-informed, this literature assumes that suitable evidence for decomposition is yielded by showing priming for a morphologically complex word from its formal base. Prominent examples are Levelt et al. (1999), Rastle et al. (2004) and Christiansen and Chater (2016), but most work in this tradition shows the same implicit belief that there is such a thing as the base of a derived word.

The cognitive linguistics literature is also aware of directionality as a parameter: the move towards associative learning has been one away from directional relationships, as discussed explicitly in e.g. Gries (2013), and towards symmetrical ones, defined by patterns of occurrence and co-occurrence. Despite this, theories of morphology aligned with cognitive linguistics have largely borrowed the assumption that relationships between constructions are directional, unless otherwise specified.

In this paper, we present an experiment that addresses the status of the base within derivational paradigms for language speakers.

2.4 Research questions

This paper wishes to test some of the main diverging assumptions made about the structure of the derivational lexicon through a behavioural experiment. In the background, we outlined three main views: the rooted tree view (the lexicon is a series of star graphs made up of monodirected relationships), the paradigmatic view (the lexicon is a densely connected network of bidirectional associations) and the cognitive morphology view (the rooted tree is the default situation, though some leaf nodes might be connected with each other). These three sample positions make different claims about how humans store and process derivationally related lexemes. All three approaches expect speakers to be aware and keep track of repeated patterns in the lexicon and their likelihood. However, they differ in which relationships they suggest speakers are keeping track of.

We test this with an acceptability judgement task asking speakers to judge a novel word form conditional on knowledge of another member of its paradigm, inspired by the methodology of Jun and Albright (2017) for inflectional morphology. Jun and Albright observed that in inflection, if the form of the word to be judged was very predictable conditional on the form of another member of its paradigm, the word would be rated higher than if it was unpredictable. The first claim we wish to test is whether form predictability between derivational forms correlates with speaker behaviour, in the same way it does for inflectionally related forms. A positive correlation between the predictability of this form from its paradigm mate (i.e. a word belonging to the same paradigm) and speaker judgement of the form would constitute evidence that speakers are aware of and exploit the implicative structure between derivational forms.

Paradigmatic approaches make two predictions that diverge from those of the rooted tree view. Firstly, they predict that speakers will show awareness of implicative relations even when making predictions between two non-base forms. In this setting, a rooted tree approach would predict that speakers are making the prediction in multiple steps, which must necessarily pass through the prediction of a base form. If speaker judgement correlates positively with form predictability even when two non-base forms are involved, this constitutes evidence for the paradigmatic view. Secondly, rooted tree approaches claim that all predictions are made from the base. Jun and Albright (2017) explicitly make this claim from a cognitive perspective for inflection. A paradigmatic approach does not give the base any inherently special status. Therefore the different views make differing hypotheses about how speakers perform predictions from a non-base form towards a base form: the paradigmatic view predicts that this will be a prediction like any other, while the rooted tree view predicts that what speakers are doing when processing a word form is to evaluate the predictability of the non-base form from the base, even when they are asked to perform the prediction in the opposite direction. As Relational Morphology and Construction Morphology decide on a case by case basis whether two series stand in a paradigmatic relationship or not, this family of approaches predicts that speakers have awareness of implicative relationships between series at least some of the time, but does not make a priori predictions about which series this applies to. Therefore while observing speaker behaviour that matches paradigmatic approaches to morphology is compatible with Relational Morphology and Construction Morphology, so is observing speaker behaviour that matches the rooted tree predictions.^[5]

3 Methodology

3.1 Items

The experiment is based on French data. For each trial, participants were presented with a sentence containing two morphologically related pseudowords. For crucial items, the two pseudowords were in a derivational relationship, while for distractors, the two pseudowords were in an inflectional relationship.

As illustrated in Figure 3, items consisted of the combination of a sentence frame (j’adore le monde de la ___. Je veux être ___quand je serai grand) and a pair of pseudowords filling that frame (catonisation-catonisateur/catonisiteur/catoniseur). The same sentence could be filled by pseudowords differing in the level of predictability of the second form conditional on the first – this was quantified thanks to the Minimal Generalisation Learner, outlined below. Only one of the choices in the figure would be shown to any given participant.

Figure 3:

Sample experimental item, followed by an English translation. Only one of the forms filling the second slot is presented to each participant. The two crucial words are pseudolexemes. The three possible forms in the second sentence have different levels of predictability conditional on the knowledge that their action noun is catonisation – catonisateur would be the most expected agent noun based on the frequency of form patterns between action nouns ending in -ation and their corresponding agent noun in French, followed by catonisiteur, and lastly by catoniseur.

To operationalise the predictability of one word form conditional on another, we chose Minimal Generalisation Learner (MGL) scores (Albright 2002), due to their wide adoption in the morphological literature (and especially in Jun and Albright (2017), a study that parallels the current one in many respects, bearing on the structure of the inflectional lexicon). The MGL outputs how likely a word form is in cell C1 given a related form of the same lexeme or morphological family in cell C2. In the case above, this would be the probability that e.g. catonisateur is the agent noun of the action noun catonisation. In order to produce this score, the MGL is trained on several pairs of phonologically transcribed word forms instantiating two different paradigm cells (for example, all agent noun ∼ action noun pairs in the French language). From this, the MGL extracts all possible form mappings between C1 and C2, using the phonological representations both to specify conditions in which each mapping can apply, and to create more generalised versions of sets of mappings whenever possible (for example, it might extract that one possible pattern between action and agent nouns in French is Xation ∼ Xateur). Once trained, form pairs can be supplied to the MGL, which will output a score for how likely form F1 is in cell C1, given that the lexeme or morphological family has form F2 in cell C2. This score is the adjusted probability that the mapping that holds between F1 and F2 applies to an input with the phonology of F2. At its core, the MGL score represents the type frequency of a morphological pattern (a variable whose impact on human linguistic behaviour is well documented, e.g. Berko Gleason (1958) and all subsequent literature on wug tasks) adjusted for the phonology of the input.
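
The core quantity behind such a score, the type frequency of a mapping relative to the forms it could apply to, can be sketched as below. This is not the MGL itself: the sketch works on orthographic endings instead of phonological transcriptions, does no rule generalisation, and omits the adjustment step. The function name and the five-pair toy lexicon are assumptions for illustration, and the resulting values do not reflect actual French frequencies or the scores used in the experiment.

```python
def pattern_reliability(pairs, source_ending, target_ending):
    """Raw reliability of the mapping Xsource_ending ~ Xtarget_ending.

    Scope: pairs whose source form has `source_ending`.
    Hits:  pairs in scope whose target is the same stem + `target_ending`.
    """
    scope = [(src, tgt) for src, tgt in pairs if src.endswith(source_ending)]
    if not scope:
        return 0.0
    cut = len(source_ending)
    hits = [(src, tgt) for src, tgt in scope
            if tgt == src[:-cut] + target_ending]
    return len(hits) / len(scope)

# Toy action noun ~ agent noun pairs (orthographic, illustrative only).
TOY_PAIRS = [
    ("animation", "animateur"),
    ("organisation", "organisateur"),
    ("utilisation", "utilisateur"),
    ("navigation", "navigateur"),
    ("programmation", "programmeur"),
]

print(pattern_reliability(TOY_PAIRS, "ation", "ateur"))
print(pattern_reliability(TOY_PAIRS, "ation", "eur"))
```

A candidate pair such as catonisation ∼ catonisateur would then inherit the reliability of the mapping it instantiates, here Xation ∼ Xateur.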

Six directed pairs of cells were chosen for the experiment (Table 2). The choice of cells is based on previous work on identifying derivational paradigms in French by Bonami and Strnadová (2019). The verb, present in the items only in the infinitive form, is assumed by traditional accounts to be the base of both action nouns and masculine deverbal agent nouns (Figure 4). For readers who might not speak French, the French triplets {verb, action noun, masculine agent noun} have fairly close parallels in English triplets such as {banish, banishment, banisher} or {abolish, abolition, abolitioner}, in which similar complex and variable formal dependencies can be seen.

Table 2:

The six directed cell pairs of interest.

Predictor → Target
verb → agent noun
agent noun → verb
verb → action noun
action noun → verb
agent noun → action noun
action noun → agent noun

Figure 4:

The rooted tree (a) versus the paradigmatic (b) view of the links between the three chosen cells.

Within each of the six item conditions, we identified three morphological patterns of differing levels of predictability, illustrated in Table 3, and which we now describe briefly.

verb → action noun . For prediction of action nouns from verbs, we focus on second conjugation verbs (infinitives in -ir) and their compatibility with different nominal suffixes. These verbs are most often associated with an action noun in -ment, instantiating the pattern Xir ∼ Xissement. In a sizeable minority of cases, -age is used instead and the pattern is Xir ∼ Xissage. Finally, a handful of morphological families match a second conjugation verb with a suffixed noun in -ion, leading to the pattern Xir ∼ Xition.
action noun → verb . We focus on items in -issement. Most of these match a second conjugation verb in -ir (pattern Xissement ∼ Xir), but some match a first conjugation verb (pattern Xissement ∼ Xisser). Finally, although this is not attested, it is conceivable to have a first conjugation verb in -er, dropping the -iss- sequence (pattern Xissement ∼ Xer).
verb → agent noun . For prediction of agent nouns from verbs, we look to regular first conjugation verbs in -er, and build on the fact that an agent noun in -eur can be formed either directly on the basic stem (pattern Xer ∼ Xeur) or on a learnèd stem, which will most often end in -at- (pattern Xer ∼ Xateur), and that the former situation is more prevalent than the latter. Additionally, a handful of agent nouns instead use the suffix -aire (pattern Xer ∼ Xataire), which mostly forms nouns of other semantic types, and is generally specialized to legal vocabulary unused by ordinary speakers.^[6]
agent noun → verb . We look to nouns ending in -ateur. Most of these are built from a learnèd stem in -at-, and hence the expectation is for the corresponding verb not to contain the -at- sequence (pattern Xateur ∼ Xer). In a sizeable minority of cases however, the -at- sequence is part of the basic stem, and hence present in both noun and verb (pattern Xateur ∼ Xater). For a most unexpected situation, we look at the unattested but clearly possible situation of a noun with a basic stem ending in -at- and a corresponding second conjugation verb in -ir (pattern Xateur ∼ Xatir).
action noun → agent noun . We focus on predictors in -ation, and build on the strong tendency for morphological families with an action noun based on a learnèd stem to use that same learnèd stem in the agent noun (pattern Xation∼Xateur). A less common but attested situation is for the agent noun to differ from the action noun in being formed on the basic stem, leading to the pattern Xation∼Xeur. Finally, as our example of a most unexpected pattern, we designed items where the action noun uses a learnèd stem in -at-, like e.g. tentation ‘temptation’, but the agent noun instead uses a learnèd stem in -it-, like e.g. répétiteur ‘tutor’. This leads to the pattern Xation∼Xiteur.
agent noun → action noun . We again build on the ambiguity of stem-final -at-. Most agent nouns in -ateur are instances of a learnèd formation. In such a situation, the overwhelming majority of derivational families contain a matching action noun in -ion built on the same learnèd stem (pattern Xateur ∼ Xation). In some agent nouns however, the -at- sequence is just the end of the basic stem. In that case the most prevalent way of forming an action noun is to add the suffix -age to the basic stem, leading to the pattern Xateur ∼ Xatage. Finally, in a handful of cases, the agent noun is formed on a learnèd stem in -at-, but the action noun is formed by suffixation of -age to the basic stem, leading to the pattern Xateur ∼ Xage.

Table 3:

Examples of prediction difficulty in the six conditions, illustrated with real lexemes and sample pseudoword pairs from the materials, in both orthographic and phonemic transcriptions. MGL scores for the pseudoword pairs are provided.

Condition	Attested example			Sample pseudoword pair
Condition	Lexeme	Predictor	Target	Predictor	Target	Score
verb ↓ action	accomplir	accomplir	accomplissement	étiondir	étiondissement	0.73
	‘accomplish’	akɔ̃pliʁ	akɔ̃plismɑ̃	etjɔ̃diʁ	etjɔ̃dismɑ̃
	nourrir	nourrir	nourrissage	assarir	assarissage	0.05
	‘nourish’	nuʁiʁ	nuʁisaʒ	asaʁiʁ	asaʁisaʒ
	punir	punir	punition	avir	avition	0.02
	‘punish’	pyniʁ	pynisjɔ̃	aviʁ	avisiɔ̃
action ↓ verb	tapissement	tapissement	tapir	rurcissement	rurcir	0.92
	‘hiding’	tapissmã	tapiʁ	ʁyʁsismɑ̃	ʁyʁsiʁ
	crissement	crissement	crisser	rapolissement	rapolisser	0.74
	‘screech’	kʁismɑ̃	kʁise	ʁapolismɑ̃	ʁapolise
		unattested		acuissement	acuer	0.00
				ɑ̃kɥismɑ̃	ɑ̃kɥe
verb ↓ agent	casser	casser	casseur	pécoter	pécoteur	0.96
	‘break’	kase	kasœʁ	pekɔte	pekɔtœʁ
	varier	varier	variateur	sacsiler	sacsilateur	0.42
	‘vary’	vaʁje	vaʁjatœʁ	saksile	saksilatœʁ
	signer	signer	signataire	builer	builataire	0.00
	‘sign’	siɲe	siɲatɛʁ	bɥile	bɥilatɛʁ
agent ↓ verb	animateur	animateur	animer	égérateur	égérer	0.92
	‘animator’	animatœʁ	anime	eʒɛʁatœʁ	eʒɛʁe
	gâteur	gâteur	gâter	présonateur	présonater	0.59
	‘spoiler’	ɡatœʁ	ɡate	pʁesɔnatœʁ	pʁesɔnate
		unattested		rébénateur	rébénatir	0.01
				ʁebenatœʁ	ʁebenatiʁ
action ↓ agent	calibration	calibration	calibrateur	invélération	invélérateur	0.92
	‘calibration’	kalibʁasjɔ̃	kalibʁatœʁ	ɛ̃veleʁasiɔ̃	ɛ̃veleʁatœʁ
	climatisation	climatisation	climatiseur	vintation	vinteur	0.35
	‘climatisation’	klimatizasjɔ̃	klimatizœʁ	vɛ̃ntasjɔ̃	vɛ̃ntœʁ
		unattested		alprisation	alprisiteur	0.00
				alpʁizasjɔ̃	alpʁizitœʁ
agent ↓ action	carburateur	carburateur	carburation	cossaborateur	cossaboration	0.74
	‘carburator’	kaʁbyʁatœʁ	kaʁbyʁasjɔ̃	kosaboʁatœʁ	kosaboʁasjɔ̃
	rabatteur	rabatteur	rabattage	énateur	énatage	0.18
	‘hemmer’	ʁabatœʁ	ʁabataʒ	enatœʁ	enataʒ
	filateur	filateur	filage	omninateur	omninage	0.01
	‘weaver’	filatœʁ	filaʒ	ɔmninatœʁ	ɔmninaʒ

Pseudoword pairs were chosen as follows. First, we collected all pairs of lexemes that instantiated the patterns of interest in the Démonette database (Hathout and Namer 2014a) and also had phonemic transcriptions in the Flexique database (Bonami et al. 2014), a machine-readable inflected lexicon of French. Pseudolexeme predictor forms were then constructed with the help of Wuggy (Keuleers and Brysbaert 2010), by choosing a pseudolexeme fitting one of the schematic shapes of interest (e.g. Xasiɔ̃). A regex constraint was placed on the output form to ensure that the pseudoword had the right derivational marker. The generated input pseudoword forms were then transposed into the corresponding output pseudowords by applying appropriate patterns (leading to e.g. catonisation). Note that although some of the pairs chosen do not correspond to alternations instantiated by any French lexeme pair, all the exponents featured in the items are found in the French morphological system, just not necessarily ever attested together within a single morphological family.
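The last two steps, filtering by derivational marker and transposing predictors into targets, can be sketched as follows. This is a hedged illustration rather than the script used to build the materials: it operates on spelling instead of phonemic transcriptions, and the regular expression, the pattern labels, and the function name are assumptions introduced here.

```python
import re

# Hypothetical constraint: the predictor must end in the marker -ation.
MARKER = re.compile(r"^(?P<stem>\w+)ation$")

# The three action noun -> agent noun patterns discussed in the text.
TARGET_SUFFIXES = {"Xateur": "ateur", "Xiteur": "iteur", "Xeur": "eur"}


def derive_targets(predictor):
    """Return one candidate agent noun per pattern, or None if the
    predictor does not carry the required derivational marker."""
    match = MARKER.match(predictor)
    if match is None:
        return None
    stem = match.group("stem")
    return {name: stem + suffix for name, suffix in TARGET_SUFFIXES.items()}


targets = derive_targets("catonisation")
print(targets)
```

Applied to the predictor catonisation, the sketch yields the three candidate agent nouns catonisateur, catonisiteur and catoniseur; a form without the marker, such as an infinitive, is rejected.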

Table 4 provides some statistics on the distribution of MGL scores across conditions, showing that these are far from being consistent. This situation is inherent in the data, but should be kept in mind when interpreting results.

Table 4:

Distribution of MGL scores across conditions.

Condition	Mean	St. Dev.	Min	25 %	Median	75 %	Max
action noun → agent noun	0.34	0.38	0.00	0.00	0.14	0.78	0.93
action noun → verb	0.42	0.47	0.00	0.00	0.37	0.84	0.92
agent noun → action noun	0.35	0.32	0.02	0.15	0.19	0.70	0.74
agent noun → verb	0.47	0.36	0.01	0.01	0.59	0.76	0.92
verb → action noun	0.18	0.25	0.02	0.02	0.04	0.34	0.74
verb → agent noun	0.39	0.36	0.00	0.00	0.43	0.69	0.96

3.2 Procedure

The data was collected using an acceptability judgement task lasting an average of 20 minutes. In each trial, participants would see a video of a person producing a single utterance, containing two morphologically related forms.

The video format of the prompt was chosen for two reasons. First, the French spelling of a word gives different clues about the rest of the morphological family than its phonological form does. Second, the video format made it easier to signal to participants which word form they should rate. The person in the video was instructed to use nonverbal cues such as head tilts to signal the two forms of interest. The participant was asked to rate how good the second form sounded by manipulating a slider of which only the extremes were labelled, as indicated in Figure 5. As an attention check, participants were asked to provide a synonym of the pseudolexeme every fourth item.

Figure 5:

The scale employed in the experiment. Translation: does the second word sound good, as an invented word in the context of this sentence? Extreme labels: sounds bad, sounds good.

The experiment began with a description of what we meant to capture by sounds good/bad: the naturalness of the word in French, assuming the existence of a given related word form. We gave examples of neologisms based on existing words that would be more or less likely, highlighting to participants that they should make this judgement conditional on the first form provided. There were three practice trials before the experiment began.

The participants were presented with 54 crucial items, 9 from each directed cell pair above, and 24 distractors, 6 from each of four directed inflectional cell pairs (inf → prs.1pl, prs.1pl → inf, pst.ptcp.m.sg → prs.2pl, prs.2pl → pst.ptcp.m.sg). The same sentence frame could appear with three different pseudolexeme pairs, of different levels of predictability – the level of predictability of the pseudowords that each sentence appeared with was randomised. Within items for crucial cell pairs, the three levels of predictability were uniformly distributed.
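The counts in this design can be made concrete with a small sketch. It assumes that "uniformly distributed" means three items per predictability level within each cell pair, and the level labels, the seed, and the function name are hypothetical; the real experiment was run in PCIbex, not in Python.

```python
import random
from collections import Counter

CELL_PAIRS = [
    ("verb", "agent noun"), ("agent noun", "verb"),
    ("verb", "action noun"), ("action noun", "verb"),
    ("agent noun", "action noun"), ("action noun", "agent noun"),
]
LEVELS = ["high", "mid", "low"]  # hypothetical labels for the three levels


def build_crucial_trials(seed=0):
    """Build 9 trials per cell pair, 3 per predictability level,
    with levels assigned to sentence frames in random order."""
    rng = random.Random(seed)
    trials = []
    for pair in CELL_PAIRS:
        levels = LEVELS * 3          # 3 items per level -> 9 items
        rng.shuffle(levels)          # random level for each frame
        for frame_index, level in enumerate(levels):
            trials.append({"cell_pair": pair, "frame": frame_index, "level": level})
    rng.shuffle(trials)              # randomise presentation order
    return trials


trials = build_crucial_trials()
per_cell_and_level = Counter((t["cell_pair"], t["level"]) for t in trials)
n_distractors = 4 * 6                # four inflectional cell pairs, 6 items each
print(len(trials), n_distractors, len(trials) + n_distractors)
```

Under these assumptions each participant rates 54 crucial items and 24 distractors, 78 trials in total.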

The experiment was administered on a local installation of PCIBEX (Zehr and Florian 2018), allowing us to handle all data locally, and conform to GDPR requirements.

3.3 Subjects

Sixty native speakers of French were recruited on Prolific.co, and compensated 9 euro an hour for their time. Further demographic data was not collected, as it was not crucial to the predictions of the experiment.

3.4 Analysis

We fit a mixed effects zero-and-one inflated Bayesian beta regression with the brms package in R (Bürkner 2017).

The scale was coded as having 100 points (though the numbers were not visible to the speakers, who only saw the labels on the extremes). Responses were divided by 100 so that they would fall between 0 and 1, the range required by the model. The model contains the following fixed effects.

Paradigmatic predictability. To operationalize form predictability, we trained the Minimal Generalization Learner on each relevant pair of cells, using as training data all relevant pairs of forms from Démonette (Hathout and Namer 2014a) that also had a transcription in Flexique. We then asked the MGL model to predict and score all possible target forms for our predictor forms of interest. The MGL produces multiple rules of narrower or broader scope for each of the target forms it predicts from a given predictor. The score we retained is the adjusted reliability of the most reliable rule producing the target form of interest, which ranges between 0 and 1. If no rule produces the form of interest, the score is 0.
Cell Pair. The six conditions correspond to each of the (predictor cell, target cell) combinations illustrated in Table 2. This variable was treated as a factor, and was sum-coded (contrasts are applied so that the mean of each level is compared to the overall mean of the variable, as shown in Table 5).
Paradigmatic predictability: cell pair. The interaction of the two previous variables.
Phonological Well-formedness. One confound that may influence participants’ judgements is the extent to which the pseudowords created are plausible French words. To take this into account, we ran a separate experiment to gather phonological well-formedness scores for each pseudoword output form. Audio recordings were presented preceded by minimal context,^[7] and participants were asked to rate the extent to which the form sounded like a plausible French word, using the scale in Figure 5. Each word was rated by 20 participants, and no participant saw two words belonging to the same pseudolexeme. The experiment took an average of 10 min, and native French speakers recruited on Prolific.co were paid 9 euro an hour. Ratings were transformed to z-scores within each participant: a z-score of 1 therefore signifies that the item was rated one standard deviation above the average by a given participant. The phonological well-formedness score for a word corresponds to the average of its standardised ratings.
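The standardisation used for the well-formedness scores can be sketched as below. The data are invented toy ratings, the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from collections import defaultdict
from statistics import mean, stdev


def wellformedness_scores(ratings):
    """ratings: iterable of (participant, word, rating) triples.
    Returns the mean within-participant z-score for each word."""
    by_participant = defaultdict(list)
    for participant, _word, rating in ratings:
        by_participant[participant].append(rating)
    # Mean and (sample) standard deviation of each participant's ratings.
    # A participant who gives the same rating throughout would need
    # special handling, since their standard deviation is zero.
    stats = {p: (mean(r), stdev(r)) for p, r in by_participant.items()}
    z_by_word = defaultdict(list)
    for participant, word, rating in ratings:
        m, sd = stats[participant]
        z_by_word[word].append((rating - m) / sd)
    return {word: mean(z) for word, z in z_by_word.items()}


# Toy data: two raters who use the slider very differently.
toy_ratings = [
    ("p1", "w1", 10), ("p1", "w2", 20), ("p1", "w3", 30),
    ("p2", "w1", 0), ("p2", "w2", 50), ("p2", "w3", 100),
]
scores = wellformedness_scores(toy_ratings)
print(scores)
```

In the toy data the two raters use very different portions of the slider but rank the words identically, so after standardisation each word receives the same z-score from both of them. This is the motivation for standardising within participant before averaging.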

Table 5:

Contrasts for the directed cell pairs variable.

	2	3	4	5	6
AC → AG	−0.17	−0.17	−0.17	−0.17	−0.17
AC → V	0.83	−0.17	−0.17	−0.17	−0.17
AG → AC	−0.17	0.83	−0.17	−0.17	−0.17
AG → V	−0.17	−0.17	0.83	−0.17	−0.17
V → AC	−0.17	−0.17	−0.17	0.83	−0.17
V → AG	−0.17	−0.17	−0.17	−0.17	0.83
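The values in Table 5 can be reproduced by centring treatment contrasts, that is, by subtracting 1/k from a 0/1 indicator matrix with k = 6 levels. The sketch below shows this arithmetic; that the matrix was generated in this way is an assumption based on the values alone, and the function name is hypothetical.

```python
def centred_treatment_contrasts(levels):
    """Return a dict mapping each level to its k-1 contrast values:
    a 0/1 treatment indicator minus 1/k. The first level is the reference."""
    k = len(levels)
    return {
        level: [(1.0 if row == col else 0.0) - 1.0 / k for col in range(1, k)]
        for row, level in enumerate(levels)
    }


levels = ["AC → AG", "AC → V", "AG → AC", "AG → V", "V → AC", "V → AG"]
contrasts = centred_treatment_contrasts(levels)
for level, row in contrasts.items():
    print(level, [round(value, 2) for value in row])
```

With six levels, 1 − 1/6 rounds to 0.83 and −1/6 rounds to −0.17, matching the table, with AC → AG as the row that is never assigned the positive value.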

We use all these variables to predict the judgement the participants made regarding the second word in the sentence. The predicted variable can be modelled with a beta distribution, since participant judgements could range between 0 and 100 (or, rescaled, 0 and 1). Since extreme values of this scale were often used by participants and some participants only used extreme values, we chose to augment the model with zero-and-one inflation, meaning that the model treats ratings of precisely 0 and 1 as potentially having been the result of a different generation process than in-between values. As is customary in the analysis of experimental data, the model has random intercepts for participants and sentence frames, so that it takes into account which observations are correlated because they belong to the same participant or to the same sentence frame. Paradigmatic predictability and well-formedness were set as random slopes over sentence frames, and paradigmatic predictability, cell pair, their interaction and well-formedness were set as random slopes over participants, making this a maximal model (Barr et al. 2013) and resulting in the following formula:

judgment ∼ predictability * cell + wellformedness +
(predictability * cell + wellformedness | participant) +
(predictability + wellformedness | sentence_frame)
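To make the zero-and-one inflated likelihood concrete, the following sketch computes the log density of a single rescaled rating. It uses the mean and precision parametrisation (mu, phi) with an inflation probability zoi and a conditional probability coi of a 1 given an extreme response, which is our understanding of how brms parametrises this family. The parameter values in the example are arbitrary, and the sketch covers only the likelihood of one observation, not the regression or the random effects.

```python
import math


def rescale(slider_value):
    """Map a slider response on the 0-100 scale to the 0-1 interval."""
    return slider_value / 100.0


def zoib_logpdf(y, mu, phi, zoi, coi):
    """Log density of a zero-and-one inflated beta observation.
    zoi: probability that the response is exactly 0 or 1.
    coi: probability that an extreme response is 1 rather than 0.
    mu, phi: mean and precision of the beta part for 0 < y < 1."""
    if y == 0.0:
        return math.log(zoi) + math.log(1.0 - coi)
    if y == 1.0:
        return math.log(zoi) + math.log(coi)
    a, b = mu * phi, (1.0 - mu) * phi  # standard beta shape parameters
    log_beta_density = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y)
    )
    return math.log(1.0 - zoi) + log_beta_density


# Arbitrary illustrative parameter values.
p_zero = math.exp(zoib_logpdf(rescale(0), mu=0.5, phi=2.0, zoi=0.2, coi=0.75))
p_one = math.exp(zoib_logpdf(rescale(100), mu=0.5, phi=2.0, zoi=0.2, coi=0.75))
d_mid = math.exp(zoib_logpdf(rescale(30), mu=0.5, phi=2.0, zoi=0.2, coi=0.75))
print(p_zero, p_one, d_mid)
```

The two extreme responses receive point masses, while in-between ratings are governed by the beta density scaled by 1 − zoi, which is why ratings of exactly 0 or 1 can be modelled as arising from a separate process.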

The second part of the analysis involves testing whether claims that the verb is treated as the base of deverbal nouns have a cognitive reality. If this is the case, we expect that when predicting towards the verb, the predictability information from the predictor cell will not be used. To test this, we single out predictions towards the verb and fit two models with the same formula as above, with a single difference: one uses MGL scores predicting towards the verb conditional on the predictor, and the other uses scores predicting towards the predictor conditional on the verb.

4 Results

The first model looks at the data as a whole, with the goal of discerning whether paradigmatic predictability matters in derivational paradigms. The model coefficients are shown in Figure 6, while Figure 7 shows the conditional effects.

Figure 6:

Plot of the coefficient estimates. The point estimate represents the mean of the posterior draws for the parameter, the thick line is the 50 % credible interval (CrI), and the thin line is the 95 % CrI.

Figure 7:

Conditional effect plots of the model.

As can be seen in both figures, the MGL score has a clear positive effect on average: if the second form is more predictable based on the first, it will be rated better. The phonological well-formedness score of the second form also has a clear positive effect on the judgement: nonwords that more closely resemble existing words of the French language get better scores.

The cell pairs of interest vary in the scores assigned to them on average. This is unsurprising, given the different distribution of scores within each cell pair, as well as the different syntactic, semantic and frequency properties of the different cells. The ranking of the intercepts for the different cells does not appear to be correlated either with the entropy scores in Table 1 (which is expected: Bonami and Strnadová (2019) considered the entire lexicon, whereas we only consider maximally opaque patterns) or with the average MGL scores by cell within our experiment (reported in Table 4).

Turning to interactions, we see that the effect of predictability is stronger for some cell pairs than others, though it has a positive effect in all cases. Because of the deviation coding assigned to the cell condition, the coefficients in Figure 6 compare the slope of predictability in each cell pair to that in the cell pair with the weakest slope, action to agent. While for some cell pairs some of the probability mass is below 0, this merely reflects the fact that the slope for the cell pair may not be meaningfully different from that of the reference pair. The coefficients for these variables are all positive, confirming that compared to this baseline, the slope is at least as steep for all other cell pairs. Notably, the second strongest predictability effect can be seen when predicting the action noun from the agent noun, showing that the involvement of the verb is not required in order to exploit predictability relationships. The strongest effect of predictability is observed when predicting the verb from the agent noun, one of the situations in which models relying on the base would have predicted that predictability from the deverbal noun would have a smaller effect or no effect at all.

We move on to testing claims concerning the place of the base in a derivational system. Does it have an inherently special role, forming the basis of all prediction, or is it just another cell? We can test these predictions explicitly by analysing the subset of data in which speakers are invited to predict the verb based on one of the nouns, since the verb is uncontroversially the base for the subset of the data we are investigating. We follow Jun and Albright’s test for the prediction by correlating speaker judgement with two sets of scores: forward MGL scores, matching the intended direction of prediction, from the non-base form to the base form, and backward MGL scores, reversing the intended direction of prediction, corresponding to predicting the non-base form from the base form. A rooted tree approach would predict that even when being asked to predict the base from a non-base form, speakers are still evaluating the probability of the non-base form conditional on that of the base under the hood, since all predictions stem from the base. Backward scores are therefore predicted to perform best by a rooted tree model, while a paradigmatic model would bet on the forward scores. We compare the models with forward and backward scores by performing Leave-One-Out Cross-Validation (LOO-CV) using a Pareto smoothed importance sampling (PSIS) approximation (Vehtari et al. 2017). Conceptually, the measure is derived by training the model on all but one of the available data points, and testing it on the withheld data point. This process is repeated as many times as there are data points, and the resulting score, the expected log pointwise predictive density (ELPD), sums the log predictive densities that the model assigns to the withheld points; PSIS approximates this quantity without refitting the model for each point. Table 6 shows that the model with backward scores performs reliably worse than the model with forward scores, since its ELPD difference with the best performing model is more than double its standard error difference.

Table 6:

PSIS-LOO of models trained on data predicting towards the verb using forward and backward MGL scores. Forward MGL scores are predictions towards the verb, backward MGL scores are predictions from the verb to the other cell.

	ELPD difference	Standard error difference
Forward scores	0.0	0.0
Backward scores	−14.0	6.8
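The comparison in Table 6 rests on simple arithmetic, sketched below with the two reported values. The threshold of two standard errors is the rule of thumb invoked in the text; the function name is hypothetical.

```python
def elpd_gap_in_standard_errors(elpd_diff, se_diff):
    """Size of an ELPD difference expressed in units of its standard error."""
    return abs(elpd_diff) / se_diff


# Values reported for the backward-score model in Table 6.
ratio = elpd_gap_in_standard_errors(-14.0, 6.8)
print(round(ratio, 2))
```

The ELPD difference of −14.0 is slightly more than twice its standard error of 6.8, which is the sense in which the backward-score model is said to perform reliably worse.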

5 Discussion

The experiment clearly supports a view of morphology that acknowledges the existence of paradigmatic relationships: relationships of predictability between pseudowords matter between all cell pairs, regardless of the direction and of whether the formal base is involved. The base therefore does not appear to hold special status in the minds of the speakers, who are instead using all available information to predict the form of the pseudolexemes and judge their acceptability.

The experiment focused on formal relationships, attempting to abstract away from lexical semantics: we know that the degree of semantic resemblance between words will influence speakers’ behavioural responses, and pairs of items in derivational relationships are rather varied in how semantically close they are (Copot et al. 2022). However, the present findings are well complemented by the work of Bonami and Guzman Naranjo (2023), who show that implicative relationships in derivational paradigms exist for meaning in the same way that they do for word form. For several pairs of cells in the French word formation system, they train statistical models to predict the distributional vector of a lexeme from the vector of a derivationally related lexeme. They find that the meaning of a word in a given derivational cell is at least somewhat predictable from the meaning of a derivationally related lexeme in a different cell, so implicative relationships of meaning exist in derivational families. Moreover, the meaning of the formal base is not always the best predictor of the meaning of a derivationally related lexeme: they find that lexemes linked by the pattern Xisme ∼ Xiste (a relationship parallel to that of English nouns like sexism ∼ sexist, discussed in Section 2.1) are better predictors of each other’s meaning compared to predicting the meaning of each from the base.

Much work in cognitive linguistics focuses on the notion of complex word (the editorial on the topic by Leminen et al. (2016) outlines the different uses of the concept in empirical work within psycholinguistics and cognitive linguistics): the term is used to describe words that can be analysed as being made up of at least one substring that can be treated as a unit (this includes derived words, but also inflected words, compounds, or words with so-called cranberry morphemes), and while it does not logically imply a rooted tree view of morphology, there is a tendency to smuggle that view into the discussion. The experiment shows that speakers opportunistically build on morphological relations between words without paying attention to what might be prior or derived. This is coherent with a paradigmatic view of morphology, within which the notion of complex word is one that is at best epiphenomenal.^[8] The theory calls instead for a focus on multidirectional relationships between words, rather than for breaking them down into subunits and hierarchies based on formal cues. This is an endeavour that is close in spirit to that of usage-based linguistics: patterns of usage (Bybee 2001) are the primary factor which shapes mental linguistic representations in the same conceptual space, highlighting the importance of relationships between items sharing said space (Talmy 2003a, 2003b) leading to adaptive prototypical categories (Rosch 1973). Paradigmatic structure can itself be seen as a manifestation of a complex adaptive system, after all (Blevins et al. 2016). Cognitive linguists have historically focused on the benefit that linguistics can derive by paying attention to and interfacing with the cognitive sciences.
But the cognitive commitment has positive externalities for all sides: by doing linguistics in a way that is suitable to interfacing with the cognitive sciences, research on language and cognition can also benefit from knowledge and findings in linguistics that are made legible to the cognitive community. The present paper provides one such opportunity, by highlighting that the reliance on so-called complex and simplex words in the construction of experimental stimuli in cognitively oriented linguistic research and psycholinguistics might result in misleading conclusions about language storage and processing if the paradigmatic/relational aspect of morphology is neglected.

6 Conclusions

The present study sought behavioural evidence on the status of the base and the nature of relationships between words within derivational paradigms. This is topical within morphological theory, where there are two families of approaches, one that sees the formal base as primal and relationships between items as being directed from the base outwards, and another that posits relationships between items as bidirectional, where the base therefore loses any intrinsic special status. The findings should inform not only morphological theory, but also the cognitive literature on derivational relationships, where important questions about their nature are often brushed under the rug.

Data availability statement

The data and analysis script are available at https://osf.io/4x3vb/.

Corresponding author: Maria Copot, The Ohio State University, Columbus, OH, USA, E-mail: maria.copot.s@gmail.com

Funding source: IdEx Université Paris Cité

Award Identifier / Grant number: ANR-18-IDEX-000

Acknowledgements

This work is partially supported by a public grant overseen by the IdEx Université Paris Cité (ANR-18-IDEX-0001) as part of the Labex Empirical Foundations of Linguistics – EFL. At the time that the work was conducted, Maria Copot was affiliated with Université Paris Cité and the Laboratoire de Linguistique Formelle. This research was included in Maria Copot’s 2023 PhD dissertation, and presented at the International Symposium of Morphology conference (Nancy, 2023). We would like to thank, for their remarks, members of the dissertation committee and of the audience at ISMo, including (in alphabetical order) Dunstan Brown, Dagmar Divjak, Nabil Hathout, Barbara Hemforth, Fabio Montermini, Erich Round, and Andrea Sims. We would like to thank Cassandre Despujols and Clara Hirst, who contributed to the creation of the experimental items and their recording during their internship. Lastly, we would like to thank Professor Maarten Lemmens and the two other anonymous reviewers, who provided thorough and useful feedback on this manuscript.

References

Ackerman, Farrell & Robert Malouf. 2013. Morphological organization: The low conditional entropy conjecture. Language 89. 429–464. https://doi.org/10.1353/lan.2013.0054.Search in Google Scholar

Ackerman, Farrell, James P. Blevins & Robert Malouf. 2009. Parts and wholes: Implicative patterns in inflectional paradigms. In James P. Blevins & Juliette Blevins (eds.), Analogy in grammar, 54–82. Oxford: Oxford University Press.10.1093/acprof:oso/9780199547548.003.0003Search in Google Scholar

Albright, Adam C. 2002. The identification of bases in morphological paradigms. Los Angeles: University of California PhD thesis.Search in Google Scholar

Albright, Adam C. & Bruce P. Hayes. 2003. Rules vs. analogy in English past tenses: A computational/experimental study. Cognition 90. 119–161. https://doi.org/10.1016/s0010-0277(03)00146-x.Search in Google Scholar

Anderson, Stephen R. 1992. A-morphous morphology. Cambridge: Cambridge University Press.10.1017/CBO9780511586262Search in Google Scholar

Aronoff, Mark. 1976. Word formation in generative grammar. Linguistic inquiry monographs. The MIT Press, Cambridge, MA, London, England. Available at: https://books.google.fr/books?id=syIXAQAAMAAJ.Search in Google Scholar

Audring, Jenny. 2022. Advances in morphological theory: Construction morphology and relational morphology. Annual Review of Linguistics 8(1). 39–58. https://doi.org/10.1146/annurev-linguistics-031120-115118.Search in Google Scholar

Baayen, R Harald, Yu-Ying Chuang, Elnaz Shafaei-Bajestan & James P. Blevins. 2019. The discriminative lexicon: A unified computational model for the lexicon and lexical processing in comprehension and production grounded not in (de) composition but in linear discriminative learning. Complexity 2019. 1–39. https://doi.org/10.1155/2019/4895891.Search in Google Scholar

Barr, Dale, Roger Levy, Christoph Scheepers & Harry J. Tily. 2013. Random effects structure for confirmatory hypothesis testing: Keep it maximal. Journal of Memory and Language 68. 255–278. https://doi.org/10.1016/j.jml.2012.11.001.Search in Google Scholar

Bauer, Laurie. 1997. Derivational paradigms. In Geert Booij & Jaap van Marle (eds.), Yearbook of Morphology 1996, 243–256. Dordrecht: Kluwer.10.1007/978-94-017-3718-0_13Search in Google Scholar

Becker, Thomas. 1993. Back-formation, cross-formation, and ‘bracketing paradoxes’ in paradigmatic morphology. In Geert Booij & Jaap van Marle (eds.), Year book of Morphology, 1–25. Dordrecht: Springer.10.1007/978-94-017-3712-8_1Search in Google Scholar

Berko Gleason, Jean. 1958. The child’s learning of English morphology. Word 14(2–3). 150–177. https://doi.org/10.1080/00437956.1958.11659661.Search in Google Scholar

Blevins, James P. 2016. Word and paradigm morphology. Oxford: Oxford University Press.10.1093/acprof:oso/9780199593545.001.0001Search in Google Scholar

Blevins, James P., Farrell Ackerman & Robert Malouf. 2016. Morphology as an adaptive discriminative system. In Daniel Sidiqqi & Heidi Harley (eds.), Morphological metatheory, 271–302. Amsterdam: Benjamins.10.1075/la.229.10bleSearch in Google Scholar

Bochner, Harry. 1993. Simplicity in generative morphology. Berlin, Boston: De Gruyter Mouton.10.1515/9783110889307Search in Google Scholar

Bonami, Olivier & Sacha Beniamine. 2016. Joint predictiveness in inflectional paradigms. Word Structure 9(2). 156–182. https://doi.org/10.3366/word.2016.0092.Search in Google Scholar

Bonami, Olivier & Matías Guzman Naranjo. 2023. Distributional evidence for derivational paradigms. In Sven Kotowski & Ingo Plag (eds.), The semantics of derivational morphology: Theory, methods, evidence, 219–258. Berlin: De Gruyter.10.1515/9783111074917-008Search in Google Scholar

Bonami, Olivier & Jana Strnadová. 2019. Paradigm structure and predictability in derivational morphology. Morphology 29. 167–197. https://doi.org/10.1007/s11525-018-9322-6.Search in Google Scholar

Bonami, Olivier, Gauthier Caron & Clément Plancq. 2014. Construction d’un lexique flexionnel phonétisé libre du français. In Quatrième Congrès mondial de linguistique française, SHS Web of Conferences 8, 2583-2596. Available at: https://hal.archives-ouvertes.fr/hal-01130598.10.1051/shsconf/20140801223Search in Google Scholar

Booij, Geert. 2010. Construction morphology. Language and Linguistics Compass 4. 543–555. https://doi.org/10.1111/j.1749-818x.2010.00213.x.Search in Google Scholar

Booij, Geert. 2013. Morphology in construction grammar. In Thomas Hoffmann & Graeme Trousdale (eds.), The Oxford Handbook of construction grammar, 255–273. New York: Oxford University Press.10.1093/oxfordhb/9780195396683.013.0014Search in Google Scholar

Bürkner, Paul-Christian. 2017. brms: An R package for Bayesian multilevel models using stan. Journal of Statistical Software 80(1). 1–28. https://doi.org/10.18637/jss.v080.i01.Search in Google Scholar

Bybee, Joan L. 1985. Morphology. A study of the relation between meaning and form. Typological Studies in Language 9(2). 493–496. https://doi.org/10.1075/tsl.9.Search in Google Scholar

Bybee, Joan L. 1995. Regular morphology and the lexicon. Language and Cognitive Processes 10(5). 425–455. https://doi.org/10.1080/01690969508407111.

Bybee, Joan L. 2001. Phonology and language use. Cambridge studies in linguistics. Cambridge: Cambridge University Press.

Christiansen, Morten H. & Nick Chater. 2016. Creating language: Integrating evolution, acquisition, and processing. Cambridge, MA: MIT Press. https://doi.org/10.7551/mitpress/10406.001.0001.

Copot, Maria & Olivier Bonami. 2024. Behavioural evidence for implicative paradigmatic relations. Mental Lexicon 18(2). 177–217. https://doi.org/10.1075/ml.22020.cop.

Copot, Maria, Timothee Mickus & Olivier Bonami. 2022. Idiosyncratic frequency as a measure of derivation vs. inflection. Journal of Language Modelling 10(2). 193–240. https://doi.org/10.15398/jlm.v10i2.301.

Croft, William Bruce. 2000. Explaining language change: An evolutionary approach. Available at: https://api.semanticscholar.org/CorpusID:61909434.

Diessel, Holger. 2019. The grammar network: How linguistic structure is shaped by language use. Cambridge: Cambridge University Press. https://doi.org/10.1017/9781108671040.

Ellis, Nick, Ute Römer & Matthew O’Donnell. 2016. Usage-based approaches to language acquisition and processing: Cognitive and corpus investigations of construction grammar. Oxford: Wiley-Blackwell.

Gries, Stefan Th. 2013. 50-something years of work on collocations: What is or should be next …. International Journal of Corpus Linguistics 18(1). 137–166. https://doi.org/10.1075/ijcl.18.1.09gri.

Gries, Stefan Th. & Nick C. Ellis. 2015. Statistical measures for usage-based linguistics. Language Learning 65(S1). 228–255. https://doi.org/10.1111/lang.12119.

Halle, Morris & Alec Marantz. 1993. Chapter 3: Distributed morphology and the pieces of inflection. In Kenneth Hale & Samuel Jay Keyser (eds.), The view from Building 20, 111–176. Cambridge, MA: MIT Press.

Hathout, Nabil & Fiammetta Namer. 2014a. Démonette, a French derivational morpho-semantic network. Linguistic Issues in Language Technology 11(5). 125–168. https://doi.org/10.33011/lilt.v11i.1369.

Hathout, Nabil & Fiammetta Namer. 2014b. Discrepancy between form and meaning in word formation: The case of over- and under-marking in French. In Franz Rainer, Francesco Gardani, Hans Christian Luschützky & Wolfgang U. Dressler (eds.), Morphology and meaning: Selected papers from the 15th International Morphology Meeting, Vienna, February 2012, 177–190. Amsterdam: John Benjamins. https://doi.org/10.1075/cilt.327.12hat.

Hathout, Nabil & Fiammetta Namer. 2022. ParaDis: A family and paradigm model. Morphology 32(2). 153–195. https://doi.org/10.1007/s11525-021-09390-w.

Hockett, Charles. 1954. Two models of grammatical description. Word 10. 386–399. https://doi.org/10.1080/00437956.1954.11659524.

Jackendoff, Ray & Jenny Audring. 2020. The texture of the lexicon: Relational morphology and the parallel architecture. Oxford: Oxford University Press. Available at: https://books.google.fr/books?id=NsxWzQEACAAJ. https://doi.org/10.1093/oso/9780198827900.001.0001.

Jun, Jongho & Adam C. Albright. 2017. Speakers’ knowledge of alternations is asymmetrical: Evidence from Seoul Korean verb paradigms. Journal of Linguistics 53(3). 567–611. https://doi.org/10.1017/S0022226716000293.

Keuleers, Emmanuel & Marc Brysbaert. 2010. Wuggy: A multilingual pseudoword generator. Behavior Research Methods 42. 627–633. https://doi.org/10.3758/brm.42.3.627.

Koenig, Jean-Pierre. 1994. Lexical underspecification and the syntax-semantics interface. Berkeley: University of California PhD thesis.

Koenig, Jean-Pierre. 1999. Lexical relations. Stanford, CA: CSLI Publications.

Lakoff, George. 1990. The invariance hypothesis: Is abstract reason based on image-schemas? Cognitive Linguistics 1(1). 39–74. https://doi.org/10.1515/cogl.1990.1.1.39.

Langacker, Ronald W. 1987. Foundations of cognitive grammar, vol. 1: Theoretical prerequisites. Stanford, CA: Stanford University Press.

Langacker, Ronald W. 1988. A usage-based model. In Brygida Rudzka-Ostyn (ed.), Topics in cognitive linguistics. Amsterdam: John Benjamins. https://doi.org/10.1075/cilt.50.06lan.

Langacker, Ronald W. 2000. Chapter 4: A dynamic usage-based model. In Michael Barlow & Suzanne Kemmer (eds.), Grammar and conceptualization, 91–146. Berlin, New York: De Gruyter Mouton.

Leminen, Alina, Minna Lehtonen, Mirjana Bozic & Harald Clahsen. 2016. Editorial: Morphologically complex words in the mind/brain. Frontiers in Human Neuroscience 10. 5–7. https://doi.org/10.3389/fnhum.2016.00047.

Levelt, Willem J. M., Ardi Roelofs & Antje S. Meyer. 1999. A theory of lexical access in speech production. Behavioral and Brain Sciences 22(1). 1–38. https://doi.org/10.1017/s0140525x99001776.

Marle, Jaap van. 1984. On the paradigmatic dimension of morphological creativity. Berlin, Boston: De Gruyter. https://doi.org/10.1515/9783111558387.

Matthews, Peter H. 1972. Inflectional morphology. A theoretical study based on aspects of Latin verb conjugation. Cambridge: Cambridge University Press.

Matthews, Peter H. 1991. Morphology, 2nd edn. Cambridge: Cambridge University Press.

Namer, Fiammetta & Nabil Hathout. 2020. ParaDis and Démonette: From theory to resources for derivational paradigms. In Proceedings of the Second International Workshop on Resources and Tools for Derivational Morphology, 5–14. Prague, Czechia: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics. https://doi.org/10.14712/00326585.001.

Rastle, Kathleen, Matthew Davis & Boris New. 2004. The broth in my brother’s brothel: Morpho-orthographic segmentation in visual word recognition. Psychonomic Bulletin & Review 11(6). 1090–1098. https://doi.org/10.3758/BF03196742.

Riehemann, Susanne. 1998. Type-based derivational morphology. Journal of Comparative Germanic Linguistics 2(1). 49–77. https://doi.org/10.1023/A:1009746617055.

Robins, Robert. 1959. In defence of WP. Transactions of the Philological Society 58. 116–144. https://doi.org/10.1111/j.1467-968x.1959.tb00301.x.

Rosch, Eleanor. 1973. Natural categories. Cognitive Psychology 4(3). 328–350. https://doi.org/10.1016/0010-0285(73)90017-0.

Sommerer, Lotte & Elena Smirnova. 2020. Nodes and networks in diachronic construction grammar. Constructional approaches to language. Amsterdam: John Benjamins Publishing Company. Available at: https://books.google.fr/books?id=tlTdDwAAQBAJ. https://doi.org/10.1075/cal.27.

Štekauer, Pavol. 2014. Derivational paradigms. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford handbook of derivational morphology, 354–369. Oxford: Oxford University Press.

Stump, Gregory T. 2001. Inflectional morphology. A theory of paradigm structure. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511486333.

Stump, Gregory T. 2019. Some sources of apparent gaps in derivational paradigms. Morphology 29. 271–292. https://doi.org/10.1007/s11525-018-9329-z.

Stump, Gregory T. & Raphael Finkel. 2013. Morphological typology: From word to paradigm. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9781139248860.

Talmy, Leonard. 2003a. Toward a cognitive semantics. Vol. 1, concept structuring systems. Cambridge, MA: Massachusetts Institute of Technology Press.

Talmy, Leonard. 2003b. Toward a cognitive semantics. Vol. 2, concept structuring systems. Cambridge, MA: Massachusetts Institute of Technology Press.

Tribout, Delphine. 2020. Nominalization, verbalization or both? Insights from the directionality of noun-verb conversion in French. Zeitschrift für Wortbildung/Journal of Word Formation 4(2). 187–207. https://doi.org/10.3726/zwjw.2020.02.10.

Vehtari, Aki, Andrew Gelman & Jonah Gabry. 2017. Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing 27. 1413–1432. https://doi.org/10.1007/s11222-016-9696-4.

Wurzel, Wolfgang Ullrich. 1984. Flexionsmorphologie und Natürlichkeit: Ein Beitrag zur morphologischen Theoriebildung. Berlin, Boston: Akademie Verlag. https://doi.org/10.1515/9783112709658.

Wurzel, Wolfgang Ullrich. 1989. Inflectional morphology and naturalness. Dordrecht, Boston: Kluwer Academic Publishing.

Zehr, Jeremy & Florian Schwarz. 2018. PennController for Internet Based Experiments (IBEX). https://doi.org/10.17605/OSF.IO/MD832.

Received: 2023-02-09

Accepted: 2023-11-29

Published Online: 2024-02-26

Published in Print: 2024-05-27

This work is licensed under the Creative Commons Attribution 4.0 International License.

https://doi.org/10.1515/cog-2023-0018
