Abstract
A corpus-based analysis of specialised phraseology can shed light on the role of phrasal context in terminology. This contribution describes the behaviour of constituents of simple and complex specialised collocations in technical texts and the way in which these are distributed in different contexts ranging from the immediate surroundings of a node to the inclusion of a much larger portion of text. The contextual behaviour of specialised collocations is exemplified by English and Italian terms in the domain of photovoltaic technology. This contribution aims to identify and classify specialised collocations along their formation modalities and contexts, as well as to discuss the impact of these phenomena on their representation in LSP lexicographic resources. Issues and options concerning the extraction of collocations and contexts are also addressed.
1 Introduction
The present article deals with the topic of specialised collocations, with the aim of discussing the distribution of constituents of collocations in technical texts and the way in which this relates to a notion of context that ranges from the immediate surroundings of a node to the inclusion of a much larger portion of text. In order to do so, we will start with a rather broad concept of context as “a frame […] that surrounds a [focal] event being examined and provides resources for its appropriate interpretation” (Goodwin and Duranti 1992: 3; cf. also Goffman 1974 for the notion of frame) but without indulging in further distinctions between context and co-text (cf. Lyons 1995) of collocation constituents. In this way we want to contribute to research on specialised phraseology by presenting an overview of different types of collocational span (Evert 2009), by examining their interplay with contextual properties and by discussing their impact on the (re)presentation of specialised collocations in LSP lexicographic resources.
Despite being a comparatively underrepresented topic in terminology research, the study of specialised phraseology can provide valuable insight into the role of phrasal context in terminology (cf. Sinclair and Carter 2004 on the phrasal nature of language). As pointed out in Giacomini et al. (2020), complex collocations in particular should be the focus of attention in general and specialised lexicography because of their high productivity and key phraseological significance (cf. also Gouws 2015: 184–185).
The contextual behaviour of specialised collocations will be exemplified by English and Italian terms in the domain of photovoltaic technology. This domain has a rich terminology which interfaces with several disciplines, making specialised collocations a varied phenomenon in their denotative content and terminological composition. Section 2 provides an operational, global definition which encompasses specialised collocations and multi-word terms and introduces a typology of specialised collocations, ranging from simple collocations to different forms of complex collocations. The typology reflects the formation processes of shorter or longer collocational sequences in technical texts.
The contribution continues in Section 3 with the description of the English and the Italian corpora and of the behaviour of specialised collocations in context. The corresponding notion of context is inferred from the observation of corpus data and involves all the distinct textual components (e.g. phrase and sentence) in which the constituents of a specialised collocation may appear in a corpus. Accordingly, a typology of contexts is presented and discussed – also from the point of view of their identification in a corpus.
Section 4 explores possibilities of presentation for the different collocational contexts in LSP lexicography and provides an entry draft for a dictionary on photovoltaic terminology. The entry structure covers all available information concerning the contextual behaviour of the collocations of the lemma and proposes a method for systematically ordering this information. The article concludes with some reflections on the relevance of the results of the study and on the need to continue to explore specialised collocations in further domains in order to fully appreciate the role they play in shaping the context of specialised texts.
2 The nature of specialised collocations
In general, an exclusive orientation of LSP collocation theory towards the collocation understanding typical of LGP phraseology appears problematic for a number of reasons, among which is the distribution of idiomatic phrasemes[1] in specialised texts. The classification of phraseological data based on the principle of ‘idiomaticity’ seems to play a different role in LSP phraseology, where specialised collocations are often assigned to the set of non-idiomatic phrasemes (cf. Gläser 2007: 487). Idiomaticity, i.e. the gradual feature of phraseological expressions that display a discrepancy between their literal and figurative meaning (Burger 2015), has a different status in LSP than in LGP and has often been described as a non-obligatory feature of specialised phrasemes (cf. Cedillo 2004: 91) in the same way as compositionality has been described as a non-central property in terminology (L’Homme and Azoulay 2020: 153). This is supposed to have a direct impact on the flexibility of the context in which phrasemes, collocations in particular, occur in specialised texts. The degree of ‘fixedness’ is also a criterion for distinguishing different types of specialised phrasemes. It certainly affects the contextual behaviour of the least fixed phraseological units, namely specialised collocations, that widely lend themselves to paradigmatic substitutions and syntagmatic modifications. Paradigmatic substitutions are found, for example, in the frequent phenomenon of substantive class formation (Cedillo 2004), i.e. it is often possible to identify, for a given noun in the collocation, a series of alternative nouns that are members of the same semantic field (cf. example a):
a) | current/overcurrent/lightning/shock/… protection |
Syntagmatic modifiability concerns the ability of a collocation to change its structure by expanding it by means of new elements (cf. example b) and/or adapting it to new inflectional forms (cf. example c).
b) | current protection > reverse current protection |
c) | battery lead > battery leads |
A further issue is the controversial status of complex terms, in particular multi-word terms (as opposed to specialised collocations) and compounds. Some authors trace a boundary between terminology and phraseology, ascribing to multi-word terms a purely naming function (“La terminologie désigne des objets et concepts alors que la phraséologie formule des relations.”, Gouadec 1994: 173; cf. also Gläser 2007: 494), whereas the function of specialised collocations would be the description of relations. This kind of strict duality also agrees with the view of multi-word terms and specialised collocations as covering complementary syntactic patterns, with the former as typical noun phrases (e.g. fault current, or PV array mounting structure) and the latter as typical verb phrases (e.g. installing a PV array or to route cables). However, as pointed out by Cedillo (2004) and Giacomini (2021), separating word combinations based on their syntactic features leads to inconsistencies when trying to explain the semantic equivalence of variants of the kind V+(P)+N (protect against overcurrent) and N+(P)+N (overcurrent protection or protection against overcurrent). Among complex terms, compounds are, according to some researchers (cf. Burger 2015: 16), equally problematic from a phraseological standpoint since they do not meet the fundamental criterion of polylexicality of phrasemes. A broader definition of ‘polylexicality’ as a combinatorial property of lexical morphemes, not exclusively of words, would help overcome this obstacle and reflect the natural semantic equivalence of some compounds, e.g. (sth is) roof-mounted, and collocations, e.g. mount (sth) on the roof.
In the context of this study, we will start from an operational definition of specialised collocation aimed at a lexicographic description of the phenomenon.
By specialised collocation we mean a combination of two or more words that is typical of a specialised language, with unitary phraseological meaning and terminological character. The terminological character of the collocation is independent of the terminological or non-terminological character of its individual constituents. The degree of idiomaticity of the specialised collocation is variable, as is the degree of its fixedness.
This is a comprehensive definition, which, relying on a phraseological but also empirical notion of collocation (cf., among others, Evert 2009 and Bartsch 2004), is aimed at providing a framework for the treatment of specialised collocations in a lexicographic resource. It allows considerable flexibility in the treatment of data extracted from corpora without precluding further restrictions, e.g. of multi-word terms in the narrow sense.
2.1 Simple and complex specialised collocations: a typology
The notion of collocation as a binary combination is still very much entrenched in both common language and special language phraseological studies. The understanding of collocations as n-ary combinations has gained some popularity on the basis of corpus evidence in the recent past (cf., among others, Seretan 2013; Gouws 2015; Tutin and Kreif 2016) but the nature of complex collocations is still largely unexplored. This is very limiting with respect to the actual syntactic and semantic scope of the phenomenon, and it also constrains the possibility of carrying out adequate contextual analysis. This view is likewise reflected in the treatment of specialised collocations in lexicographical and terminographical resources, in which the focus is primarily on two-word combinations.
In order to best analyse the context in which the constituents of a specialised collocation develop, it is necessary to introduce a typology of collocations that is not tied to a specific number of constituents or to specific restrictions on the context itself. The notion of possible context for specialised collocations will be inferred from the data we are going to examine and will not be defined in advance. By ‘simple specialised collocation’ (SSC) we mean a base collocation, terminologically not further decomposable and usually consisting of two elements. As already pointed out, the constituents of any specialised collocation may or may not be terms, without thereby affecting the terminological character of their co-occurrence. Examples of SSCs in English are:
photovoltaic system | junction box | system load | amount of energy |
grid-connected | poly-crystalline | to match the voltage | to charge a battery |
It is apparent from specialised texts, however, that complex collocations also play a role in the phraseology of a language, although they have been only marginally investigated in the past (cf. Giacomini et al. 2020). A study conducted in the context of learner’s lexicography on general language complex collocations in Italian and German has revealed the existence of two different kinds of complex collocation formation (ibid.):
The first type of complex collocation is built by recursive expansion. The recursive nature of collocations, described by some researchers as the property of collocation constituents to be collocational themselves (cf. Heid 1994; Seretan 2013), implies that a core collocational phrase is progressively expanded by the addition of new collocates.
The second type of complex collocation is built by argument complementarity, i.e. the concatenation of simple collocations of a verb matching two or more of its arguments. This type of formation is sometimes combined with the first one, depending on the collocational range of the constituents of simple collocations.
The hypothesis to be tested in this contribution is that the model developed in the study on general language is transferable to specialised language and is also valid for English. The application of the descriptive model of complex collocations to special language corpora will then serve to explore their contextual features. A ‘complex specialised collocation’ (CSC) will be defined as a specialised collocation derived from a SSC. Since its terminological value must by definition be preserved in the evolution from a simple collocation, in specialised languages we usually find a lower number of complex collocations than in general language, where the only valid criterion for defining a complex collocation is syntactic, semantic and phraseological typicality. Based on the previously mentioned formation types for complex collocations, we will distinguish also for LSP between ‘recursively built’ CSCs and ‘argument-related’ CSCs.
Table 1 summarises the profiles of the discussed phrasemes and provides some examples in English and Italian.
Table 1: Types of simple and complex specialised collocations and examples in English and Italian. For each CSC, the original SSC is shown, and the additional constituents in a) are underlined.
Specialised collocation: | Main characteristics: | English examples: | Italian examples: |
Simple specialised collocation (SSC) | Not further decomposable collocation, usually binary. | renewable energy (SSC) | energia raggiante (SSC) |
Complex specialised collocation (CSC): a) recursively built CSC | Expansion by the addition of a collocate, usually a modifier/specification of a constituent; can also apply to a CSC. | match the voltage (SSC) > match the nominal voltage (CSC) > match the nominal voltage of the solar array (CSC); PV system (SSC) > performance of the PV system (CSC) | modulo al silicio (SSC) > modulo al silicio monocristallino (CSC); circuito aperto (SSC) > tensione di circuito aperto (CSC) |
Complex specialised collocation (CSC): b) argument-related CSC | Concatenated collocations matching the arguments of a verb. | install + solar panel + roof > to install (V) a solar panel (SSC, direct object) on a roof (locative) (CSC) | radiazione solare + incidere + superficie > radiazione solare (SSC, subject) incidente (V) su una superficie (locative) (CSC) |
As highlighted by the examples in the table, argument-related complex collocations can be constructed not only around verbs but also other word classes that hold arguments (e.g. nouns such as performance and modulo).
The phraseological and terminological character of a collocation can be understood as a continuum that fades as the collocation expands and includes new constituents. This is also evidenced by the results of the extraction of terms and collocates, with candidates becoming less and less frequent and less and less specific according to association measures as the collocations expand. Therefore, a limit to the scope of the collocations under analysis will not be set in advance, since this limit will be assessed in each case based on corpus data. This aspect also plays a role in the presentation of collocations and contexts in LSP lexicographic resources (see Section 4).
3 Constituents of specialised collocations and their contextual behaviour
The present study is corpus-based, since the context types of specialised collocations are verified in a corpus according to the previously defined typology. Two small comparable corpora, Photovoltaics2021_en and Fotovoltaico2021_it, have been created for English and Italian, each comprising around 800,000 words and made up of handbooks and guidelines concerning the field of photovoltaic technology. The texts, which have been manually collected among online resources, are addressed to technicians and prospective technicians and cover the topics of design and installation of photovoltaic systems. From a terminological point of view, corpus texts are characterised by a high degree of specialisation in both languages.
The language of the photovoltaics domain, as part of the language related to renewable energies (e.g. hydroelectricity and wind power), is quite rich and combines terminology from base disciplines (e.g. physics, photochemistry, electrochemistry), as well as from sister disciplines (e.g. the construction industry). As a result, the collocations themselves are quite varied in their denotative content and terminological composition. The English corpus shows a considerable amount of nominalization, verb forms are often passivised, and synthetic structures like N-V or N-A compounds (e.g. roof-mounted, air-permeable) are frequent, as is fairly typical in technical texts.
The two corpora have been compiled in Sketch Engine (Kilgarriff et al. 2014) using the previously selected texts. Sketch Engine has also been used for the extraction of specialised collocations and the analysis of their contexts. Data are collected and examined first for English, and the overall result is then compared to the one obtained by analysing the Italian data. The aim is not to identify exact correspondences between the two languages at the level of distribution of different specialised collocation types and corresponding contexts, but to test the applicability of the typology and method to different languages.
The first step of the applied procedure involves the extraction of simple and complex terms using the Keywords tool. Candidate term lists are obtained based on the comparison between the two specialised corpora and the general language reference corpora English Web 2020 (enTenTen20) for English and Italian Web 2016 (itTenTen16) for Italian. They are then assigned a keyness score by the Simple maths method.
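The Simple maths score compares normalised frequencies in the focus corpus and the reference corpus. As a minimal sketch, assuming the commonly documented form of the measure (frequency per million in the focus corpus plus a smoothing constant, divided by the same quantity for the reference corpus), the calculation can be illustrated as follows. The function names, the smoothing value of 1 and all counts used with them are illustrative assumptions and are not taken from the study.

```python
def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw frequency to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def simple_maths_keyness(
    focus_count: int,
    focus_size: int,
    ref_count: int,
    ref_size: int,
    smoothing: float = 1.0,  # assumed value; the study does not report it
) -> float:
    """Keyness as (fpm_focus + n) / (fpm_ref + n)."""
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + smoothing) / (ref_fpm + smoothing)


# Hypothetical counts: 400 hits in an 800,000-word focus corpus,
# 1,000 hits in a 1,000,000,000-word reference corpus.
score = simple_maths_keyness(400, 800_000, 1_000, 1_000_000_000)
```

The smoothing constant keeps the score finite for candidates that do not occur in the reference corpus at all, which is a frequent situation for highly specialised terms.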
3.1 Specialised collocations and their contexts in the English corpus
From the list of candidate terms extracted from the English corpus, the most relevant single-word and multi-word terms[2] are validated and twenty of them selected considering both the relative frequencies in the focus corpus and the reference corpus, as well as their keyness score. Single-word terms have been chosen in such a way that they are not constituents of the multi-word terms at the same time. The selected terms are shown hereunder in alphabetical order:
(1) | amp-hour | (2) | charge controller |
(3) | deep cycle | (4) | efficiency |
(5) | electrical installation | (6) | fault current |
(7) | grid-connected | (8) | inverter |
(9) | junction box | (10) | maximum power |
(11) | operate | (12) | photovoltaic system |
(13) | PV | (14) | roof |
(15) | shading | (16) | solar cell |
(17) | string fuse | (18) | surge |
(19) | voltage drop | (20) | wattage |
Starting from this list, the contexts of the extracted terms and their collocations are examined. For this purpose, the Word Sketch and the Multiword Sketch tools are used in combination with concordance analysis. The different types of contexts emerging for the collocations of the selected terms will now be presented and discussed by illustrating selected examples.
3.1.1 Embedded constituents in a phrase context
A new constituent is embedded within the collocation. The context is typically the restricted context of the noun phrase (less frequently: verb, adjective, prepositional phrase) in which the expansion occurs by means of single adjectival or adverbial modifiers and nouns. The result is a CSC built recursively from a SSC or a previous CSC. Embedded constituents (in parentheses) are found in the following CSCs:
(1) | amp-hour: (total) amp-hour demand; |
(2) | charge controller: ((MPPT) solar) charge controller; |
(3) | deep cycle: deep cycle (battery); |
(4) | efficiency: (power) conversion efficiency; |
(5) | electrical installation: (requirements for the) electrical installation; |
(6) | fault current: (d.c.) fault current; fault current (protection); |
(7) | grid-connected: grid-connected ((PV) system); |
(10) | maximum power: maximum power (output); |
(12) | photovoltaic system: (roof-mounted) photovoltaic system; (connect a) photovoltaic system; photovoltaic (hybrid) system; |
(13) | PV: (solar) PV system, (off-grid) PV system, ((grid-connected and) stand-alone) PV system, (installation of a) PV system; PV (module) installation; (thin film) PV cell, PV cell (material), (crystalline) PV cell; (above roof) PV array; |
(14) | roof: (asymmetric) duopitched roof; |
(17) | string fuse: (removable) string fuse; |
(18) | surge: surge suppression (device); |
(20) | wattage: (combined) rated wattage |
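The bracket notation used in the list above can be read as a compact record of recursive expansion, with each level of nesting marking one further step away from the base collocation. The following sketch is offered purely as an illustration and is not part of the extraction procedure: it unfolds a bracketed entry into the sequence of collocations it encodes. The function name expansion_chain is hypothetical.

```python
from __future__ import annotations


def expansion_chain(notation: str) -> list[str]:
    """Unfold a bracketed entry into its collocations, base form first."""
    # Record the parenthesis depth of every character that is not a parenthesis.
    depth = 0
    chars: list[tuple[str, int]] = []
    for ch in notation:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        else:
            chars.append((ch, depth))

    max_depth = max((d for _, d in chars), default=0)
    chain = []
    for level in range(max_depth + 1):
        # Keep only the material embedded at this depth or shallower.
        text = "".join(ch for ch, d in chars if d <= level)
        chain.append(" ".join(text.split()))  # normalise whitespace
    return chain


chain = expansion_chain("((MPPT) solar) charge controller")
```

For the entry under (2), the resulting list contains charge controller, solar charge controller and MPPT solar charge controller, in that order.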
These phrasal contexts appear to be relatively stable, even though these collocations are sometimes expanded by the addition of free, non-collocational constituents according to the principle of syntagmatic modifiability described in Section 2. Here are a few examples of this phenomenon, in which additional, non-collocational items have been highlighted by means of square brackets:
maximum allowable residual load [available] for the solar array,
[many] stand-alone inverters,
[total] energy demand per day,
[same] nominal voltage,
load of the [existing] roof covering,
[advantages of both] lead-calcium and lead-antimony design,
[quality of these] renewable technology systems,
silicone or [other] mastic sealant.
It should be noticed, however, that constituents of word combinations characterised by a strong conceptual cohesion are not separated by the addition of free, non-collocational items. Such units of meaning are, for instance,
junction box,
fault current,
voltage drop,
short circuit,
altitude correction factor,
but also the ones mentioned in the previous examples.
3.1.2 Argument-related constituents in a sentence (or VP) context
This type of phenomenon corresponds to argument-related CSCs, with collocations matching one or more arguments of a verb predicate. The context is therefore typically the verb phrase or the sentence. This phenomenon can also apply to nominalisations that maintain the argument structure of the original verb (cf. Section 3.2). Argument-related constituents are found in the following examples:
(6) | fault current: prevent (d.c.) fault current; |
(8) | inverter: disconnect the inverter from the grid; |
(9) | junction box: wire the junction box; |
(11) | operate: interrupter operates correctly; |
(12) | photovoltaic system: connect a photovoltaic system; |
(13) | PV: size a ((grid-connected and) stand-alone) PV system, install a PV system, (electrically) connect a PV array; |
(14) | roof: mount on a (pitched) roof; |
(15) | shading: calculate the shading factor; |
(16) | solar cell: (photovoltaic) solar cell produces X volts; |
(18) | surge: protect from power surges; |
(19) | voltage drop: minimise voltage drop |
As previously pointed out, constituents of argument-related CSCs can sometimes be recursively expanded depending on their collocational range (cf. prevent fault current > prevent d.c. fault current). In the abovementioned examples, embedded constituents are again indicated in parentheses.
The described collocational and contextual behaviour is independent of the nature of the verbs involved. They are usually relatively general technical verbs that are very common in technical sublanguages and apply to a large number of entities, such as operate, minimise, prevent, disconnect, connect, install, produce, mount, charge, or cable. They primarily indicate dynamic situations, namely processes and events (Lyons 1977: 483). A few others, such as earth, ground, overload, insulate, or retrofit seem to be more specific but not exclusive to the terminology of photovoltaic systems. A further class of verbs which are present throughout the corpus are generic, non-technical verbs, such as make, bring, use, or provide.
We can observe that the most frequent collocational structure is formed by the verb and a direct object. This reflects a typically neutral style of specialised technical communication, obtained by means of passivisation or other impersonal forms. A neutral style, however, can also be produced by subject-verb collocations such as the following:
the interrupter operates correctly,
fuses are not likely to operate under short-circuit conditions,
when the earth fault interrupter operates, an alarm shall be initiated.
Whenever more than one argument of a verb is collocational in nature, concatenated, adjacent collocations are built, as in disconnect the inverter from the grid or photovoltaic solar cell produces X volts. Further examples of this kind are:
ventilation prevents excessive heat build-up,
d.c. isolator may be incorporated into the inverter,
conductors should be suitably protected from mechanical damage,
blocking diodes should be used in addition to string fuses,
the amount of sunlight falling onto the face of the PV cell affects its output,
the amount of energy produced by the array per day.
In some cases, coordinate verbs are able to build parallel CSCs:
PV systems mounted above or integrated into a pitched roof.
Generally speaking, the displayed contexts appear to be looser than the ones found in Section 3.1.1. The connection between a verb and its arguments is sometimes interrupted by non-collocational items appearing within the context of the sentence. Some examples will now be mentioned:
the amount of sunlight [hitting the array] [also] varies with…,
the PV array is [typically] mounted on fixed racks,
all d.c. constituents must be rated, [as a minimum], at Voltage: Voc(stc) x 1.15,
manual load switching is [sometimes] provided,
direct or diffuse light [(usually sunlight)] [shining on the solar cells] induces the photovoltaic effect.
Additional, non-collocational items have been highlighted by means of square brackets. They are, for instance, appositives, adverbials, or participle constructions with the function of a relative clause.
3.1.3 Remote constituents beyond the sentence level
Alongside embedded and argument-related constituents, which, as we have seen, correspond to different types of phrases and are more or less fixed in nature, we have postulated that there are also broader contexts, above the sentence level, in which the constituents of specialised collocations can be distributed. Indeed, we assume that specialised discourse, as it develops in a text, becomes homogeneous through textual cohesion and coherence (cf. De Beaugrande and Dressler 1981; Adamzik 2014). We attempted to test whether there exist in a text possibilities and modalities of distribution of lexical constituents of phrasemes beyond the sentence level, without, for the moment, investigating the causes of such distribution.[3] In doing this, we expand the analysis of collocations from the initial set of 20 combinations listed at the beginning of this section to further validated combinations, in order to obtain a broader picture of the phenomenon.
By observing the behaviour of simple and complex specialised collocations in the English corpus, we notice specific patterns of use according to which at least one constituent is explicitly or implicitly echoed in different sentences, sometimes interspersed with further sentences. This is an anaphorical repetition associated with the collocative nature of some terms. The following phenomena have been identified:
(a) | A constituent of a simple or complex collocation is explicitly repeated as such in subsequent sentences, in which it is associated with further collocations, as in the first example below, while the anaphorical character of the second item is signalled by the use of the determiner the in the second example: |
A charge controller is connected in between the solar panels and the batteries. The charge controller operates automatically and ensures that the maximum output of the solar panels is directed to charge the batteries without overcharging or damaging them.
The inclination (or pitch) of the array is to be measured or determined from plan. The required value is the degrees from horizontal. Hence, an inclination of 0° represents a horizontal array; 90° represents a vertical array.
This happens very often in list-based text structures:
The approach is as follows:
Establish the electrical rating of the PV array in kilowatts peak (kWp)
Determine the postcode region
Determine the array pitch
Determine the array orientation
Look up kWh/kWp (Kk) from the appropriate location specific table
Determine the shading factor of the array (SF) according to any objects blocking the horizon – using shade factor procedure set out in 3.7.7
(b) | A pronoun or adjective, usually personal or demonstrative, refers back to a constituent of a simple or complex collocation in a preceding sentence and introduces a new collocation of that constituent. |
PV specific plug and socket connectors are commonly fitted to module cables by the manufacturer. Such connectors provide a secure, durable and effective electrical contact. They also simplify and increase the safety of installation works.
Battery Backup Inverters: These are special inverters which are designed to draw energy from a battery, manage the battery charge via an onboard charger, and export excess energy to the utility grid. These inverters are capable of supplying AC energy to selected loads during a utility outage and are required to have anti-islanding protection.
…each layer extracts energy from each photon from a particular portion of the light spectrum that is bombarding the cell. This layering of the PV materials increases the overall efficiency…
This second modality seems to be less frequent in the specialised texts that make up our corpus than type (a). For the sake of communicative clarity, the repetition of terms seems to be preferable to that of pronouns with anaphoric function, especially in distinct sentences, in which the distance between the pronoun and the antecedent reference could easily lead to semantic ambiguities.
(c) | A constituent of a simple or complex collocation is explicitly reiterated after one or more sentences. |
The intention does not seem properly anaphoric, yet the same collocational constituent is found in different sentences without strong connection at the level of discourse.
Where the array frame is mounted on a domestic roof or similar, the likelihood of the frame being an extraneous-conductive-part is very low – due to the type and amount of material used between the ground and the roof structure (which will mainly be non-conductive). Even in the case of an array frame being mounted on a commercial building where mostly steelwork is used, it is likely that the frame will be either isolated, …
(d) | In some cases, one constituent is implicitly reiterated in a later context, in which another constituent of the collocation appears. In the example mentioned below, battery capacity is a specialised collocation that can be identified by observing the structure of the context. We will also call this phenomenon a subtype of anaphora, since the reprise of battery in Capacity is just implicit. |
Battery Inputs and Specifications
Days of storage desired/required = 7 days
Depth-of-discharge limit (typical value) = 0.8
Make/ Model = Exide 6E95-11 (Deep cycle battery)
Battery cell voltage = 12 V
Capacity = 478 Amp-hour (Ah)
System voltage (battery bus voltage) = 24 V
Battery round trip efficiency = 0.85 for efficiency batteries.
These different forms of phraseological behaviour ‘distributed’ over several sentences seem to correspond to the typical structures of the textual genre in question, as well as to the typical contents and modes of technical writing, including descriptions of methods, processes and their individual steps, the repeated use of lexical elements in contiguous or distant sentences, and the schematic style of lists.
3.1.4 General remarks on the analysis of the context of specialised collocations
As indicated at the beginning of the section, specialised collocations have been extracted using the Word Sketch and the Multiword Sketch tools of the Sketch Engine, progressively widening the scope of the analysed text section, but always remaining within the sentence boundaries. This allows, in particular,
to study the behaviour of recursively built and argument-related CSCs,
to highlight their combinability, and
to observe that recursively built CSCs typically correspond to multi-word terms in the strict sense.
Contextual analysis of the constituents of specialised collocations within a sentence reveals unsurprising regularities. As soon as larger portions of the text are considered, however, the picture becomes considerably more complicated.
Due to their textually irregular distribution, remote constituents beyond the sentence level are clearly more difficult to detect in the corpus than embedded and argument-related constituents. A first phase of the analysis has been carried out manually on a part of the texts in order to identify regularities in the contextual behaviour of the specialised collocations. In a second phase, the observations made have been applied to a semi-automatic procedure, in which validated simple collocates are searched within textual structures larger than the sentence, i.e. paragraphs and documents, by means of the Corpus Query Language, as in the following example:
"junction" "box" []* "junction" "box" !within <s/>
The queries are particularly challenging when dealing with anaphoric pro-forms, which are not easy to predict. Analysing collocation graphs with the #LancsBox tool (Brezina et al. 2020; cf. also Baker 2016; Brezina 2018a) has not yielded better results than those obtained by means of Multiword Sketches.
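The search for repeated collocations across sentence boundaries can also be approximated outside a corpus query system. The following is a minimal sketch in Python, not the procedure used in this study: the function names, the naive punctuation-based sentence splitter and the sample paragraph are all illustrative assumptions, and pro-forms are not handled.

```python
import re
from typing import List, Tuple

def split_sentences(paragraph: str) -> List[str]:
    # Naive splitter (assumption): a sentence ends at ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]

def cross_sentence_repeats(paragraph: str, collocation: str) -> List[Tuple[int, int]]:
    """Return pairs of sentence indices in which the collocation recurs.

    Each pair links one sentence containing the collocation to the next
    sentence containing it, so both members always lie in different sentences.
    """
    pattern = re.compile(r"\b" + re.escape(collocation) + r"\b", re.IGNORECASE)
    hits = [i for i, s in enumerate(split_sentences(paragraph)) if pattern.search(s)]
    return list(zip(hits, hits[1:]))

# Hypothetical paragraph, invented for illustration only.
sample = ("The junction box is mounted behind the module. "
          "Check the seals first. "
          "The junction box must stay dry.")
pairs = cross_sentence_repeats(sample, "junction box")
```

Here `pairs` links the first and the third sentence of the sample. A realistic implementation would rely on the sentence segmentation already stored in the corpus rather than on punctuation.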
We have not focused on quantitative data for the time being, as we believe that the frequency of use of the latter phenomenon is closely linked to the terminology of the specialised field and the conventions of the textual genre, rather than to strictly phraseological factors. We have therefore limited ourselves to observing the types of contexts by describing them from a qualitative point of view. A quantitative analysis, on the other hand, may be of interest for comparing the two corpora and drawing preliminary conclusions on the contextual behaviour of specialised collocations in English and Italian (see Section 3.2).
In Giacomini et al. (2020), the notion of ‘conceptual range’, which refers to the syntactic level at which the concept of a complex collocation is encoded, was introduced. Complex collocations built by recursive expansion retain the same properties as the simple collocations from which they originate. A noun phrase, for instance, is expanded into a larger noun phrase by the addition of an adjectival modifier, or a verb phrase is expanded into a larger verb phrase by the addition of an adverbial modifier. The concept encoded by the complex collocation is specified at the phrase level. Concepts covered by argument-related complex collocations, on the contrary, are encoded at sentence level (or at least at verb phrase level). This level is also able to identify complex ‘scenes’ when all syntactic arguments of a verb are involved in a sequence of collocations.
As shown by the previous analysis of specialised collocations, the idea of conceptual range is also applicable to terminology and is useful for inferring a notion of collocation context from the data. The context of specialised collocations is thus understood as a frame that surrounds a focal event (Goodwin and Duranti 1992) and, specifically, as the portion of text in which the components of a specialised collocation appear while still being perceived as a phraseological unit. This portion varies both with the manner of expansion from simple to complex collocations and with the explicit or implicit anaphoric resumption of constituents from one sentence to subsequent sentences.
3.2 Comparative application to the Italian corpus
The Italian corpus of texts on photovoltaics is comparable in size and composition with the English corpus. It has been surveyed for the same phenomena as described in Section 3.1. Analysis has been carried out on the following set of terms, listed in alphabetical order:
(1) | cella fotovoltaica | (2) | corrente continua |
(3) | diodo di bypass | (4) | energia elettrica |
(5) | fonti rinnovabili | (6) | fotoelettrico |
(7) | FV | (8) | generatore fotovoltaico |
(9) | impianto fotovoltaico | (10) | installare |
(11) | irraggiamento | (12) | kW |
(13) | massima potenza | (14) | modulo fotovoltaico |
(15) | nominale | (16) | ombreggiamento |
(17) | radiazione solare | (18) | retrofit |
(19) | silicio | (20) | voltaggio |
Table 2 presents some examples for each category of context.
Table 2: Examples of contextual phenomena regarding specialised collocations in the Italian corpus
Embedded constituents in a phrase context | potenza nominale variabile; cella fotovoltaica al silicio; fonti rinnovabili tradizionali; punto di massima potenza; energia elettrica e termica; potenza di … kW; sistema FV autonomo; voltaggio di funzionamento; radiazione solare al suolo; effetto fotoelettrico della luce solare |
Argument-related constituents in a sentence (or VP) context | massimizzare l’irraggiamento solare; convergere la radiazione solare su una cella fotovoltaica; montare un diodo di bypass; generare energia elettrica; misurare la corrente continua all’uscita dal generatore fotovoltaico; integrare un modulo fotovoltaico nella copertura; progettare e installare un impianto fotovoltaico; irraggiamento su superficie inclinata; fenomeni di ombreggiamento del campo fotovoltaico; applicazione retrofit in facciata; produzione di energia elettrica da fonti rinnovabili |
Remote constituents beyond the sentence level | Lo schema sintetizza le possibili configurazioni che caratterizzano un impianto FV. In esso sono presenti cinque insiemi, composti ciascuno da diversi elementi, che in varie configurazioni caratterizzano le tipologie di impianto. |
| Per quanto riguarda la tecnologia, la quota di produzione di celle al silicio è in crescita e resta la predominante con il 94,2% del totale prodotto. Il silicio multicristallino con il 56,9% del mercato risulta essere il più utilizzato rispetto al monocristallino, all’amorfo e al film sottile. Tuttavia, nuova spinta sta avendo il silicio mono-cristallino […]. |
| Prima di eseguire le misure si consigliano i seguenti controlli: – verificare che ci siano condizioni di irraggiamento stabili e che non ci siano nuvole bianche in un cono di 60° di apertura intorno al sole che possano rendere instabili le misure di radiazione solare; […] – evitare di fare verifiche tecniche-funzionali nelle giornate afose, al crescere del contenuto di umidità nell’aria aumenta la constituente di radiazione diffusa e di conseguenza il rendimento del campo fotovoltaico è più basso; un semplice espediente per capire se si è in presenza di umidità eccessiva nell’aria è quello di osservare la colorazione del cielo: se questo è di un bel blu la radiazione diffusa è molto bassa, più il colore del cielo tende al bianco più la constituente diffusa è elevata. […] – verificare che ci sia una radiazione superiore a 600 W/m²; […] |
The three context types of collocational constituents are widely present in both languages. The length of the chains of embedded CSCs is as variable as that of argument-related CSCs, which often form sequences of adjacent collocations for verbs with multiple arguments such as integrare, convergere or montare. Even at the level of remote constituents located in distinct sentences, we do not notice any obvious difference. However, a difference seems to emerge precisely in the case of verbal argument structures: much more frequently than in the English corpus, these structures are transferred to verb nominalisations. This is, for example, the case of irraggiamento su superficie inclinata (corresponding to the verbal expression: irraggiare su superficie inclinata) or produzione di energia elettrica da fonti rinnovabili (corresponding to the verbal expression: produrre energia elettrica da fonti rinnovabili) (cf. Daille 2017 for an overview of syntactic variation of this kind).
Finally, a brief quantitative analysis has been conducted to compare the collocational data in the two languages. The analysis has been applied in the two languages to the 20 simple and complex terms of reference already illustrated.
For the embedded and argument-related constituents, the calculated value was the maximum number of constituents found for the collocations of a certain term, according to the following scheme:
term: PV
> two-word collocation: PV system >
> three-word collocation: stand-alone PV system [embedded]; size a PV system [argument-related] >
> four-word collocation: grid-connected and stand-alone PV system [embedded]; algorithm sizes a PV system [argument-related, adjacent] >
> five-word collocation: algorithm sizes a stand-alone PV system [embedded + argument-related, adjacent] …
It is not useful to distinguish the two types of contexts, since, as shown in the last word combination of the above example, embedded and argument-related (sometimes adjacent) constituents are frequently mixed. A maximum of five constituents has been tested: beyond this limit, no collocational combinations have been found for the selected terms. In addition, as the number of constituents increases, so does the difficulty of extracting candidate combinations automatically, as their uniqueness in the corpus increases and they are no longer detected by the system as collocation candidates. Table 3 displays our results.
Table 3: Comparison between the English and the Italian corpus with regard to the distribution of embedded and argument-related constituents of specialised collocations. The context of reference is the phrase as well as the sentence level
Number of constituents in a specialised collocation | 2 | 3 | 4 | 5 |
EN | 2/20 (10%) | 10/20 (50%) | 6/20 (30%) | 2/20 (10%) |
IT | 2/20 (10%) | 13/20 (65%) | 4/20 (20%) | 1/20 (5%) |
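The counting scheme behind these figures can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure actually applied: the function names and the small list of function words are hypothetical, and function words are excluded only because the scheme above counts "size a PV system" as a three-word collocation. The counts passed to `percentages` are those reported for the English corpus.

```python
from typing import Dict, Iterable

# Assumption: function words are not counted as constituents, in line with
# the scheme above ("size a PV system" is a three-word collocation).
FUNCTION_WORDS = {"a", "an", "the", "and", "or", "of"}

def constituent_count(collocation: str) -> int:
    # Count the whitespace-separated tokens that are not function words.
    return sum(1 for token in collocation.lower().split() if token not in FUNCTION_WORDS)

def max_constituents(collocations: Iterable[str]) -> int:
    # The value recorded for a term is the largest collocation found for it.
    return max(constituent_count(c) for c in collocations)

def percentages(counts: Dict[int, int]) -> Dict[int, int]:
    # Convert the number of terms per size into rounded percentages.
    total = sum(counts.values())
    return {size: round(100 * n / total) for size, n in counts.items()}

# Collocations of the term PV taken from the scheme above.
pv_collocations = [
    "PV system",
    "stand-alone PV system",
    "size a PV system",
    "grid-connected and stand-alone PV system",
    "algorithm sizes a PV system",
    "algorithm sizes a stand-alone PV system",
]
pv_max = max_constituents(pv_collocations)

# Number of English terms per maximum collocation size, as reported for EN.
en_shares = percentages({2: 2, 3: 10, 4: 6, 5: 2})
```

With these inputs the term PV is assigned to the five-constituent column, and the English counts yield the percentages given in the EN row.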
For constituents of the remote type, i.e. traceable in different sentences, the choice has been made to calculate the distance, in terms of sentences, between anaphoric pairs within the discourse, focusing on the repetition of a collocative constituent as such or through a pro-form. These two cases have not been distinguished from each other. Since the same anaphoric pair can be found at different distances at different points in the corpus, it can be accounted for more than once. Table 4 shows the results of this second quantitative assessment.
Table 4: Comparison between the English and the Italian corpus with regard to the distribution of remote constituents of specialised collocations[4]
Distance in terms of number of sentences | 0 | 1 | 2 | >2 |
EN | 8/20 (40%) | 15/20 (75%) | 10/20 (50%) | 9/20 (45%) |
IT | 7/20 (35%) | 18/20 (90%) | 10/20 (50%) | 8/20 (40%) |
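The distance count can be illustrated with a short Python sketch. It rests on several assumptions that go beyond the text: the function names are hypothetical, the sentence positions are invented, the value 0 is read as "no anaphoric repetition found" (as the discussion of the results suggests), and distances of more than two sentences are grouped in a bucket written `>2`. A term is counted once in every bucket in which at least one of its anaphoric pairs falls, which is why a row can add up to more than the 20 terms.

```python
from typing import Dict, List, Set

def distance_buckets(mention_sentences: List[int]) -> Set[str]:
    """Return the distance buckets in which a term's anaphoric pairs fall.

    mention_sentences holds the indices of the sentences in which the
    constituent (or a pro-form standing for it) occurs.
    """
    positions = sorted(set(mention_sentences))
    gaps = [later - earlier for earlier, later in zip(positions, positions[1:])]
    if not gaps:
        return {"0"}  # assumption: 0 means the constituent is never resumed
    return {"1" if g == 1 else "2" if g == 2 else ">2" for g in gaps}

def tally(terms: Dict[str, List[int]]) -> Dict[str, int]:
    # Count, for each bucket, the number of terms with at least one pair in it.
    totals = {"0": 0, "1": 0, "2": 0, ">2": 0}
    for mentions in terms.values():
        for bucket in distance_buckets(mentions):
            totals[bucket] += 1
    return totals

# Hypothetical sentence positions, invented for illustration only.
toy_terms = {
    "PV system": [3, 4, 6, 12],   # gaps of 1, 2 and 6 sentences
    "battery": [1],               # never resumed
    "junction box": [2, 3],       # resumed in the next sentence
}
toy_totals = tally(toy_terms)
```

In the toy data, PV system contributes to three buckets at once, while battery contributes only to the bucket for terms without repetition.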
The amount of data observed is too small to draw relevant conclusions, but it helps to hypothesise trends that could be tested in the future. From this point of view, it is useful to look at the percentage data in the two tables for the two languages, which show very similar results. In both languages, though with a slight predominance for Italian, the number of constituents of a specialised collocation found most often within the sentence is three, followed by four. Above the sentence level, most collocational constituents tend to be repeated in the next sentence or after two sentences. Fewer than half of the selected terms show no anaphoric repetition at all; nearly half are also resumed in more distant sentences. The observations made so far on the contextual data of specialised collocations will now inform the treatment of collocational contexts in specialised dictionaries.
4 Presenting collocational context in LSP dictionaries
In this section we will focus on possible ways of presenting the different collocational contexts in LSP dictionaries, providing guidelines for implementing observations made on corpus data. As pointed out by Gouws (2015: 184), “the inclusion of complex collocations remains important and lexicographers should negotiate the best possible way of presenting them and of making users aware of their existence”. As a consequence, this need also involves the presentation of collocations in different contexts. In existing LSP resources the focus of presentation generally falls on predominantly binary specialised collocations, for which usually no context is given or, at most, some usage examples are provided.
The variety of contexts brought to light by our analysis makes us reflect on the need to give these phenomena greater weight in lexicography. Providing the dictionary user with detailed data on the contexts of use of specialised collocations is very likely to support the text production function of the dictionary. These data can be located in a dedicated section of the dictionary microstructure or systematically replace generic usage examples.
Based on the example of PV system, a very frequent collocation from the English corpus, an entry draft will now be presented in which the lexicographic items related to contextual knowledge will be highlighted (Table 5). PV system serves in this entry as a lemma, although the term could alternatively be presented as a collocation of the lemma PV together with other collocations such as PV cell, PV module and PV array.
Table 5: Entry draft for the term PV system containing lexicographic items related to the contextual properties of the term
PV system n. (↑PV, photovoltaic)
DEFINITION: A photovoltaic (PV) system is a technology that converts solar radiation into electric current. […]
COLLOCATIONS IN CONTEXT:
– PHRASE LEVEL
NOUN PHRASE w/ PRE-MODIFIER:
grid-connected | PV system |
stand-alone | PV system |
solar | PV system |
Without batteries, a grid-connected PV system will shut down when a utility power outage occurs. [Bhatia: Course] ‣[further examples]
PREPOSITIONAL PHRASE / COMPOUND:
PV system | of … kWp |
≈ … kWp | PV system |
MPPT (Maximum Power Point Tracking) for a | PV system |
design of a | PV system |
≈ | PV system | design → to design |
installation of a | PV system |
≈ | PV system | installation → to install |
components of a | PV system |
≈ | PV system | components |
PV system | efficiency |
PV system | performance |
Batteries consume energy during charging and discharging, reducing the efficiency and output of the PV system by about 10 percent for lead-acid batteries. [Bhatia: Course] ‣[further examples]
VERB PHRASE:
to install a | PV system | (on a roof) → installation |
to design a | PV system | → design |
(an algorithm) sizes a | PV system |
a | PV system | generates power |
a | PV system | delivers power |
When designing the PV system, potential problems such as sulphation, stratification and freezing should be considered and avoided. [Bhatia: Course] ‣[further examples]
– DISCOURSE LEVEL
It is generally accepted that the installation of a typical roof-mounted PV system presents a very small increased risk of a direct lightning strike. However, this may not necessarily be the case where the PV system is particularly large, where the PV system is installed on the top of a tall building, where the PV system becomes the tallest structure in the vicinity, or where the PV system is installed in an open area such as a field. [eca: Installation guide.]
Solar PV systems require minimal maintenance, as they do not usually have moving parts. However, routine maintenance is required to ensure the solar PV system will continue to perform properly. [eca: Handbook.]
Before starting any PV system testing: (hard hat and eye protection recommended)
This study has shown that specialised collocations form a continuum of constituents that fit into contexts of varying length (cf. also Wahl and Gries 2018 for a study of multi-word expressions of increasing length). If the collocational range (McIntosh 1966) of a specialised term is exhausted after a certain number of constituents, as can be inferred from the results presented in Section 3.2, it is necessary to establish, early in the lexicographic process, what the spatial limit in the representation of the phraseological continuum in question can or should be.
From the perspective of textual production both in the mother tongue and in the foreign language, as well as of ‘active’ translation, the availability in the dictionary entry of typical contexts, more or less extended depending on the location, can be crucial. It is reasonable to assume, therefore, that flexibility in the coverage of such contexts is beneficial. Moreover, the typicality of the contexts can be measured in terms of frequency and strength of association in the corpus, obviously adapting the statistical validation thresholds of the candidate collocations as one gradually moves on to more extensive and thus per se (much) less frequent combinations.
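How validation thresholds might be relaxed for longer combinations can be sketched as follows. The example is only a sketch under stated assumptions: logDice is used because it is the association score reported by the Sketch Engine, whereas the function names, the threshold values and the halving rule are hypothetical and would have to be calibrated on the corpus. For combinations of more than two constituents, `f_x` and `f_y` are assumed to be the frequencies of the shorter collocation and of the added constituent.

```python
import math

def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    # logDice = 14 + log2(2 * f_xy / (f_x + f_y)); the theoretical maximum is 14.
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

def min_frequency(n_constituents: int, base: int = 10, floor: int = 2) -> int:
    # Hypothetical rule: halve the frequency threshold for each constituent
    # beyond the second, but never go below the floor.
    return max(floor, base // 2 ** (n_constituents - 2))

def is_typical(f_xy: int, f_x: int, f_y: int, n_constituents: int,
               min_score: float = 7.0) -> bool:
    # A candidate is kept if it is frequent enough for its length and
    # its association score reaches the (hypothetical) minimum.
    return (f_xy >= min_frequency(n_constituents)
            and log_dice(f_xy, f_x, f_y) >= min_score)

# The same invented frequencies are accepted for a four-constituent
# combination but rejected for a two-constituent one.
accepted_as_long = is_typical(3, 40, 60, n_constituents=4)
accepted_as_short = is_typical(3, 40, 60, n_constituents=2)
```

The design choice shown here is to lower only the frequency threshold with increasing length while keeping the association threshold constant; other combinations of the two criteria are equally conceivable.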
The proposed microstructure contains a specific search zone dedicated to the different contexts of use of specialised collocations. Thinking of the ideal user of the LSP dictionary as a translator or technical writer with good metalinguistic skills, we have chosen to mark these contexts with syntactic labels, as shown in the abstract microstructure of the entry:
PHRASE LEVEL:
NOUN PHRASE w/ PRE-MODIFIER (≈ embedded constituents) | Collocations | Example(s)/ Source/ Genre |
NOUN PHRASE w/ POST-MODIFIER (≈ embedded constituents) | Collocations | Example(s)/ Source/ Genre |
VERB PHRASE (≈ argument-related (among which: adjacent) constituents) |
DISCOURSE LEVEL (≈ remote constituents) | Example(s)/ Source/ Genre |
We have chosen to indicate the various types of context by means of syntactic tags, without resorting to the corresponding terminology (e.g. embedded, argument-related, remote constituents) used in this study, which might not be particularly user-friendly in a lexicographic environment. At the phrase level, collocates of the lemma are highlighted and accompanied by less frequent collocates indicated in round brackets (e.g. install … on a roof). Nominalisations of verbs (e.g. installation of a PV system) are cross-referenced to the corresponding verb form (to install a PV system) and vice versa. Equivalences between different syntactic structures are also indicated, e.g. between a noun phrase with post-modifier and a compound (e.g. design of a PV system and PV system design).
At the discourse level, the emphasis is on the ways in which the collocative context is typically constructed in certain textual genres. Here, it is important to highlight paradigmatic cases of explicit (or implicit) anaphora by means of term repetition or pro-form (both underlined) with indication of the textual genre and source (in parentheses). These context examples are very broad but not generic, as they also focus on the collocative behaviour of terms. Each zone of the entry should be integrated with further (linked) corpus examples. All examples in the entry are followed by the indication of their source as well as of the textual genre.
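For an electronic dictionary, the zones described above could be stored in a simple data structure. The following Python sketch is one possible encoding and not part of the proposed model: the class and field names are hypothetical, and the square-bracketed labels of the entry draft are read here as "source: genre", which is an assumption. The collocations and examples are taken from the entry draft for PV system.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Example:
    text: str
    source: str
    genre: str = ""

@dataclass
class PhraseZone:
    label: str                      # syntactic tag shown to the user
    collocations: List[str]
    examples: List[Example] = field(default_factory=list)

@dataclass
class Entry:
    lemma: str
    definition: str
    phrase_level: List[PhraseZone] = field(default_factory=list)
    discourse_level: List[Example] = field(default_factory=list)

    def collocations(self) -> List[str]:
        # Flatten the phrase-level zones into one list, keeping their order.
        return [c for zone in self.phrase_level for c in zone.collocations]

entry = Entry(
    lemma="PV system",
    definition=("A photovoltaic (PV) system is a technology that converts "
                "solar radiation into electric current."),
    phrase_level=[
        PhraseZone(
            "NOUN PHRASE w/ PRE-MODIFIER",
            ["grid-connected PV system", "stand-alone PV system", "solar PV system"],
            [Example("Without batteries, a grid-connected PV system will shut "
                     "down when a utility power outage occurs.", "Bhatia", "Course")],
        ),
        PhraseZone("VERB PHRASE", ["to install a PV system", "to design a PV system"]),
    ],
    discourse_level=[
        Example("Solar PV systems require minimal maintenance, as they do not "
                "usually have moving parts. However, routine maintenance is "
                "required to ensure the solar PV system will continue to "
                "perform properly.", "eca", "Handbook"),
    ],
)
```

Keeping discourse-level examples apart from phrase-level zones mirrors the two levels of the microstructure and leaves the number of zones and examples per entry open.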
The presented microstructural model can be varied in many ways, also depending on the mode of publication. Nevertheless, it introduces elements that are essential for the description of the possible context of the specialised collocations of a certain domain, such as
the subdivision of specialised collocations not on the basis of each individual syntactic structure, but of classes of contexts valid for both SSCs and CSCs;
the possibility of expanding collocations on the basis of the concrete behaviour of the terms in the corpus, without imposing a predefined scope.
5 Conclusions
This paper has focused on the role of specialised phraseology, in particular collocations, in determining a significant part of the contexts in which domain terminology is used. It contributes to corpus-based research on terminology and phraseology by providing new insights into the formation and behaviour of n-ary collocations in technical texts. Different contexts of simple and complex specialised collocations have been described. It is precisely the complex collocations that turn out to be extremely interesting from this point of view, since they are formed in accordance with different contexts. The possible contexts range from the area of the phrase to that of the sentence (around a predicate) until they cross the border of the sentence to develop in the text discourse.
A notion of collocation context has been directly inferred from the analysis of corpus data: it is the portion of the text in which the components of a specialised collocation occur while still being perceived as a phraseological unit. The context varies both with the manner of collocation expansion from a simple to a complex collocation, and with the anaphoric resumption of collocation constituents from one sentence to subsequent sentences.
Apart from single phrases, in which collocations occur and expand in virtue of syntactic-semantic restrictions and typicality, various factors seem to intervene in the distribution of collocations above the sentence level, for example textual, communicative and pragmatic factors, such as the structural coding conventions of a textual genre already mentioned in Section 3, or possibly further functional or discursive causes (cf. terminology employed by Freixa (2013) for describing causes of term variation).
The limitations of the analysis carried out lie in the restricted possibilities of detecting and thus automatically extracting complex collocations as well as identifying remote components of collocations beyond the level of the individual sentence. In future work, new possibilities for extracting terms related to discourse analysis (Widdowson 2008; Brezina 2018b; Loureda et al. 2019) could be explored, including the contextual role of genuinely pragmatic aspects. From a strictly computational point of view, the application of existing methodologies for collocation identification, such as finite state transducers associated with metagraphs (Tutin 2017), as well as the analysis of word and sentence embeddings (cf., among others, Goldberg 2017, Reimers and Gurevych 2019), might complement the current method by laying the ground for quantitative analysis.
Further experiments in data collection and processing should be carried out in new specialist areas to assess the general applicability of the model. Likewise, the ordering strategies for lexicographic data should be further investigated, varying the structure presented in this contribution according to the specific dictionary function and ideal user group, but also taking into account the possibility of covering context information concerning bilingual or multilingual data.
References
Adamzik, Kirsten. 2014. Textlinguistik: eine einführende Darstellung. Berlin: De Gruyter.
Baker, Paul. 2016. The shapes of collocation. International Journal of Corpus Linguistics 21(2). 139–164. https://doi.org/10.1075/ijcl.21.2.01bak
Bartsch, Sabine. 2004. Structural and Functional Properties of Collocations in English. Tübingen: Narr.
Brezina, Vaclav. 2018a. Collocation graphs and networks: Selected applications. In Pascual Cantos-Gómez & Moisés Almela-Sánchez (eds.), Lexical collocation analysis (Quantitative Methods in the Humanities and Social Sciences), 59–83. Cham: Springer. https://doi.org/10.1007/978-3-319-92582-0_4
Brezina, Vaclav. 2018b. Statistical choices in corpus-based discourse analysis. In Charlotte Taylor & Anna Marchi (eds.), Corpus Approaches to Discourse, 259–280. London & New York: Routledge. https://doi.org/10.4324/9781315179346-12
Brezina, Vaclav, Pierre Weill-Tessier & Anthony McEnery. 2020. #LancsBox v.5.x. [software]. http://corpora.lancs.ac.uk/lancsbox/
Burger, Harald. 2015. Phraseologie: Eine Einführung am Beispiel des Deutschen (5., neu bearbeitete Auflage). Berlin: Schmidt.
Caro Cedillo, Ana. 2004. Fachsprachliche Kollokationen: Ein übersetzungsorientiertes Datenbankmodell Deutsch-Spanisch. Tübingen: Narr.
Corpas Pastor, Gloria & Jean-Pierre Colson. 2020. Introduction. In Gloria Corpas Pastor & Jean-Pierre Colson (eds.), Computational Phraseology (IVITRA Research in Linguistics and Literature, 24), 1–8. Amsterdam & Philadelphia: Benjamins. https://doi.org/10.1075/ivitra.24.00pas
De Beaugrande, Robert-Alain & Wolfgang U. Dressler. 1981. Einführung in die Textlinguistik. Tübingen: Niemeyer. https://doi.org/10.1515/9783111349305
Daille, Beatrice. 2017. Term Variation in Specialised Corpora: Characterisation, automatic discovery and applications. Amsterdam & Philadelphia: Benjamins. https://doi.org/10.1075/tlrp.19
Evert, Stefan. 2009. Corpora and collocations. In Anke Lüdeling & Merja Kytö (eds.), Corpus Linguistics. An International Handbook (Volume 2), 1212–1248. Berlin & New York: De Gruyter Mouton. https://doi.org/10.1515/9783110213881.2.1212
Freixa, Judit. 2013. Otra vez sobre las causas de la variación denominativa. Debate Terminológico 09. 38–46.
Giacomini, Laura. 2021. Phraseology in technical texts: A frame-based approach to multiword term analysis and extraction. In Carmen Mellado Blanco (ed.), Productive Patterns in Phraseology and Construction Grammar: A Multilingual Approach, 215–234. Berlin & Boston: De Gruyter. https://doi.org/10.1515/9783110520569-009
Giacomini, Laura, Paolo DiMuccio-Failla & Eva Lanzi. 2021. The interaction of argument structures and complex collocations: role and challenges in learner’s lexicography. In Proceedings of the EURALEX XIX International Conference, Alexandroupoli, 7–9 September, 285–293.
Gläser, Rosemarie. 2007. Fachphraseologie. In Harald Burger, Gerold Ungeheuer & Herbert Ernst Wiegand (eds.), Handbücher zur Sprach- und Kommunikationswissenschaft / Handbooks of Linguistics and Communication Science (HSK, Vol. 1), 482–505. Berlin: De Gruyter.
Goldberg, Yoav. 2017. Neural Network Methods in Natural Language Processing. Synthesis Lectures on Human Language Technologies (April 2017). San Rafael, CA: Morgan & Claypool Publishers. https://doi.org/10.1007/978-3-031-02165-7
Goodwin, Charles & Alessandro Duranti. 1992. Rethinking context: an introduction. In Alessandro Duranti & Charles Goodwin (eds.), Rethinking context: Language as an interactive phenomenon, 1–42. Cambridge: Cambridge University Press.
Gouadec, Daniel. 1994. Nature et traitement des entités phraséologiques. In Terminologie et phraséologie: acteurs et amenageurs; actes de la deuxième Université d’Automne en Terminologie, Rennes 2, Septembre 1993, 167–193.
Gouws, Rufus H. 2015. The presentation and treatment of collocations as secondary guiding elements in dictionaries. Lexikos 25. 170–190. https://doi.org/10.5788/25-1-1294
Heid, Ulrich. 1994. On ways words work together – research topics in lexical combinatorics. In Proceedings of the VI EURALEX International Congress, Amsterdam, 30 August–3 September, 226–257.
Kilgarriff, Adam, Vit Baisa, Jan Bušta, Miloš Jakubíček, Vojtech Kovář, Jan Michelfeit, Pavel Rychlý & Vit Suchomel. 2014. The sketch engine: Ten years on. Lexicography 1(1). 7–36. https://doi.org/10.1007/s40607-014-0009-9
L’Homme, Marie-Claude & Daphnée Azoulay. 2020. Collecting collocations from general and specialised corpora: A comparative analysis. In Gloria Corpas Pastor & Jean-Pierre Colson (eds.), Computational Phraseology (IVITRA Research in Linguistics and Literature, 24), 151–176. Amsterdam & Philadelphia: Benjamins. https://doi.org/10.1075/ivitra.24.08lho
Loureda, Óscar, Inés Recio Fernández, Laura Nadal & Adriana Cruz (eds.). 2019. Empirical studies of the construction of discourse. Amsterdam & Philadelphia: Benjamins. https://doi.org/10.1075/pbns.305
Lyons, John. 1995. Text and discourse; context and co-text. In John Lyons (ed.), Linguistic Semantics: An Introduction, 258–292. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511810213.010
Lyons, John. 1977. Semantics. Cambridge: Cambridge University Press.
McIntosh, Angus. 1966. Patterns and ranges. Language 37. 325–337. https://doi.org/10.2307/411075
Reimers, Nils & Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 3982–3992. Hong Kong, China, November 3–7, 2019. https://doi.org/10.18653/v1/D19-1410
Seretan, Violeta. 2013. A multilingual integrated framework for processing lexical collocations. In Adam Przepiórkowski (ed.), Computational Linguistics – Applications, 87–108. Heidelberg & New York: Springer. https://doi.org/10.1007/978-3-642-34399-5_5
Sinclair, John McH. & Ronald Carter (eds.). 2004. Trust the text: Language, corpus and discourse. London & New York: Routledge. https://doi.org/10.4324/9780203594070
Tutin, Agnès. 2017. Annotating lexical functions in corpora: Showing collocations in context. In Proceedings of the Second International Conference on the Meaning-Text Model, 498–510. Moscow: Slavic Culture Languages Publishing House.
Tutin, Agnès & Olivier Kraif. 2016. From binary collocations to grammatically extended collocations: Some insights in the semantic field of emotions in French. In Collocations Cross-Linguistically. Corpora, Dictionaries and Language Teaching (Mémoires de la Société néophilologique de Helsinki), 245–266. Helsinki: Société néophilologique de Helsinki.
Wahl, Alexander & Stefan Th. Gries. 2018. Multi-word Expressions: A Novel Computational Approach to Their Bottom-Up Statistical Extraction. In Pascual Cantos-Gómez & Moisés Almela-Sánchez (eds.), Lexical collocation analysis (Quantitative Methods in the Humanities and Social Sciences), 85–110. Cham: Springer. https://doi.org/10.1007/978-3-319-92582-0_5
Widdowson, Henry George. 2008. Text, context, pretext: Critical issues in discourse analysis. New York: John Wiley & Sons.
© 2022 Walter de Gruyter GmbH, Berlin/Boston
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.