Abstract
This article presents a framework for modeling the human mind during translation, based on empirical data. It argues that three interrelated layers of mental processing operate simultaneously and can be detected in behavioral data: routinized/automated sequences reflected in fluent translation behavior; cognitive/reflective processes indicated by extended keystroke pauses; and affective/emotional states, which manifest in distinctive typing and eye-tracking patterns. Drawing on data from the CRITT Translation Process Research Database (TPR-DB), the article demonstrates how the temporal dynamics of keystrokes and gaze behavior can be linked to these underlying mental strata. The proposed embedded generative model is situated within broader theoretical contexts, including dual-process theories and Robinson’s (2023) ideosomatic theory of translation. In doing so, the article offers a novel empirical foundation for advancing theoretical developments in Cognitive Translation Studies.
1 Introduction
This article introduces a novel generative framework for modeling the temporal dynamics of translation behavior. It posits that translation emerges from the interaction of three embedded layers of mental processing: (A) affective/emotional states, (B) behavioral, automated translation routines, and (C) cognitive, reflective thought. This ABC model of the translating mind suggests that these embedded processes can be inferred from observable behavioral data – primarily keystrokes and gaze patterns – captured through keylogging and eye-tracking technologies. Such data have long been foundational in Translation Process Research (TPR) and Cognitive Translation and Interpreting Studies (CTIS) for analyzing cognitive effort, translator expertise, and the behavioral effects of factors such as text difficulty and translation quality.
Among the methods used in TPR, pause analysis – the measurement of inter-keystroke intervals (IKIs) – has served as a proxy for translation effort (e.g., Carl, Schaeffer, and Bangalore 2016; Lacruz and Shreve 2014; Vieira 2017). However, defining and interpreting pause lengths remains contested (Couto-Vale 2017; Kumpulainen 2015). In this article, I propose a reinterpretation of pause analysis through the lens of the ABC model by integrating two conceptual tools: the Task Segment Framework (TSF; Muñoz and Apfelthaler 2022) and the HOF taxonomy (Carl et al. 2024).
The TSF, introduced by Muñoz and Apfelthaler (2022), distinguishes between Tasks – automated/routinized actions such as brief keystroke bursts – and Task Segments (TSs), which are longer, intentional production units composed of a sequence of one or more Tasks. These Task Segments are separated by Task Segment Pauses (TSPs), whose duration varies by translator and is thought to reflect moments of increased cognitive demand.[1]
Complementing this, Carl et al. (2024) proposed the HOF taxonomy, which categorizes translation processes into three experiential states: Hesitation (linked to uncertainty and elevated cognitive effort), Orientation (characterized by attentional shifts and evaluative engagement), and Flow (a state of focused, effortless production often accompanied by positive affect). These affective states modulate different phases of attention and readiness for action, and are seen as responses to dynamic cognitive demands.
Both the TSF and the HOF taxonomy share a generative perspective: that internal translation processes – whether automated, reflective, or affective – leave discernible traces in behavioral data, such as IKI structures and gaze trajectories. On this view, hidden mental states generate observable actions, and the structure of those actions reflects the underlying processes that gave rise to them. This reciprocal relationship resonates with Bayesian theories of cognition, including Predictive Processing (Seth 2021; Clark 2023) and Active Inference (Friston et al. 2023; Parr et al. 2022; Pezzulo et al. 2024), which conceptualize the human mind as operating on generative models that iteratively approximate and adapt to environmental conditions.
In light of these theoretical developments, Carl (2024) and Carl et al. (in print) propose simulating translation behavior via an artificial Active Inference agent organized into three interconnected ABC processing layers. The A-layer captures affective states and emotional responses; the B-layer reflects automated, routinized behavior; and the C-layer governs reflective and deliberate cognitive processes. Within this model, the TSF can be used to differentiate automated behavioral routines (B) from cognitive reflective behavior (C) based on pause structure, while the HOF taxonomy provides insight into the emotional states accompanying translation activity (A). Together, they support a multi-layered, embedded architecture in which translation processes unfold simultaneously across multiple cognitive-affective dimensions and timelines.
This article elaborates on that architecture by examining the relationships between Tasks, Task Segments, and HOF states using a large corpus of translation process data. Section 2 situates the proposed framework in relation to prior translation theories, highlighting its novel contribution. Section 3 outlines the TSF and HOF taxonomy and illustrates their segmentation logic using annotated progression graphs. Section 4 presents an empirical analysis of IKI distributions across 100 from-scratch translation sessions and explores the pausing patterns that define TSF segments. Section 5 offers conclusions in the light of Robinson’s (2023) ideosomatic theory, underscoring the integrative value of this model for future research in translation cognition.
2 Models of the Translation Process
Over the past 60 years, numerous approaches have emerged to explain and model translation processes in both humans and machines (MT). Despite fundamental differences, there are notable overlaps between these two approaches. This section provides a very brief review and situates the current proposal within this historical context.
Rule-based MT (RBMT), dominant until the early 1990s, modeled translation as a sequence of linguistic analyses: analysis, transfer, and generation. In Cognitive Translation and Interpreting Studies (CTIS), this mirrors the comprehension-transfer-production model (Angelone 2010), involving problem recognition, solution proposal, and evaluation. RBMT emphasized source text (ST) disambiguation, assuming transfer and target language generation to be relatively straightforward (Hutchins and Somers 1992).
The emergence of statistical MT (SMT) in the 1990s marked a shift. SMT built probabilistic models capable of generating many potential translations and selecting the most likely one using advanced decoding techniques. A similar shift occurred in Translation Studies, moving from equivalence-based linguistic theories (e.g. Nida 1964; Catford 1965) to Functionalism and Skopos Theory, which emphasized target audience needs and communicative purpose (Nord 2006; Pym 2003).
From the mid-1980s, cognitive theories were increasingly used in translation studies, focusing on the process of translation. Gile’s Effort Model (Gile 1985, 2009) highlights cognitive load and resource allocation in interpreting. Think-Aloud Protocols (TAPs) and later keystroke logging (Jakobsen 1999) offered empirical insights into mental operations, revisions, and hesitation patterns. Eye-tracking, introduced around 2005 in TPR, further revealed attention distribution and cognitive effort (Alves and Albir 2025).
Cognitive translation research has drawn from bilingualism and cognitive psychology. Theories like the Revised Hierarchical Model (Kroll and Stewart 1994) and the Bilingual Interactive Activation Model (Dijkstra and Van Heuven 2002) argue for non-selective language activation – bilinguals access all known languages simultaneously, a trait shared by multilingual large language models (LLMs), which also handle code-switching with ease.
Translation Process Research (TPR) has adopted dual-process theories, distinguishing between fast, intuitive (Type 1) and slow, deliberate (Type 2) processes (Kahneman 2011). Königs (1987), Hönig (1991), Lörscher (1991), Schaeffer and Carl (2013), Carl and Schaeffer (2017, 2019) and others made distinctions between automatized and strategic translation behaviors under a variety of different labels (see Alves and Albir 2025 for a recent overview).
The default-interventionist model of reasoning and decision-making (Evans and Stanovich 2013) posits that intuitive processing is the default procedure, while intervention only takes place when difficulty or novelty demands attention. This idea is echoed across translation scholarship (e.g. Carl and Dragsted 2012; Dimitrova 2005; Tirkkonen-Condit 2005) and in the Monitor Model (Schaeffer and Carl 2013), which assumes that automated routines are overseen by higher-level monitoring processes.
Robinson (2023), building on Peirce (1992), introduces the concept of the “emotional interpretant,” arguing that emotional awareness precedes conscious reasoning. Emotions influence cognitive load, attention, and decision-making (Hubscher-Davidson 2017), and shape how translators interpret and resolve ambiguity. Because emotional and cognitive states surface in observable behaviors – such as keystrokes or gaze – they may offer clues into translators’ internal states. The next sections explore taxonomies to identify automated, reflective, and affective mental processes in translation process data.

A progression graph of a small snippet of the translation session (BML12/P03 T5). The graph represents a segment of approximately 40 s (116,000–156,000 ms) of an English-to-Spanish translation. The vertical axis plots the ST on the left side and the TT on the right side. ST and TT words are aligned on a word or phrase level. A single ST word that maps into a TT phrase is marked as a multi-word unit on the right vertical axis, such as “Increasing → Una_mayor”. An ST multi-word phrase that is translated into a single word appears as TT repetitions, such as “different from their own → otras otras otras otras”. Blue dots and green diamonds indicate eye movements on the ST and TT respectively. Black and red characters are insertions and deletions respectively. Activity Units (AUs) fragment behavioral translation data into six categories, as marked in colored boxes at the bottom of the graph. The type (color) of the AU determines whether the translator is involved in reading the ST or TT, translation production, or simultaneously reading and writing (see Table 1). TSs are marked as gray boxes at the top of the graph. The rectangular box in the middle of the graph (around time 136,000–140,000) is reproduced in Figure 2 with a finer-grained segmentation of the TS into six (sub) Tasks.

This progression graph shows a segment of approximately 4 s (136,000–140,000 ms) in which the English segment “in the increasing” is translated into Spanish “un augmento en la”. The duration of a TS is indicated as a gray bar at the top, which is preceded and followed by a TSP (violet boxes). The TS consists of two successive Tasks, which are separated by an RSP. Each Task contains one or more insertion and/or deletion keystroke(s). In Task 1 the translator produces “un a yum”. The last three letters “yum” are presumably a typo, because these letters are subsequently deleted in Task 2 (the deletion “muy” is the inverse order of “yum”). Task 2 then shows the correct continuation “ugmento” (to produce “augmento”) and the production of “en la”. As indicated by the “Fixation Units”, i.e., the blue and green striped boxes at the top, the eyes of the translator fixate on several ST words at the beginning and the end of this segment (blue boxes are ST fixations), and move to the target window during the correction of the typo (green boxes are TT fixations). The sequence of colored AUs at the bottom of the graph indicates the coordination of reading and writing behavior (see Table 1). Note that Task 2 consists of two AUs of Type 6 and 5, which involve concurrent writing and reading of the TT and ST, respectively.
AU type | Reading/Writing activity | AU color in Figures 1 and 2 |
---|---|---|
T1 | ST reading | Blue |
T2 | TT reading | Light green |
T4 | TT production | Yellow (no occurrence in the graphs) |
T5 | ST reading with concurrent production | Red |
T6 | TT reading with concurrent production | Dark green |
T8 | No observed behavioral data for more than 1 s | Black (no occurrence in the graphs) |
3 Fragmenting Behavioral Translation Data
The temporal structure of translation has been central to Translation Process Research (TPR) since the 1980s (e.g., Königs 1987, see also Alves and Albir 2025). With the advent of keylogging tools like Translog (Jakobsen and Schou 1999) and InputLog (Leijten and Van Waes 2013), TPR has become increasingly technology-driven. With these technologies, it can be observed how translators alternate between rapid, automatic processing and more deliberate, reflective behavior. Inter-keystroke intervals (IKIs) can be measured to offer insights into the hidden cognitive states, and when combined with eye-tracking data (Carl 2012; Carl, Schaeffer, and Bangalore 2016; Hvelplund 2016), they reveal not only how translations are produced, but also what information translators take in.
This section discusses segmentation methods of behavioral translation data that align with the three ABC layers of the translating mind. The HOF taxonomy, mainly based on gaze data, identifies affective states (A), while the TSF defines IKI thresholds to distinguish routinized processes (B) from reflective, cognitive processes (C).
3.1 The HOF Taxonomy
Carl et al. (2024) introduced the HOF taxonomy, identifying three emotional states during translation:
Hesitation (H): Triggered by unexpected challenges, leading to pauses, re-reading, or revisions – indicating cognitive uncertainty or conflict.
Orientation (O): Marked by prolonged ST reading, reflecting the translator’s attempt to understand the text.
Flow (F): Characterized by fluent, uninterrupted production, minimal reading, and short pauses, signifying full cognitive immersion.
Figure 1 shows a sequence of HOF states (OHFOF), each composed of one or more Activity Units (AUs). AUs are fine-grained segments of behavior (Hvelplund 2016; Schaeffer et al. 2016) characterizing eye-hand coordination. AUs are categorized by reading/writing behavior and visual ST/TT focus (see Table 1). For example, the Orientation state (O) may involve a single ST reading AU, while Hesitation (H) can involve multiple AUs alternating between reading and typing.
3.2 The Task Segment Framework (TSF)
Muñoz and Apfelthaler (2022) classify IKIs into:
Delays (<200 ms): Indicate boundaries of motor programs – basic, automated sequences of keypresses.
Respites (RSPs): Unintentional pauses separating Tasks. Defined as 2 × median within-word IKIs.
Task Segment Pauses (TSPs): Intentional breaks between Task Segments (TS). Defined as 3 × median between-word IKIs.
Superpauses: Rare, prolonged breaks, corresponding to Orientation states in the HOF taxonomy.
Figure 2 zooms into a Flow state, showing a Task Segment (TS) with two Tasks separated by an RSP.
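The segmentation logic of the TSF can be sketched in a few lines of code. The following is a minimal illustration, assuming that the translator-specific RSP and TSP thresholds are already known and that a pause counts as an RSP or TSP when it is at least as long as the respective threshold. The function name `segment_keystrokes`, the input format (keystroke timestamps in ms), and the example values are hypothetical and are not taken from the TPR-DB tooling; Delays and Superpauses are ignored for brevity.

```python
from typing import List


def segment_keystrokes(times_ms: List[int], rsp: float, tsp: float) -> List[List[List[int]]]:
    """Group keystroke timestamps into Task Segments (outer lists) and Tasks (inner lists).

    Assumption: an IKI >= tsp opens a new Task Segment, and an IKI >= rsp
    (but < tsp) opens a new Task inside the current Task Segment.
    """
    segments: List[List[List[int]]] = []
    for i, t in enumerate(times_ms):
        iki = t - times_ms[i - 1] if i > 0 else None
        if iki is None or iki >= tsp:
            segments.append([[t]])       # new Task Segment with its first Task
        elif iki >= rsp:
            segments[-1].append([t])     # new Task within the current Task Segment
        else:
            segments[-1][-1].append(t)   # keystroke continues the current Task
    return segments


# Hypothetical timestamps (ms) and thresholds, for illustration only.
times = [0, 120, 250, 700, 820, 2500, 2600]
result = segment_keystrokes(times, rsp=300, tsp=900)
```

With these made-up values, the first Task Segment contains two Tasks separated by one RSP (the 450 ms pause), and the 1,680 ms pause opens a second Task Segment.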
The next section explores properties of Tasks, TSs, and HOF states based on large-scale behavioral data.
4 An Empirical Investigation
This section presents an empirical analysis of the TSF and the HOF taxonomy based on a fragment of the CRITT TPR-DB (Carl, Schaeffer, and Bangalore 2016). It is partially a summary of data discussed in Carl (2024) and Carl et al. (in print), presented here from a different perspective.
4.1 The Empirical Data
The data is part of the CRITT TPR-DB,[2] which is available under a CC BY-NC-SA license and hosted on SourceForge with full documentation.[3] The CRITT TPR-DB includes over 5,000 sessions and 600+ hours of text production from written translation, authoring, and spoken modes. Many sessions include both keystroke and gaze data.
The database processes raw logging information into 11 summary tables per session, with a total of 300+ features. This study focuses on the KD tables, which detail information about keystrokes, their timing, text alignment, and more. The data is part of the MultiLing sub-corpus – six short English source texts (totaling 847 words) in various genres, translated into several languages and under different modes (e.g., from-scratch, post-editing, sight translation).
This study uses only from-scratch translations from English into Arabic and from English into Spanish, logged with Translog-II. Table 2 summarizes session statistics, including the number of keystrokes, session durations, and IKI metrics. Spanish translators were faster (493 ms mean IKI) than Arabic translators (844 ms). Arabic data was collected from 22 Arabic PhD students at Kent State University (Almazroei 2025); Spanish data came from native students, collected in 2012 (Mesa-Lao 2014), and has been widely used since.
Properties of the empirical data used in this study. The study name (e.g., AR20) is the internal name in the CRITT TPR-DB and has no further meaning in this article.
Study name | AR20 | BML12 |
---|---|---|
Target language | ar | es |
#Keystrokes | 37,171 | 73,619 |
Total duration (h) | 8.72 | 10.10 |
#Sessions | 40 | 60 |
#Translators | 22 | 32 |
Mean IKI, in ms (log ms) | 844 (6.7) | 493 (6.2) |
Median IKI, in ms (log ms) | 265 (5.6) | 156 (5.0) |
4.2 Pauses and Segments
It is generally assumed that translators mentally chunk the ST into portions that they can keep in memory and translate as coherent typing segments (Malmkjær 1998). This mental chunking is reflected in the structure of IKIs. As pointed out in Section 3.2, Muñoz and Apfelthaler (2022) define two IKI thresholds that are considered here. RSPs and TSPs depend on the median within-word IKI (WP) and the median between-word IKI (BP) of individual translators, respectively. The distinction between WP (within-word inter-keystroke pauses) and BP (between-word inter-keystroke pauses) is well-established in TPR. It has been shown that WPs are shorter than BPs (e.g. Kumpulainen 2015), indicating different cognitive/mental activities at the boundaries of different levels of linguistic production. These translator-specific IKI thresholds may thus help to separate those distinct mental processes.
4.3 Within-word and Between-word IKIs
In the keystroke logging data, every recorded keystroke is associated with a timestamp (Carl et al. 2016). In this study, we define a word boundary to occur with any of the following keystrokes:
Word-boundary keystrokes: ` “’_.!?:=@$%&*()[]{}, where blank spaces are mapped into an underscore ‘_’.
A keystroke is classified as within-word if it is not a word-boundary keystroke and it is neither preceded nor followed by a word-boundary keystroke. A word-initial keystroke is the first (non-word-boundary) keystroke of a new word. Every IKI can then be classified according to whether it occurs within a word or precedes a word-initial keystroke: the pauses before these keystrokes are WPs (within-word IKIs) or BPs (between-word IKIs), respectively. A within-word IKI (WP) is preceded by a within-word keystroke, while the IKI preceding a word-initial keystroke is defined to be the between-word IKI (BP).[4]
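The classification of IKIs into WPs and BPs can be illustrated with a short sketch. It is a simplification under stated assumptions: the boundary set below is a stand-in for the list given above, blank spaces are assumed to be already mapped to `_`, deletions are not treated separately, and pauses that precede a word-boundary keystroke are simply left unclassified. The function name `classify_ikis` and the input format are hypothetical.

```python
from typing import Dict, List, Tuple

# Simplified stand-in for the word-boundary keystrokes listed in the text;
# blank spaces are assumed to be already mapped to '_'.
BOUNDARY = set("_.!?:=@$%&*()[]{},`'\"")


def classify_ikis(keys: List[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Split IKIs into within-word (WP) and between-word (BP) pauses.

    keys: (timestamp_ms, character) pairs in typing order.
    """
    out: Dict[str, List[int]] = {"WP": [], "BP": []}
    for (t_prev, c_prev), (t_cur, c_cur) in zip(keys, keys[1:]):
        if c_cur in BOUNDARY:
            continue                   # pause before a boundary keystroke: left unclassified
        iki = t_cur - t_prev
        if c_prev in BOUNDARY:
            out["BP"].append(iki)      # pause before a word-initial keystroke
        else:
            out["WP"].append(iki)      # pause between two non-boundary keystrokes
    return out


# Hypothetical keystrokes for typing "un_au".
keys = [(0, "u"), (110, "n"), (300, "_"), (900, "a"), (1020, "u")]
pauses = classify_ikis(keys)
```

In this made-up example the 600 ms pause before the word-initial "a" is the only BP, while the pauses of 110 ms and 120 ms are WPs.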
4.4 Respites and Task Segment Pauses
As translators have different typing skills and translation styles, different WPs and BPs can be expected for every translator. Since the IKI distributions are heavily right-skewed, Muñoz and Apfelthaler (2022) suggest computing a median value, rather than a mean, as a basis for translator-relative pausing values. Following Muñoz and Apfelthaler, we define RSP_i and TSP_i for each translator i separately, using the multipliers given in Section 3.2:
$$\mathrm{RSP}_i = 2 \times \mathrm{median}(\mathrm{WP}_i), \qquad \mathrm{TSP}_i = 3 \times \mathrm{median}(\mathrm{BP}_i)$$
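A minimal sketch of this translator-relative computation, using the multipliers from Section 3.2 (two times the median WP for the RSP, three times the median BP for the TSP), is given here. The function name `pause_thresholds` and the sample values are hypothetical and serve only to show why the median is preferred for right-skewed pause data.

```python
from statistics import median
from typing import List, Tuple


def pause_thresholds(wp: List[float], bp: List[float]) -> Tuple[float, float]:
    """Return (RSP, TSP) for one translator: 2 x median WP and 3 x median BP."""
    return 2 * median(wp), 3 * median(bp)


# Hypothetical pause samples (ms) for a single translator.
wp_sample = [90, 110, 130, 150, 900]     # right-skewed: one long outlier
bp_sample = [180, 220, 260, 300, 4000]
rsp_i, tsp_i = pause_thresholds(wp_sample, bp_sample)
```

In these made-up samples the median WP is 130 ms, whereas the mean would be 276 ms, pulled upwards by a single long pause; thresholds based on the median are therefore much less sensitive to rare outliers.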
Table 3 shows the summary information for RSPs and TSPs for the 32 Spanish and the 22 Arabic translators. As already discussed above, Table 3 shows that the various inter-keystroke pausing values for Arabic translators are much higher (almost twice) than those for Spanish.
RSP and TSP values (in ms) for Arabic and Spanish data.
Language | Pause | Min | Max | Mean | Median |
---|---|---|---|---|---|
ar | RSP | 312 | 1,032 | 563 | 546 |
ar | TSP | 795 | 2,388 | 1,288 | 1,077 |
es | RSP | 220 | 470 | 301 | 281 |
es | TSP | 423 | 1,686 | 697 | 609 |
The minimum RSP duration in our data is 220 ms in the Spanish data, just above the assumed value for a Delay (200 ms, see Section 3.2), while the maximum RSP duration is 1,032 ms (in the Arabic data). The minimum TSP duration is 423 ms and the maximum is 2,388 ms. Note that this maximum is still below the assumed third peak in the IKI distribution, which ought to be around 2,697 ms (Muñoz and Apfelthaler 2022).
4.5 Relating RSPs and TSPs
Figure 3 shows the distribution of RSPs on the left and TSPs on the right for 32 Spanish and 22 Arabic translators, respectively. As is the case for all IKIs, RSPs and TSPs also show larger variability, i.e., a flatter distribution, for Arabic than for Spanish. However, for every translator, the RSP is shorter than the TSP.

Distribution of RSPs (left) and TSPs (right) for Spanish and Arabic translators.
Figure 4 shows that RSPs tend to correlate with TSPs. This correlation is significant for Spanish (Spearman τ:0.68, p < 0.0001), while it is not significant for Arabic (Spearman τ:0.40 p:0.065).

Correlation of RSPs and TSPs.
Interestingly, as plotted in Figure 5, there is a strong correlation between the number of Tasks (as delimited by RSPs) within a TS and the number of keystrokes produced in that TS (τ:0.74 and τ:0.73 for Arabic and Spanish respectively, p < 0.001 in both cases). While, on average, Arabic and Spanish translators engage in the same number of Tasks per TS (2.2), Arabic translators show a larger variation (between min:1.2 and max:3.9, median:2.1) than Spanish translators (between min:2.1 and max:3.4, median 1.94).

Correlation of total number of keystrokes per task segment and number of subtasks.
Spanish translators also produce more keystrokes per TS than Arabic translators do. A Spanish TS contains between 8 and 18 keystrokes (mean 11.2) while an Arabic TS has between 5 and 16 keystrokes (mean 9.4). A Spanish Task has between 3.9 and 7.4 keystrokes (mean 5.3) while an Arabic Task has between 2.7 and 6.0 keystrokes (mean 4.3).
We also observe a slightly negative effect of TS length on the number of keystrokes produced per Task: As the number of Tasks per TS increases, the number of keystrokes per Task decreases. This effect is significant for Spanish (τ:−0.52, p:0.002) but not for Arabic (τ:−0.18, p:0.41), which may have to do with the larger variability in the data and the smaller number of observations for our Arabic data set.
4.6 Types of Tasks
Following Muñoz and Apfelthaler (2022), we distinguish between three types of Tasks that involve different types of keystrokes:[5] an insertion Task, A, has only insertion keystrokes (corresponds to Muñoz and Apfelthaler’s ADD), a deletion Task, D (not considered in Muñoz and Apfelthaler) has only deletions, and a change Task, C, has insertions and deletions (corresponds to Muñoz and Apfelthaler’s CHANGE). We omit the SEARCH Task, since in our translation sessions we have no external research.[6]
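The three Task types can be expressed as a simple labeling rule. The sketch below assumes that each keystroke of a Task has already been coded as an insertion or a deletion; the coding strings `"ins"` and `"del"`, the function name `task_label`, and the keystroke counts in the example are hypothetical.

```python
from typing import List


def task_label(keystroke_types: List[str]) -> str:
    """Label a Task as 'A' (insertions only), 'D' (deletions only) or 'C' (both).

    keystroke_types: one entry per keystroke, 'ins' or 'del' (hypothetical coding).
    """
    kinds = set(keystroke_types)
    if kinds == {"ins"}:
        return "A"
    if kinds == {"del"}:
        return "D"
    if kinds == {"ins", "del"}:
        return "C"
    raise ValueError("a Task needs at least one keystroke coded 'ins' or 'del'")


# The two Tasks of Figure 2, coded by hand (keystroke counts are illustrative):
# Task 1 only inserts text; Task 2 deletes the typo and then inserts the continuation.
task1 = ["ins"] * 8
task2 = ["del"] * 3 + ["ins"] * 13
fig2_labels = [task_label(task1), task_label(task2)]
```

Under this rule, the first Task of Figure 2 is an insertion Task (A) and the second, which mixes deletions and insertions, is a change Task (C).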
Figure 6 shows average duration and keystrokes for the three Tasks, for Spanish and Arabic. The figure shows that there are systematic differences between the Spanish and the Arabic tasks. It can be expected that differences exist also between individual translators, and presumably also for different text types and translation goals. We showed that IKI profiles seem to be typical for specific translators, so that translators can be recognized (to some extent) by their IKI distributions. Figure 6 shows that on average all types of Tasks A, D, and C have more keystrokes for Spanish as compared to Arabic and the average duration is longer for Arabic than for Spanish.

Number of keystrokes (left) and duration (right) for the three types of Arabic and Spanish Tasks. There are more keystrokes and shorter timespans in the Spanish data.
4.7 Types of Task Segments
A TS consists of sequences of Tasks, where each Task has a label (in our current taxonomy one of A, C, or D). We consider the sequence of Task labels realized within a TS to characterize the type of TS. Table 4 gives a summary of the 11 most frequent TS labels, which make up 75 % and 71 % of the Spanish and Arabic data respectively.
The 11 most frequent types of TS and their percentages for Spanish and Arabic. The column #Occur. shows the total number of occurrences of each TS label, and %Spanish and %Arabic show its proportion in the two languages. Dur. of TS gives the average duration of the TS in ms. The table also shows the average IKI (in ms) and the average number of keystrokes per Task (Key/Task).
TS label | #Occur. | %Spanish | %Arabic | Dur. of TS (ms) | Average IKI (ms) | Key/Task |
---|---|---|---|---|---|---|
A | 3,870 | 37.95 | 36.46 | 921 | 173 | 5.33 |
AA | 1,398 | 13.94 | 12.81 | 2,167 | 211 | 5.13 |
D | 753 | 7.71 | 6.59 | 504 | 121 | 4.16 |
AAA | 543 | 5.49 | 4.86 | 2,740 | 190 | 4.81 |
AAAA | 263 | 2.72 | 2.25 | 4,365 | 233 | 4.67 |
DA | 194 | 2.15 | 1.44 | 1,593 | 196 | 4.07 |
AD | 164 | 1.35 | 1.96 | 1,607 | 226 | 3.56 |
C | 164 | 1.27 | 2.08 | 641 | 152 | 4.23 |
DD | 116 | 1.42 | 0.64 | 1,183 | 122 | 4.84 |
AAAAA | 107 | 1.06 | 0.99 | 5,300 | 230 | 4.61 |
CC | 84 | 0.62 | 1.11 | 1,181 | 160 | 3.70 |
There are altogether 10,356 TSs in the Arabic and Spanish data with 892 different TS labels. More than 93 % of these TS labels, that is 833 different labels, occur less than 10 times. They account for 13.8 % of the data (i.e., 1,426 TSs). The 20 most frequent types of TS labels make up 90 % of the data. The mean and median duration for all TSs is 6,777 ms and 5,781 ms respectively and the median, mean, and maximum number of Tasks per TS is 11, 14, and 61 respectively. In contrast, the mean and median durations for the 90 % most frequent TSs are 3,479 ms and 3,183 ms respectively, and the number of Tasks is, on average, 3.0. Thus, most TSs are relatively short, while there is a long tail of very long and diverse TSs.
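The coverage figures reported here follow from a frequency count over TS labels. As a minimal sketch, assuming each TS has already been assigned its label (the concatenation of its Task labels), the share of the data covered by the k most frequent labels could be computed as follows; the function name `top_k_coverage` and the toy label list are hypothetical.

```python
from collections import Counter
from typing import List


def top_k_coverage(ts_labels: List[str], k: int) -> float:
    """Share of all Task Segments covered by the k most frequent TS labels."""
    counts = Counter(ts_labels)
    covered = sum(n for _, n in counts.most_common(k))
    return covered / len(ts_labels)


# Hypothetical TS labels: a few frequent short types and a tail of rare ones.
labels = ["A"] * 6 + ["AA"] * 2 + ["D", "ADCA"]
share = top_k_coverage(labels, k=2)
```

In this toy list the two most frequent labels (A and AA) cover 8 of the 10 Task Segments.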
Table 4 provides labels of the 11 most frequent TSs, the total number of occurrences per TS, the percentage in Spanish and Arabic data, as well as their duration (in ms), the average IKI, and average number of keystrokes per Task. The most frequent TS, 38 % and 36 % of the Spanish and Arabic data, consists of a single Task A. There are on average 5.33 keystrokes for this Task with an average IKI of 173 ms.
As previously mentioned, the average number of keystrokes per Task decreases as the number of Tasks in the TS increases. There are 5.13 keystrokes per Task if the TS consists of two A Tasks, 4.81 keystrokes if the TS has three A Tasks, 4.67 for four Tasks, etc. On the other hand, the IKIs tend to increase as the TSs become longer, which suggests that typing becomes more interrupted.
Note the very strong correlation between frequencies (percentages) of the Spanish and Arabic TS labels (r = 0.998). This indicates that Spanish and Arabic translators in our dataset engage in very similar production processes (i.e., sequences of Tasks) which results in very similar relative number of occurrences. This certainly needs verification in other datasets, but it might show a language and translator independent translation universal.
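For readers who wish to check this kind of figure, the following sketch computes a Pearson correlation over the %Spanish and %Arabic columns of Table 4. It is an assumption that the reported coefficient was computed on these percentages; the article does not state how many labels entered the calculation, so the result for the 11 tabulated rows need not coincide exactly with the reported value. The helper `pearson_r` is written out by hand to keep the example self-contained.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# %Spanish and %Arabic columns of Table 4 (the 11 most frequent TS labels).
spanish = [37.95, 13.94, 7.71, 5.49, 2.72, 2.15, 1.35, 1.27, 1.42, 1.06, 0.62]
arabic = [36.46, 12.81, 6.59, 4.86, 2.25, 1.44, 1.96, 2.08, 0.64, 0.99, 1.11]
r = pearson_r(spanish, arabic)
```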
Olalla-Soler (2023) investigates successive ADD tasks within a TS, separated by RSPs. He defines default translations to be a sequence of fluent typing which consists of one or more ADD tasks that have, among other things, only a few RSPs (fewer than that of 75 % of all TSs). Olalla-Soler (2023) observes that 67.8 % of his TSs were ADD-only. These ADD-only segments contained 69.5 % of the words. Our observations show that slightly more than 60 % of the TSs are A-only and they cover around 44 % of the keystrokes.
4.8 HOF States
This section assesses a subset of the Spanish and Arabic data that was manually annotated with HOF states. It consists of eight Spanish sessions and six Arabic translation sessions, as described in detail in Carl, Sheng, and Al-Ramadan (2024). Table 5 provides an overview of the number of annotated states in the eight Spanish sessions and six Arabic sessions. Despite the different absolute numbers, it is interesting to note that the percentages of H, O and F states are almost identical in the two languages.
Number and Percentages of HOF translation states in the annotated Spanish and Arabic data. There are approximately half the number of states for Arabic for 25 % less annotated data.
Language | O | %O | F | %F | H | %H | Total |
---|---|---|---|---|---|---|---|
es | 183 | 30 % | 284 | 47 % | 139 | 23 % | 606 |
ar | 93 | 32 % | 132 | 45 % | 67 | 23 % | 292 |
Table 6 shows a transition matrix between HOF states for the Arabic and Spanish data. The rows indicate the state at time i from which the transition starts, while the columns indicate the next state at time i+1; each cell gives the corresponding transition probability. As can be seen, the most frequent pattern is a loop over Orientation (O) and Flow (F) states. Only in 16 % and 14 % of the cases for Arabic (ar) and Spanish (es) respectively, is an Orientation state followed by a Hesitation. Both transition matrices are quite similar, with the only obvious exception that Arabic translators transition more often from a Hesitation to a successive Orientation (21 %) while this is much less likely for Spanish translators (9 % of the cases). In both cases, perhaps not surprisingly, the highest chances are that a translator will try to arrive at a Flow state (F).
Transition matrix between HOF states for Arabic (left) and Spanish (right). Rows add up to 100 %.
From \ To | ar: O | ar: F | ar: H | es: O | es: F | es: H |
---|---|---|---|---|---|---|
O | – | 0.84 | 0.16 | – | 0.86 | 0.14 |
F | 0.60 | – | 0.40 | 0.60 | – | 0.40 |
H | 0.21 | 0.79 | – | 0.09 | 0.91 | – |
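A transition matrix of this kind can be estimated by counting successive pairs of states and normalizing each row. The sketch below is a minimal illustration, assuming the annotation of a session is available as a string of H, O, and F symbols in which consecutive identical states have already been merged (so that no self-transitions occur); the function name `transition_matrix` is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict


def transition_matrix(states: str) -> Dict[str, Dict[str, float]]:
    """Estimate first-order transition probabilities from a string of H/O/F states."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for src, dst in zip(states, states[1:]):
        counts[src][dst] += 1            # count each observed transition src -> dst
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()   # normalize so that each row sums to 1
    }


# The state sequence shown in Figure 1.
matrix = transition_matrix("OHFOF")
```

For the short sequence of Figure 1, Orientation is followed once by Hesitation and once by Flow, so both transitions receive a probability of 0.5; a real estimate would pool the transitions of all annotated sessions per language.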
4.9 Distribution of Tasks in H and F States
Table 7 shows the distribution of A, D, and C Tasks in Hesitation (H) and Flow (F) states for the two languages.[7] According to this table, as can be expected, Flow states are clearly dominated by A Tasks while deletions and additions are more equally distributed during Hesitation.
Percentage of Tasks in Flow and Hesitation states for the Spanish and Arabic data. There is clearly a higher proportion of addition (A) Tasks in Flow states but proportionally more deletion (D) Tasks during Hesitation. (Columns add up to 100 %).
Task | ar: H | ar: F | es: H | es: F |
---|---|---|---|---|
A | 0.54 | 0.84 | 0.53 | 0.81 |
D | 0.34 | 0.08 | 0.41 | 0.08 |
C | 0.12 | 0.08 | 0.06 | 0.11 |
Table 8 confirms the assumption that different TS patterns are realized in H and in F states. The table shows the six most frequent TS labels of F and H states in the two languages. This accounts for roughly 75 % of the adjusted TSs. The table shows a very strong correlation (r = 0.993, for the first 20 labels) between the Arabic and Spanish Flow states and between the Arabic and Spanish Hesitation states (r = 0.968). The correlation between Flow and Hesitation states is slightly lower, r = 0.85 and r = 0.76 when changing both language and type of state.
Six most frequent TS labels for Flow and Hesitation states in the Arabic and Spanish data. Note the identical ranking of TS frequencies in the Flow state.
F:ar | F:es | H:ar | H:es |
---|---|---|---|
A | A | A | A |
AA | AA | D | D |
AAA | AAA | AA | C |
AAAA | AAAA | C | AA |
C | C | DD | DA |
D | D | DA | CA |
5 Discussion and Conclusion
This article proposes a hierarchically embedded model for simulating the translating mind, composed of three interacting layers: (A) an affective layer addressing emotional states, (B) a behavioral layer for automated routines, and (C) a cognitive layer simulating reflective thought. The model integrates two taxonomies: the HOF taxonomy, which categorizes affective translation states, and the TSF, which identifies behavioral markers of routine and reflective processing. Using these frameworks, the article analyzes keystroke and eye-tracking data of 100 Arabic and Spanish translation sessions to scrutinize these ABC-layered processes.
Human translation production is structured into sequences of motor programs and Tasks – that is, actions such as adding, modifying, or deleting text. Tasks are separated by short pauses (respites, RSPs), while Task Segments (TSs) are separated by longer Task Segment Pauses (TSPs). Following Muñoz and Apfelthaler (2022), RSPs indicate involuntary breaks (which we locate here on the B-layer), while TSPs reflect conscious, planned pauses (i.e., C-layer units). Tasks and TSs thus signal routine and reflective behavior respectively, which depend on the typing speed of the translator.
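The two-level segmentation described above can be illustrated with a short sketch that splits a keystroke stream into Tasks and Task Segments using two pause thresholds scaled to the translator's median inter-keystroke interval. This is only an illustration under stated assumptions: the factors `rsp_factor` and `tsp_factor` are placeholder values rather than the thresholds used in this article or in the TSF, and treating a change of Task type as a Task boundary is a simplification.

```python
from statistics import median


def segment_keystrokes(events, rsp_factor=2.0, tsp_factor=3.0):
    """Split keystrokes into Task Segments and label each by its Tasks.

    events: list of (iki_ms, task_type) pairs, where iki_ms is the pause
    before the keystroke and task_type is "A", "D", or "C".
    Returns one label per Task Segment, e.g. ["AAD", "A"].

    Assumption: thresholds are multiples of the median IKI, so they adapt
    to the typing speed of the translator; the factors are placeholders.
    """
    if not events:
        return []
    med = median(iki for iki, _ in events)
    rsp, tsp = rsp_factor * med, tsp_factor * med
    segments = []  # labels of finished Task Segments
    current = ""   # Task types of the Task Segment being built
    for i, (iki, kind) in enumerate(events):
        if i > 0 and iki >= tsp:
            # Long pause (TSP): close the segment and start a new one.
            segments.append(current)
            current = kind
        elif i == 0 or iki >= rsp or kind != current[-1]:
            # First keystroke, short pause (RSP), or new type: new Task.
            current += kind
        # Otherwise the keystroke continues the current Task.
    segments.append(current)
    return segments


# Toy keystroke stream (hypothetical values, in milliseconds).
toy_events = [
    (100, "A"), (100, "A"), (100, "A"), (250, "A"),
    (100, "A"), (100, "D"), (400, "A"), (100, "A"),
]
labels = segment_keystrokes(toy_events)  # ["AAD", "A"]
```

In the toy stream the median IKI is 100 ms, so the 250 ms pause opens a new Task within the same segment, while the 400 ms pause closes the segment.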
The study shows that translators exhibit individual but consistent pausing patterns, suggesting personal, recognizable translation styles. While typing/pausing styles differ widely across translators, the structure of HOF states (A-layer units) seems surprisingly consistent across translators.
Thus, although typing speed may vary significantly between Arabic and Spanish translators, the frequency and type of translation actions (e.g., insertions, deletions) are remarkably similar. This suggests that while the “how” of translation (its temporal structure) is highly individual, the “what” (the nature of changes made) remains relatively consistent across languages and translators.
The proposed ABC architecture of the translating mind resonates with Robinson’s (1991, 2023) ideosomatic theory of translation, which views translation as embodied, affective, and cognitive. This ideosomatic theory emphasizes the translator’s bodily and emotional involvement, in addition to cognition and reasoning. The term ideosomatic combines the cognitive (ideo) and the physical (somatic), positing that translators’ decisions are often driven by intuitive, affect-laden reactions to text – e.g., what “feels right” – rather than purely analytic reasoning.
Robinson (2023) draws from Peirce’s (1992) triadic model of interpretants – emotional, energetic, and logical – which closely align with the ABC layers:
Emotional interpretant (A): the immediate, intuitive feeling response that shapes attention and meaning.
Energetic interpretant (B): embodied or motor reactions, expressed in typing and gaze behavior.
Logical interpretant (C): reflective reasoning that regulates and reinterprets emotional input.
Robinson argues that translation involves a “feeling-becoming-thinking” process: smooth translation arises when emotional and motor states align, but reflective thought may be called upon when there is a mismatch. Though feelings and thoughts are internal, Robinson insists they are also transcranial – partly observable through social and sensorimotor behavior, including gaze and typing patterns. The internal processes can thus be traced through behavioral data.
This study builds on Robinson’s ideosomatic framework by analyzing how the sequencing of HOF states, Tasks, and Task Segments reflects these layered processes. While the temporal structure of translation seems to be highly individual, the underlying actions and affective states remain relatively consistent – suggesting that shared cognitive-emotional mechanisms are common across translators. These findings, in combination with the ideosomatic theory, offer a nuanced understanding of how individual styles emerge from shared mental architectures.
Future research may investigate how the mental layers underpinning emotional states like Flow, Hesitation, and Orientation relate to cognitive concepts, such as attention shifts, working memory load, or retrieval processes. A more granular analysis of how different task types (ADD, DELETE, CHANGE) correlate with cognitive demands and emotional responses may also yield valuable insights, especially for training and process optimization. Moreover, fuller integration of eye-tracking data – through analysis of gaze fixations and saccadic patterns – could reveal how visual attention mediates emotional states and task complexity. Finally, translating these findings into practical guidelines for translator training and the development of emotion-sensitive translation tools would increase the model’s real-world impact, enabling more adaptive and cognitively informed approaches to translation practice and pedagogy.
Acknowledgment
I would like to thank the reviewers for their insightful and constructive feedback!
Research funding: This research received no external funding.
Data availability: The data used in this article are freely available and can be downloaded from the CRITT website. The CRITT provides free server access through registration via: https://sites.google.com/site/centretranslationinnovation/tpr-db/getting-started (accessed 22 April 2025). Upon logging into the CRITT server as summer_gst, a Python notebook is available under shared/IKI_analysis.ipynb that contains the Python code.
References
Almazroei, Samar. 2025. “Comparing Self-Revision in Written Translation and Sight Translation with a Deeper Look into Speech Disfluencies.” PhD Thesis, Kent: Kent State University.
Alves, F., and D. Vale. 2009. “Probing the Unit of Translation in Time: Aspects of the Design and Development of a Web Application for Storing, Annotating, and Querying Translation Process Data.” Across Languages and Cultures 10 (2): 251–73. https://doi.org/10.1556/acr.10.2009.2.5.
Alves, Fabio, and Amparo Hurtado Albir. 2025. Translation as a Cognitive Activity: Theories, Models and Methods for Empirical Research. London and New York: Routledge. https://doi.org/10.4324/9781003006978.
Angelone, Erik. 2010. “Uncertainty, Uncertainty Management and Metacognitive Problem Solving in the Translation Task.” In Translation and Cognition, edited by Gregory M. Shreve, and Erik Angelone. Philadelphia: John Benjamins Publishing Company. https://doi.org/10.1075/ata.xv.03ang.
Carl, Michael, and B. Dragsted. 2012. “Inside the Monitor Model: Processes of Default and Challenged Translation Production.” TC3, Translation: Computation, Corpora, Cognition 2 (1): 127–45. http://www.t-c3.org/index.php/t-c3/article/view/18.
Carl, Michael, and Martin Kay. 2011. “Gazing and Typing Activities During Translation: A Comparative Study of Translation Units of Professional and Student Translators.” Meta 56 (4): 952–75. https://doi.org/10.7202/1011262ar.
Carl, Michael, and Moritz Schaeffer. 2017. “Why Translation is Difficult: A Corpus-Based Study of non-Literality in Post-Editing and from-Scratch Translation.” Hermes 56: 43–57. https://doi.org/10.7146/hjlcb.v0i56.97201.
Carl, Michael, and Moritz J. Schaeffer. 2019. “Outline for a Relevance Theoretical Model of Machine Translation Post-Editing.” In Researching Cognitive Processes of Translation, edited by Defeng Li, Victoria Lei, and Yuanjian He, 49–67. Singapore: Springer. https://doi.org/10.1007/978-981-13-1984-6_3.
Carl, Michael, Moritz Schaeffer, and Srinivas Bangalore. 2016. “The CRITT Translation Process Research Database.” In New Directions in Empirical Translation Process Research, edited by Michael Carl, Srinivas Bangalore, and Moritz Schaeffer, 13–54. Singapore: Springer. https://doi.org/10.1007/978-3-319-20358-4_2.
Carl, Michael, Sheng Lu, and Ali Al-Ramadan. 2024. “Using Machine Learning to Validate a Novel Taxonomy of Phenomenal Translation States.” In Proceedings of the 25th Annual Conference of the European Association for Machine Translation, 480–92. Sheffield, United Kingdom: Association for Machine Translation.
Carl, Michael, Yuxiang Wei, Sheng Lu, Longhui Zou, Takanori Mizowaki, and Masaru Yamada. 2024. “Hesitation, Orientation, and Flow: A Taxonomy for Deep Temporal Translation Architectures.” Ampersand 12, https://doi.org/10.1016/j.amper.2024.100164.
Carl, Michael. 2012. “Translog-II: A Program for Recording User Activity Data for Empirical Reading and Writing Research.” LREC 12: 4108–11.
Carl, Michael. 2024. “An Active Inference Agent for Modeling Human Translation Processes.” Entropy 26 (8): 616. https://doi.org/10.3390/e26080616.
Carl, Michael. in print. “Tracing the Temporal Dynamics of Emotion and Cognition in Behavioral Translation Data.” Translation Spaces.
Carl, Michael, Takanori Mizowaki, Aishvarya Ray, Masaru Yamada, Devi Sri Bandaru, and Xinyue Ren. in print. “Toward a Behavioural Translation Style Space: Simulating the Temporal Dynamics of Affect, Behaviour, and Cognition in Human Translation Production.” In SKASE Journal of Translation and Interpretation.
Catford, J. C. 1965. A Linguistic Theory of Translation: An Essay in Applied Linguistics. London: Oxford University Press.
Clark, A. 2023. The Experience Machine: How Our Minds Predict and Shape Reality. New York: Pantheon Books.
Couto-Vale, Daniel. 2017. “What Does a Translator do When not Writing?” In Empirical Modelling of Translation and Interpreting, edited by Silvia Hansen-Schirra, Oliver Czulo, and Sascha Hofmann, 209–37. Berlin: Language Science Press.
Dijkstra, T., and W. J. B. van Heuven. 2002. “The Architecture of the Bilingual Word Recognition System: From Identification to Decision.” Bilingualism: Language and Cognition 5 (3): 175–97, https://doi.org/10.1017/s1366728902003012.
Dimitrova, Birgitta Englund. 2005. “Expertise and Explicitation in the Translation Process.” Benjamins Translation Library 64.
Evans, J. S. B. T., and K. E. Stanovich. 2013. “Dual Process Theories of Cognition: Advancing the Debate.” Perspectives on Psychological Science 8: 223–41. https://doi.org/10.1177/1745691612460685.
Friston, Karl, Lancelot Da Costa, Dalton A. R. Sakthivadivel, Conor Heins, Grigorios A. Pavliotis, Maxwell Ramstead, and Thomas Parr. 2023. “Path Integrals, Particular Kinds, and Strange Things.” Physics of Life Reviews 47: 35–62. https://doi.org/10.1016/j.plrev.2023.08.016.
Gile, Daniel. 1985. “Le Modèle d’Efforts et l’Équilibre d’Interprétation en Interprétation Simultanée.” Meta 30 (1): 44–8. https://doi.org/10.7202/002893ar.
Gile, Daniel. 2009. Basic Concepts and Models for Interpreter and Translator Training, 2nd ed. Amsterdam: John Benjamins. https://doi.org/10.1075/btl.8.
Hönig, Hans G. 1991. “Holmes’ ‘Mapping Theory’ and the Landscape of Mental Translation Processes.” In Translation Studies: The State of the Art. Proceedings from the First James S. Holmes Symposium on Translation Studies, edited by Kitty M. van Leuven-Zwart, and Antonius Bernardus Maria Naajkens, 77–89. Amsterdam: Rodopi. https://doi.org/10.1163/9789004488106_010.
Hubscher-Davidson, Séverine E. 2017. Translation and Emotion: A Psychological Perspective. London: Routledge. https://doi.org/10.4324/9781315720388.
Hutchins, John, and Harold Somers. 1992. An Introduction to Machine Translation. London: Academic Press Limited.
Hvelplund, Kristian Tangsgaard. 2016. “Cognitive Efficiency in Translation.” In Reembedding Translation Process Research, edited by Ricardo Muñoz Martín, 149–70. Amsterdam: John Benjamins. https://doi.org/10.1075/btl.128.08hve.
Jakobsen, Arnt L. 1999. “Logging target text production with translog.” In Copenhagen Studies in Language, 24, 9–20. Frederiksberg, Denmark: Samfundslitteratur.
Jakobsen, Arnt L., and Lasse Schou. 1999. “Translog documentation, Version 1.0.” In Probing the Process in Translation: Methods and Results, edited by Gyde Hansen, 1–36. Copenhagen: Samfundslitteratur.
Kahneman, Daniel. 2011. Thinking, Fast and Slow. London: Allen Lane.
Königs, Frank. 1987. “Was beim Übersetzen passiert: Theoretische Aspekte, empirische Befunde und praktische Konsequenzen.” Die neueren Sprachen 86 (2): 162–85.
Kroll, Judith F., and Erica Stewart. 1994. “Category Interference in Translation and Picture Naming: Evidence for Asymmetric Connections Between Bilingual Memory Representations.” Journal of Memory and Language 33 (2): 149–74, https://doi.org/10.1006/jmla.1994.1008.
Kumpulainen, Mika. 2015. “On the Operationalisation of ‘Pauses’ in Translation Process Research.” The International Journal for Translation and Interpreting Research 7 (1): 47–58.
Lacruz, Isabel, and Gregory M. Shreve. 2014. “Pauses and Cognitive Effort in Post-Editing.” In Post-Editing of Machine Translation: Processes and Applications, 246–72. Cambridge: Scholars Publishing.
Leijten, Mariëlle, and Luuk Van Waes. 2013. “Keystroke Logging in Writing Research: Using Inputlog to Analyze and Visualize Writing Processes.” Written Communication 30 (3): 358–92. https://doi.org/10.1177/0741088313491692.
Lörscher, Wolfgang. 1991. “Translation Performance, Translation Process, and Translation Strategies: A Psycholinguistic Investigation.” Tübingen: Gunter Narr.
Muñoz, Ricardo, and Markus Apfelthaler. 2022. “A Task Segment Framework to Study Keylogged Translation Processes.” Translation & Interpreting 14 (2). https://doi.org/10.12807/ti.114202.2022.a02.
Malmkjær, Kirsten. 1998. “Unit of translation.” In Routledge Encyclopedia of Translation Studies, 286–88. London: Routledge.
Mesa-Lao, Bartolomé. 2014. “Gaze Behaviour on Source Texts: An Exploratory Study Comparing Translation and Post-Editing.” In Post-Editing of Machine Translation: Processes and Applications, edited by Sharon O’Brien, Michael Carl, and Laura Winther Balling, 219–46. Newcastle: Cambridge Scholars Publishing.
Nida, Eugene A. 1964. Towards a Science of Translating: With Special Reference to Principles and Procedures Involved in Bible Translating. Leiden: Brill. https://doi.org/10.1163/9789004495746.
Nord, Christiane. 2006. “Translating as a Purposeful Activity: A Prospective Approach.” TEFLIN Journal 17 (2): 131–43.
Olalla-Soler, Christian. 2023. “Literal vs. Default translation: Challenging the Constructs with middle Egyptian translation as an Extreme case in Point.” In Sendebar: Revista de Traducción e Interpretación, 65–92. Universidad de Granada. https://revistaseug.ugr.es/index.php/sendebar. https://doi.org/10.30827/sendebar.v34.27090.
Parr, Thomas, Giovanni Pezzulo, and Karl J. Friston. 2022. Active Inference: The Free Energy Principle in Mind, Brain, and Behavior. Cambridge: MIT Press. https://mitpress.mit.edu/9780262045353/active-inference/. https://doi.org/10.7551/mitpress/12441.001.0001.
Peirce, Charles S. 1992. “The Essential Peirce: Selected Philosophical Writings.” In Peirce Edition Project. Bloomington: Indiana University Press.
Pezzulo, Giovanni, Thomas Parr, and Karl Friston. 2024. “Active Inference as a Theory of Sentient Behavior.” Biological Psychology 186: 108741, https://doi.org/10.1016/j.biopsycho.2023.108741.
Pym, Anthony. 2003. “Redefining Translation Competence in an Electronic Age: In Defence of a Minimalist Approach.” Meta 48 (4): 481–97. https://doi.org/10.7202/008533ar.
Robinson, Douglas. 1991. The Translator’s Turn. Baltimore: The Johns Hopkins University Press.
Robinson, Douglas. 2023. Questions for Translation Studies. Amsterdam and Philadelphia: Benjamins Translation Library. https://doi.org/10.1075/btl.162.
Schaeffer, Moritz, and Michael Carl. 2013. “Shared Representations and the Translation Process: A Recursive Model.” Translation and Interpreting Studies 8 (2): 169–90. https://doi.org/10.1075/tis.8.2.03sch.
Schaeffer, Moritz, Barbara Dragsted, Kristian Tangsgaard Hvelplund, Laura Winther Balling, and Michael Carl. 2016. “Word Translation Entropy: Evidence of Early Target Language Activation During Reading for Translation.” In New Directions in Empirical Translation Process Research, edited by Michael Carl, Srinivas Bangalore, and Moritz Schaeffer, 183–210. Singapore: Springer. https://doi.org/10.1007/978-3-319-20358-4_9.
Seth, Anil. 2021. Being You: A New Science of Consciousness. London: Faber and Faber.
Tirkkonen-Condit, Sonja. 2005. “The Monitor Model Revisited: Evidence from Process Research.” Meta: Translators’ Journal 50 (2): 405–14, https://doi.org/10.7202/010990ar.
Vieira, Lucas Nunes. 2017. “From Process to Product: Links Between Post-Editing Effort and Post-Edited quality.” Translation in Transition 133: 161–86. https://doi.org/10.1075/btl.133.06vie.
© 2025 the author(s), published by De Gruyter on behalf of Chongqing University, China
This work is licensed under the Creative Commons Attribution 4.0 International License.