Article Open Access

A Pilot Study on Investigating Primary School Students’ Eye Movements While Solving Compare Word Problems

  • Ágnes Bartalis, Imre Péntek and Iuliana Zsoldos-Marchiș
Published/Copyright: November 28, 2023

Abstract

One of the most difficult types of arithmetic word problems in primary school is compare problems. Among these problems, the most problematic are those in which the relational term is not consistent with the arithmetic operation required for the solution. This study investigates how 10–11-year-old primary school pupils read and interpret compare word problems. The consistency effect and the differences between successful and unsuccessful problem solvers are studied mainly using eye-tracking technology. The results show that students are more successful in solving consistent language (CL) problems than inconsistent language (IL) problems. Regarding eye movements during reading and solving the problems, fixation duration on the relational sentence and numbers is longer in the case of IL problems than in the case of CL problems. Compared to successful problem solvers, unsuccessful solvers fixate longer on the relational term, the pronominal reference word, and the statement and relational sentence of IL problems, but not on numbers.

1 Introduction

Word problems are part of the school curriculum and can be considered the most difficult and complex type of problem taught in primary school (Daroczy, Wolska, Meurers, & Nuerk, 2015). Word problems can link mathematics with real-life situations, and performance in solving these problems can predict success in acquiring higher-order mathematical skills (Powell & Fuchs, 2014). In word problems, the mathematical operations are embedded in the text; thus, text comprehension skills are essential (Boonen, de Koning, Jolles, & van der Schoot, 2016; Stephany, 2021). In fact, the connection between mathematical and linguistic factors increases the difficulty level of word problems (Daroczy, Meurers, Heller, Wolska, & Nürk, 2020). Besides mathematical problem-solving and text comprehension skills, working memory also plays an important role in successful word problem solving (Andersson, 2007).

The focus of this article is on the primary school level, specifically ages 10–11. In primary school, most of the word problems are arithmetic problems whose solutions require performing mathematical operations. Based on the semantic relation describing the problem situation (increases, decreases, combinations, and comparisons of the variable values), the following types of arithmetical word problems can be identified: change, equalizing, combine, and compare. In this article, research on solving compare problems by primary school pupils is presented. Compare problems contain a static numerical relation between the values of two variables. The relational term used can be consistent or inconsistent with respect to the mathematical operation that needs to be done to obtain the solution. Inconsistent problems are more difficult for problem solvers than consistent ones (Múñez, Orrantia, & Rosales, 2013; Orrantia & Múñez, 2013; Pape, 2003; Riley & Greeno, 1988); this is called the consistency effect (Hegarty, Mayer, & Green, 1992).

Most of the research about the consistency effect while solving compare problems was carried out with undergraduate students (Hegarty et al., 1992; Hegarty, Mayer, & Monk, 1995; Lewis & Mayer, 1987; Orrantia & Múñez, 2013). Research with primary school students mainly focuses on comparing solving success for different types of word problems (Boonen & Jolles, 2015; De Corte, Verschaffel, & Pauwels, 1990; Riley, Greeno, & Heller, 1983), showing that compare problems are the most difficult. Only some studies focus on the problem-solving abilities of primary (van der Schoot, Bakker Arkema, Horsley, & van Lieshout, 2009) or secondary school students in compare problems (Boonen et al., 2016; Múñez et al., 2013; Pape, 2003).

In this study, eye-tracking (ET) is used to monitor participants' eye fixations while reading and solving compare word problems. Only one previous study conducted with primary school pupils has examined eye movements during solving compare problems (van der Schoot et al., 2009). Furthermore, the present study was carried out with word problems in the Hungarian language. Compare problems in Hungarian have been studied before (Csíkos & Steklács, 2015); however, that work focused on other aspects of this type of word problem. The language of the word problem can influence the translation of the text to a mathematical model (González-Calero, Berciano, & Arnau, 2020). Thus, even the aspects related to compare problems studied in previous articles should be verified for primary school pupils and word problems in the Hungarian language. Moreover, in this research, additional aspects were analyzed regarding the reading pattern.

2 Theoretical Backgrounds

2.1 Comprehension of Mathematical Text Problem

A considerable part of the mathematical tasks in primary school is in the form of word problems. A large share of these problems are arithmetic word problems, which require performing some mathematical operations to obtain the solution. The mathematical operations are embedded in the text rather than being presented in mathematical form. Thus, text comprehension skills are necessary to translate the text into mathematical operations.

The solution of a mathematical word problem is obtained by going through several stages using cognitive operations (Blum & Leiß, 2007). By understanding the real situation (the word problem), the solver obtains the situation model. By simplifying and structuring the information from the situation model, the real model is created, which can be mathematized to obtain the mathematical model. After the necessary mathematical operations are performed, a mathematical result is obtained, which has to be interpreted to obtain the real result. The real result is validated against the situation model, and then the result of the problem is formulated and explained.

Although difficulties could occur in all the cognitive operations presented in the above model, the construction of the situation model has a crucial role in successfully solving the problem (Stephany, 2021). Some solvers even skip this step and perform some operations using numbers and relational terms from the text, using the so-called direct approach (Hegarty et al., 1995). This approach could lead to incorrect solutions, for example, in the case of inconsistent language (IL) compare problems. To obtain a correct solution, the solver has to understand the context of the problem, identify the variables and the relation between them, and correctly interpret the relational words, i.e., build a correct situation model.

Another crucial aspect with regard to word problem comprehension is related to the magnitude-based mental representation of the variables (Múñez et al., 2013; Orrantia & Múñez, 2013). This means that during problem-solving, the magnitude of each variable should be mentally represented in a model that also reflects the relation between these variables. For a successful solution, building a situation model and a magnitude-based representational model could be decisive.

2.2 Consistency Effect in Compare Word Problems

A type of arithmetical problem is the compare problem in which a static numerical relation is given between the values of two variables (Hegarty et al., 1992). Research shows that compare problems are more difficult for students than other arithmetic problems (Boonen & Jolles, 2015; Riley et al., 1983).

Compare problems can be formulated using consistent language (CL) or inconsistent language (IL). In the case of CL problems, the unknown variable is the subject of the second sentence, and the relational term is consistent with the arithmetic operation that has to be used to find the solution (Lewis & Mayer, 1987). In the case of IL problems, the unknown variable is the object of the second sentence, and the relational term is not consistent with the arithmetical operation (Lewis & Mayer, 1987). In Table 1, examples of CL and IL problems are given. Most of the studies regarding compare problems conclude that students are less successful in solving IL than CL problems (Múñez et al., 2013; Orrantia & Múñez, 2013; Pape, 2003; Riley & Greeno, 1988), and they also need more time to solve IL problems (Múñez et al., 2013). This was labeled the consistency effect (Lewis & Mayer, 1987). The main difficulty in the case of IL problems is that the relational term is not consistent with the mathematical operation that has to be performed; thus, students have to reverse the relational term (Lewis & Mayer, 1987). Another difficulty is related to the presence of a pronominal reference in the case of IL problems (Hegarty et al., 1992); thus, students have to search for the referent of the pronoun. The third difficulty is related to the role of the unknown variable in the second sentence, which is the subject in CL problems and the object in IL problems (Lewis & Mayer, 1987). In contrast with previous studies, Boonen and Jolles (2015) report no consistency effect in their study of 7–8-year-olds.

Table 1

Four types of compare arithmetic word problems

Consistent, unmarked
  Ann has five nuts
  Robert has three more nuts than Ann
  How many nuts does Robert have?
  Arithmetic operation: 5 + 3

Consistent, marked
  Ann has five nuts
  Robert has three less nuts than Ann
  How many nuts does Robert have?
  Arithmetic operation: 5 − 3

Inconsistent, unmarked
  Ann has five nuts
  She has three more nuts than Robert
  How many nuts does Robert have?
  Arithmetic operation: 5 − 3

Inconsistent, marked
  Ann has five nuts
  She has three less nuts than Robert
  How many nuts does Robert have?
  Arithmetic operation: 5 + 3

The relational term in compare problems can be unmarked (e.g., “more”) or marked (e.g., “less”); see the examples in Table 1. Students more often commit a reversal error in marked IL problems than in unmarked ones (Boonen et al., 2016; Lewis & Mayer, 1987). An analysis of brain activity with functional magnetic resonance imaging while solving compare problems showed that, for problems whose solution requires addition, IL was associated with stronger brain activation, but the consistency effect was reversed when the solution requires subtraction (Ng, Lung, & Chang, 2021).
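The four problem types in Table 1 can be illustrated with a minimal sketch that contrasts a key-word ("direct") strategy with a strategy that first works out which variable is larger. The class and function names are hypothetical and chosen only for illustration; the sketch is not a model of how pupils actually reason.

```python
from dataclasses import dataclass


@dataclass
class CompareProblem:
    known_value: int       # value given in the statement sentence
    difference: int        # number in the relational sentence
    relational_term: str   # "more" (unmarked) or "less" (marked)
    consistent: bool       # True if the unknown is the subject of the relational sentence


def direct_translation(p: CompareProblem) -> int:
    """Key-word strategy: 'more' means add, 'less' means subtract."""
    if p.relational_term == "more":
        return p.known_value + p.difference
    return p.known_value - p.difference


def situation_model(p: CompareProblem) -> int:
    """Decide first whether the unknown quantity is the larger one."""
    unknown_is_larger = (p.relational_term == "more") == p.consistent
    if unknown_is_larger:
        return p.known_value + p.difference
    return p.known_value - p.difference


# Inconsistent, unmarked example from Table 1:
# "Ann has five nuts. She has three more nuts than Robert."
il_unmarked = CompareProblem(5, 3, "more", consistent=False)
print(direct_translation(il_unmarked))  # reversal error
print(situation_model(il_unmarked))     # matches "5 - 3" in Table 1
```

For CL problems both strategies agree; they diverge only for IL problems, which is where the reversal error described above arises.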

2.2.1 Comparisons Between High-Ability and Low-Ability Problem Solvers

Different studies divide participants into high-ability and low-ability groups in different ways. In most of the studies, the criteria refer to the results obtained on the test (De Corte et al., 1990; van der Schoot et al., 2009; Verschaffel & De Corte, 1993), but there are studies in which only the errors in the case of IL problems are taken into consideration (Hegarty et al., 1995). More complex division criteria are also considered, for example, a composite measure of accuracy and reaction time (Ng et al., 2021). The labels used for the two groups also vary among the studies, for example, high-ability/low-ability problem solvers (De Corte et al., 1990; Verschaffel & De Corte, 1993), more/less successful problem solvers (van der Schoot et al., 2009), and successful/unsuccessful problem solvers (Hegarty et al., 1995; Ng et al., 2021).

Some differences have been identified between low-ability and high-ability problem solvers. Low-ability students spend more time solving the problems (De Corte et al., 1990; Hegarty et al., 1995), they make more errors in the solution of IL problems than CL problems, and they reexamine numbers and relational terms significantly more often than high-ability students (Hegarty et al., 1995). After the first reading of the problem, high-ability students spent more time reading words than numbers, while low-ability students spent more time reading the numbers (Verschaffel & De Corte, 1993). The consistency effect in the case of marked compare problems is more pronounced for low-ability students than for high-ability students (van der Schoot et al., 2009). Based on brain activity during problem-solving, low-ability problem solvers show a more pronounced consistency-by-operation interaction (Ng et al., 2021).

3 Methods

3.1 Scope and Hypothesis

The scope of the research is to study how primary school pupils comprehend the text and solve compare word problems by analyzing ET data. As part of this question, this article has two main aims: (1) to compare fixation durations on different text elements in the case of IL and CL problems and (2) to compare fixation durations on different text elements by successful and unsuccessful solvers. In both cases, we formulated hypotheses that had not been studied before.

The first hypothesis is that the fixation duration on numbers is longer in the case of IL problems than in the case of CL problems. To solve a word problem, it is natural to use the numbers from the text. So, even if someone does not really understand the relations between the variables, he/she tries to do some operation with the numbers (Hegarty et al., 1995). However, when dealing with an inconsistent task, the relational term must be inverted for the correct solution (Lewis & Mayer, 1987), and it can be assumed that this difficulty affects the fixation duration on the numbers as well. Although, as mentioned before, unsuccessful solvers spend more time fixating on numbers (Hegarty et al., 1995), we are interested in whether this also applies to successful solvers. Moreover, if the reversal of the operation in IL problems increases the working memory load, a second hypothesis can be formulated: the whole relational sentence is fixated for a longer time in IL problems than in CL problems. In addition to numbers, this sentence contains other important data as well. A difficulty in understanding the text of the IL compare word problems is related to the presence of the pronominal reference in the relational sentence (Hegarty et al., 1995). To identify the referent of the pronoun, the solver has to search the statement sentence. Unsuccessful problem solvers could have difficulty identifying the variable name to which the pronoun refers. Thus, the third hypothesis of this study is that unsuccessful problem solvers have a longer fixation duration on the pronominal reference word in the case of IL problems compared to successful problem solvers.

The method of choice to study these questions is ET, since this technology has become highly efficient in the past years. ET is the most effective way to measure and record eye movements on different text elements, word problems in this case. Furthermore, numerous previous studies (Daroczy et al., 2015; De Corte et al., 1990; Hegarty et al., 1992; van der Schoot et al., 2009) used this technology to analyze the solving process of arithmetic word problems.

Analyzing data collected with this technology goes beyond simply determining the locus and the time of different fixations; it can also reflect the problem-solving strategy used by the pupil. Fixation time on a given area of interest (AOI) can indicate different parts of the strategy used, for example, identifying the relevant information in the text. Moreover, by synthesizing these data, it can be concluded whether the pupil was able to construct the situation model or used the direct approach in the problem-solving process.
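As a rough illustration of how fixation time per AOI can be derived from raw fixations, the following sketch sums fixation durations and counts fixations inside rectangular AOIs. The AOI coordinates, the fixation tuples, and the function names are hypothetical; the study itself relied on the metrics exported by the eye-tracking software, not on this code.

```python
from collections import defaultdict

# Hypothetical AOIs: name -> (left, top, right, bottom) in screen pixels.
AOIS = {
    "statement_sentence": (100, 100, 900, 140),
    "relational_sentence": (100, 160, 900, 200),
    "question": (100, 220, 900, 300),
}


def aoi_of(x, y, aois=AOIS):
    """Return the name of the first AOI containing the point, or None."""
    for name, (left, top, right, bottom) in aois.items():
        if left <= x <= right and top <= y <= bottom:
            return name
    return None


def aoi_metrics(fixations, aois=AOIS):
    """Sum durations (ms) and count fixations per AOI.

    fixations: iterable of (x, y, duration_ms) tuples.
    Fixations outside every AOI are ignored.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for x, y, duration_ms in fixations:
        name = aoi_of(x, y, aois)
        if name is not None:
            totals[name] += duration_ms
            counts[name] += 1
    return dict(totals), dict(counts)


# Made-up fixations for one pupil on one problem.
fixations = [(150, 120, 220), (400, 180, 310), (420, 185, 180),
             (50, 50, 200), (300, 250, 150)]
totals, counts = aoi_metrics(fixations)
print(totals)
print(counts)
```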

3.2 Participants

Forty-two 10–11-year-old pupils (4th graders, 26 girls, and 16 boys) from Romania participated in the study. The participants were selected from four different schools where the language of instruction is Hungarian. Parental consent was obtained, and participation was voluntary. The classes were selected by convenience sampling, ensuring that both urban and rural classes were included. Pupils were randomly chosen from these classes. Five pupils were excluded from the analysis due to vertical drift or poor reading ability, which would have distorted the data.

3.3 Material

In this study, two-step compare problems written in Hungarian were used. Each problem contains three sentences. The first sentence is an assignment sentence giving the numerical value of the first variable. The second sentence is relational, giving the numerical relation between the values of the two variables. The third sentence is the question that asks for the value of some quantity of the second variable. This type of problem has been used in several studies (Hegarty et al., 1992, 1995; Lewis & Mayer, 1987; van der Schoot et al., 2009); therefore, the tasks were based on the same schema but set in different familiar contexts. From the pupils’ point of view, it was necessary to avoid the feeling that they were given the same task again and to prevent overloading working memory by comparing consecutive tasks.

The difficulty of the problems corresponds to the cognitive level and the curriculum requirements. From a methodological standpoint, it would have been preferable to include more problems, but the attentional capacity of primary school children is limited compared to adult problem solvers. With respect to arithmetical operations, the solutions required addition/subtraction and multiplication with one- and two-digit numbers. The most difficult arithmetic operations required subtraction with carrying and multiplication of a two-digit number by a one-digit number.

In the test, four problems were included (Table 2), two of them CL and two of them IL problems. In the case of both CL and IL problems, in one of them, the relational term was unmarked, and in the other one, marked. The order of the items was the same for all participants. The difficulty of the problems increased gradually, while CL and IL problems were presented alternately.

Table 2

The four problems included in the test

Consistent, unmarked (Item 1)
  At the grocery, an ice cream costs 7 RON
  At the candy shop, an ice cream costs 2 RON more than at the grocery
  If you need to buy six ice creams, how much will you pay at the candy shop in total?

Consistent, marked (Item 3)
  At the supermarket, a pen costs 9 RON
  At the stationery shop, a pen costs 3 RON less than at the supermarket
  If you need to buy five pens, how much will you pay at the stationery shop in total?

Inconsistent, unmarked (Item 2)
  The entrance ticket to the zoo costs 25 RON
  This is 7 RON more than the price of a cinema ticket
  If the three of you want to go to the cinema, how much will you pay in total?

Inconsistent, marked (Item 4)
  The entrance ticket for ice skating without borrowing costs 25 RON
  This is 15 RON less than the ticket with borrowing
  If the four of you want to go ice-skating and borrow skates, how much will you pay in total?

3.4 Apparatus

To study the performance and text reading patterns of primary school pupils while solving arithmetic compare word problems, Tobii Pro Fusion hardware and Tobii Pro Lab Screen-Based Edition software were used to collect eye movement data. The eye-tracker samples eye movements in real time at a rate of 250 Hz. The eye-tracker was connected to a laptop (with a 15.6-in., non-touch screen) and positioned beneath its screen. A separate room within each school served as the location for data collection, which provided a comfortable, familiar setting and made testing and measurement easier. We considered a familiar environment to be more important than the advantages of a laboratory environment. During data collection, adequate lighting conditions and limitation of distractions were ensured.

3.5 Procedure

Instructions before testing consisted of first explaining to the child that four simple arithmetic word problems would appear on the screen. The child was then told to position herself/himself comfortably on the chair from the beginning, because it is important to remain mostly still while working on the tasks and not to cover his/her eyes. The distance between the stimuli (the laptop screen) and the participant was about 60 cm. Before starting the study, the experimenter calibrated the eye-tracker for each participant (five calibration and four validation points were used). Participants were instructed that word problems would appear on the screen, and that their task was to read each problem from the screen and write down the arithmetical calculations and the solution of each task on the answer sheet. Participants were also told that there were no time constraints and that they could work at their own pace. Moving from one problem to another was managed by the experimenter. Pupils were not allowed to touch the laptop.

3.6 Data Analysis

For the two types of items (CL and IL), different AOIs were established. Common areas were the following: numbers, relational terms, and sentences of the problems (sentence 1 – statement sentence, sentence 2 – relational sentence, and sentence 3 – question). For the CL problems, one additional AOI was defined: the subject in the relational sentence. For the IL problems, the pronominal reference word in the relational sentence was defined as an additional AOI. For the established AOIs, fixation duration and number of fixations were recorded, as well as solution time for every problem. In the data analysis, descriptive statistics (mean, standard deviation) and comparisons with non-parametric tests (Friedman test, Conover’s post hoc test, Wilcoxon matched-pairs signed rank test, and Mann–Whitney U test) were performed.
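To make the repeated-measures comparison concrete, here is a minimal pure-Python sketch of the Friedman chi-square statistic for n participants measured under k conditions. It uses average ranks for ties but omits the tie correction that statistical packages usually apply, so it is an illustration of the formula rather than the procedure used by the authors' software; the sample data are invented.

```python
def average_ranks(values):
    """Rank values from 1..k, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def friedman_statistic(rows):
    """Friedman chi-square (no tie correction).

    rows: one list per participant with k measurements each.
    chi2 = 12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1)
    """
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        for j, rank in enumerate(average_ranks(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


# Invented per-character fixation durations (ms) on sentences 1-3
# for four hypothetical pupils.
rows = [
    [600, 500, 400],
    [650, 520, 390],
    [580, 610, 300],
    [700, 640, 450],
]
chi2 = friedman_statistic(rows)
print(chi2)
```

The statistic is compared against a chi-square distribution with k − 1 degrees of freedom, which is why the tests on three sentences in this article are reported with 2 degrees of freedom and the tests on four items with 3.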

4 Results

4.1 Fixation Duration for Items and Correctness of the Solution

In analyzing the answer sheets, a solution was considered correct if the pupil wrote the arithmetic operation correctly, even if the final result was incorrect due to a minor calculation error.

Figure 1 shows how many of the problems were solved correctly. One-third of the pupils successfully solved all the problems, one-third of the pupils successfully solved three problems, one-third of the pupils successfully solved one or two problems, and there were three who failed to complete any problem correctly.

Figure 1

Number of successful and unsuccessful solutions per item.

In the following, when speaking about performance, pupils are divided into two groups based on their results: successful solvers and unsuccessful solvers. Successful solvers have solved at least three problems correctly, and unsuccessful solvers have solved 0, 1, or 2 problems correctly. Considering previous studies (Boonen et al., 2016; van der Schoot et al., 2009), grouping pupils by their arithmetical performance was necessary to identify whether there is any difference regarding their reading pattern, specifically with reference to the third hypothesis.
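The scoring and grouping rules described above can be written down as a short sketch. The function names and the example answer sheet are hypothetical; only the rules (a solution counts as correct when the operation is set up correctly, and at least three correct solutions make a successful solver) come from the text.

```python
def is_correct(operation_correct: bool, result_correct: bool) -> bool:
    """A solution counts as correct if the arithmetic operation was set up
    correctly, even when a minor calculation error spoils the final result."""
    return operation_correct


def classify_solver(correct_flags, threshold=3):
    """Successful solvers solved at least `threshold` of the four items."""
    return "successful" if sum(correct_flags) >= threshold else "unsuccessful"


# Hypothetical pupil: (operation_correct, result_correct) for Items 1-4.
answers = [(True, True), (False, False), (True, False), (True, True)]
flags = [is_correct(op, res) for op, res in answers]
print(classify_solver(flags))
```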

The success rate was different for each item (Figure 1). Item 3 proved to be the easiest, followed by Item 1 (these two items are CL problems). The fact that no warm-up task was offered prior to Item 1 may have negatively influenced the success rate of this problem. Items 2 and 4 proved to be the most difficult, as they are both IL problems.

The performance presented above and the solution time of each problem are not clearly related. The pupils spent the least time on Item 3 (Table 3), which proved to be the easiest problem. Although Item 1 had a high correct solution rate, participants spent the most time on average on this problem. A Friedman test was carried out to compare the total solution duration for the four items. A significant difference was found between the solution durations of the items, χ²(3) = 15.800, p = 0.001. Based on Conover’s post hoc test, a significant difference was found between Items 1 and 3 (p = 0.001), Items 2 and 3 (p = 0.001), and Items 3 and 4 (p = 0.010). The solution duration for the two IL problems, Items 2 and 4, is similar (Table 3). A significant negative correlation between average solution duration and performance was found: unsuccessful problem solvers spend more time solving the problems (r = −0.439, p = 0.004).

Table 3

Solution duration (milliseconds) by item

Item                       Minimum      Maximum            M           SD
Item 1 (CL, unmarked)       30,532      613,825   135,999.62   109,020.23
Item 2 (IL, unmarked)       30,982      320,903   121,605.60    65,427.13
Item 3 (CL, marked)         16,666      340,903    94,954.29    62,892.53
Item 4 (IL, marked)         40,248      466,597   124,498.36    84,210.64
Total solution duration    167,759    1,391,959   477,057.88   260,301.34

4.2 Fixation Durations for the Sentences of the Problems

To compare fixation duration on different sentences of the problems, fixation durations were weighted by the number of characters; that is, the comparisons were made on the average fixation duration per character of each sentence.
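A minimal sketch of this normalization is given below. Whether spaces and punctuation were counted as characters is not stated in the text, so the `count_spaces` switch is an assumption, as are the function name and the example values.

```python
def per_character_duration(total_fixation_ms, sentence, count_spaces=True):
    """Average fixation duration per character of a sentence (ms/character).

    count_spaces is an assumption: the article does not say whether
    spaces were included in the character count.
    """
    text = sentence if count_spaces else sentence.replace(" ", "")
    if not text:
        raise ValueError("sentence has no characters")
    return total_fixation_ms / len(text)


# Invented total fixation time on a 17-character sentence.
print(per_character_duration(8500, "Ann has five nuts"))
print(per_character_duration(7000, "Ann has five nuts", count_spaces=False))
```

Dividing by sentence length makes short and long sentences comparable, so a long question does not appear to attract more attention merely because it contains more text.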

Comparing the average fixation duration on sentences in the case of the consistent problems, the Friedman test shows significant differences, χ²(2) = 41.333, p < 0.001. The decreasing order of average fixation durations on sentences is the following: Sentence 1 (M = 607.91, SD = 382.80), Sentence 2 (M = 524.62, SD = 316.10), and Sentence 3 (M = 398.00, SD = 261.15). Conover’s post hoc test indicates significant differences between the average fixations on characters from Sentences 1 and 2 (p = 0.032), Sentences 1 and 3 (p < 0.001), and Sentences 2 and 3 (p < 0.001).

Comparing the average fixation duration on sentences in the case of the inconsistent problems, the Friedman test shows significant differences, χ²(2) = 41.333, p < 0.001. The decreasing order of average fixation durations on sentences is the following: Sentence 2 (M = 706.60, SD = 437.12), Sentence 1 (M = 650.43, SD = 371.83), and Sentence 3 (M = 372.36, SD = 261.83). Conover’s post hoc test indicates significant differences between the average fixations on characters from Sentences 1 and 3 (p < 0.001) and Sentences 2 and 3 (p < 0.001), but there is no significant difference between Sentences 1 and 2 (p = 0.278).

The Wilcoxon matched-pairs signed rank test indicates that there is a significant difference in the average fixation duration on Sentence 2 between consistent and inconsistent problems (Table 4).

Table 4

Average fixation duration (milliseconds) on a character of different sentences

                               Consistent          Inconsistent
Sentence                       M        SD         M        SD         Z        p
Statement sentence (line 1)    607.90   382.80     650.43   371.84     −0.306   0.766
Relational sentence (line 2)   524.62   316.10     706.60   437.13     −3.388   0.001
Question (lines 3 and 4)       398.00   261.15     372.36   261.83     1.500    0.135

Comparing the average fixation duration on characters from different sentences in the case of successful and unsuccessful pupils, the Mann–Whitney U test indicates that the unsuccessful pupils fixate more on each sentence both in the case of CL problems and IL problems (Table 5).

Table 5

Average fixation duration (milliseconds) on a character from different sentences by performance

                                 Successful solvers   Unsuccessful solvers
Parts of problems                M        SD          M        SD          df   W         p
Consistent items
  Statement sentence (line 1)    481.92   262.97      812.63   461.59      40   308.000   0.010
  Relational sentence (line 2)   399.27   216.92      728.31   350.99      40   326.500   0.002
  Question (lines 3 and 4)       293.73   159.51      567.44   307.46      40   328.000   0.001
Inconsistent items
  Statement sentence (line 1)    583.15   417.88      759.75   257.44      40   288.500   0.038
  Relational sentence (line 2)   605.89   440.08      870.25   391.61      40   306.000   0.010
  Question (lines 3 and 4)       333.46   269.54      435.56   243.74      40   288.000   0.039

In the case of successful solvers, the Friedman test indicates significant differences between average fixation durations on different sentences both for consistent problems (χ²(2) = 9.500, p = 0.009) and inconsistent problems (χ²(2) = 18.500, p < 0.001). The decreasing order of average fixation durations on sentences is Sentence 1, Sentence 2, and Sentence 3 in the case of consistent problems, and Sentence 2, Sentence 1, and Sentence 3 in the case of inconsistent problems (Table 5). In both cases, Conover’s post hoc test indicates significant differences between the average fixations on characters from Sentences 1 and 3 (p = 0.008 for CL problems and p = 0.001 for IL problems) and Sentences 2 and 3 (p = 0.019 for CL problems and p < 0.001 for IL problems).

In the case of unsuccessful solvers, the Friedman test indicates significant differences between average fixation durations on different sentences both for consistent problems (χ²(2) = 34.154, p < 0.001) and for inconsistent problems (χ²(2) = 29.154, p < 0.001). The decreasing order of average fixation durations on sentences is the same as for successful solvers: Sentence 1, Sentence 2, and Sentence 3 in the case of consistent problems and Sentence 2, Sentence 1, and Sentence 3 in the case of inconsistent problems (Table 5). In both cases, Conover’s post hoc test indicates significant differences between the average fixations on characters from Sentences 1 and 3 (p = 0.008 for CL problems and p = 0.001 for IL problems) and Sentences 2 and 3 (p = 0.019 for CL problems and p < 0.001 for IL problems). Additionally, in the case of CL problems, there is also a significant difference between average fixations on Sentences 1 and 2 (p = 0.016).

4.3 Fixation Duration on the Problems’ Data

In the following, the fixation durations of data are described. These are the relevant data from the problems that pupils must use to successfully solve the problem: three different numbers, relational words, reference words, and the names of the variables. The first number in the problems represents the value of the first variable, and the second number represents the difference between the first and second variables. The third number appears in the question, and this represents the quantity that must be taken from the second variable (i.e., the value of the second variable has to be multiplied by this number).
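The role of the three numbers can be illustrated with a small sketch of the two-step structure: the second value is obtained from the first value and the difference, and is then multiplied by the quantity from the question. The function name is hypothetical, and the answers are our own reading of the problems in Table 2, not values reported by the authors.

```python
def two_step_answer(first_value, difference, quantity, second_is_larger):
    """Step 1: derive the second variable; step 2: multiply by the quantity."""
    if second_is_larger:
        second_value = first_value + difference
    else:
        second_value = first_value - difference
    return second_value * quantity


# Items from Table 2, as we read them (all amounts in RON).
item_1 = two_step_answer(7, 2, 6, second_is_larger=True)     # CL, "more"
item_2 = two_step_answer(25, 7, 3, second_is_larger=False)   # IL, "more": cinema is cheaper
item_3 = two_step_answer(9, 3, 5, second_is_larger=False)    # CL, "less"
item_4 = two_step_answer(25, 15, 4, second_is_larger=True)   # IL, "less": ticket with skates costs more
print(item_1, item_2, item_3, item_4)
```

In the two IL items the direction of the first step is the opposite of what the relational term suggests, which is exactly the reversal the solver has to perform.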

Total fixation durations on numbers from different items are presented in Table 6. The Friedman test indicates a significant difference between fixation durations on numbers from different items, χ²(3) = 26.486, p < 0.001. Pupils fixated the longest on numbers from Item 4; based on the post hoc comparisons, the fixation duration on numbers from Item 4 is significantly longer than on numbers from Item 1 (p = 0.025) and Item 3 (p < 0.001). Pupils fixated the least on numbers from Item 3, and the fixation duration on numbers from Item 3 is significantly shorter than on numbers from Item 1 (p < 0.001) and Item 2 (p = 0.020).

Table 6

Total fixation duration (milliseconds) on numbers by tasks

Items                    Minimum   Maximum   M          SD
Item 1 (CL, unmarked)    658       36,750    7,917.90   7,313.66
Item 2 (IL, unmarked)    784       27,550    8,229.86   5,834.03
Item 3 (CL, marked)      1,100     23,658    5,047.07   4,587.54
Item 4 (IL, marked)      1,108     43,684    9,554.07   7,740.66

Comparing consistent and inconsistent problems, the Wilcoxon signed-rank test indicates a significantly longer fixation duration (W = 227.000 and p = 0.004) on numbers from IL problems (M = 17783.93) than on numbers from CL problems (M = 12964.98). This does not apply to consistent problems. The fixation duration on numbers takes 14.83% of the total fixation duration in the case of CL problems, and 19.41% in the case of IL problems.
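The CL and IL means quoted here appear to be the sums of the per-item means in Table 6 (Items 1 and 3 for CL, Items 2 and 4 for IL). The short check below is our own arithmetic on the published values, not part of the original analysis; the variable names are illustrative.

```python
# Mean total fixation duration on numbers per item (ms), from Table 6.
item_means = {1: 7917.90, 2: 8229.86, 3: 5047.07, 4: 9554.07}

cl_mean = item_means[1] + item_means[3]  # consistent items
il_mean = item_means[2] + item_means[4]  # inconsistent items

print(round(cl_mean, 2))            # reported in the text as 12964.98
print(round(il_mean, 2))            # reported in the text as 17783.93
print(round(il_mean / cl_mean, 2))  # ratio of IL to CL fixation time on numbers
```

The 0.01 ms gap for the CL value is consistent with rounding of the per-item means.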

Comparing average fixation duration on numbers from different sentences in the case of successful and unsuccessful solvers with the Mann–Whitney U test, the results indicate that in the case of consistent problems, there is a significant difference for Number 1 and Number 2: unsuccessful solvers fixated significantly longer on these numbers (Table 7). There are no significant differences between fixation durations on numbers by successful and unsuccessful solvers in the case of inconsistent problems.

Table 7

Average fixation duration (milliseconds) on the numbers from different sentences by performance

Type of items        Part of problem   Successful solvers    Unsuccessful solvers   df   W         p
                                       M         SD          M         SD
Consistent items     Number 1          3247.23   1930.43     7285.81   5857.94      40   302.000   0.014
Consistent items     Number 2          3147.38   2690.63     6255.75   4670.28      40   301.000   0.015
Consistent items     Number 3          2527.96   1247.50     5992.31   5128.20      40   284.000   0.051
Inconsistent items   Number 1          7187.57   6285.31     8498.25   4629.89      40   255.500   0.223
Inconsistent items   Number 2          3951.73   3162.57     5360.62   3137.95      40   279.000   0.068
Inconsistent items   Number 3          5006.80   3633.82     6586.50   4046.00      40   271.000   0.106
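
The between-group comparison in Table 7 can be sketched as follows. The two samples are hypothetical values chosen for illustration, not the study data, and the function uses the textbook rank-sum formulation of the Mann–Whitney U statistic.

```python
def average_ranks(values):
    """Rank ascending from 1; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) computed from ranks of the pooled sample."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2  # pairs in which a exceeds b
    u_b = n_a * n_b - u_a                   # the two statistics sum to n_a * n_b
    return u_a, u_b


# Hypothetical mean fixation durations (ms) on one number; NOT the study data.
successful = [3100, 2900, 3500, 4100, 2600]
unsuccessful = [7200, 5100, 3900, 8800]
u_succ, u_unsucc = mann_whitney_u(successful, unsuccessful)
print(f"U (successful) = {u_succ}, U (unsuccessful) = {u_unsucc}")
```

Table 7 labels the statistic W. Whether that corresponds to `u_succ` or `u_unsucc` depends on the convention of the software used, which this section does not state.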

Comparing fixation durations on different numbers in the case of unsuccessful solvers, the Friedman test indicates that there is no significant difference in the case of CL problems, χ²(2) = 1.462, p = 0.482. In the case of IL problems, the Friedman test indicates that there are differences between fixation durations on different numbers, χ²(2) = 13.000, p = 0.002. The numbers from the relational sentence were fixated for a significantly shorter time than the numbers from the statement sentence (p = 0.002) or from the question (p = 0.004). Making the same comparisons in the case of successful solvers, the Friedman test indicates that there is no significant difference in the case of CL problems, χ²(2) = 3.875, p = 0.144. In the case of IL problems, the Friedman test indicates that there are differences between fixation durations on different numbers, χ²(2) = 7.125, p = 0.028, but the difference is only between Number 1 and Number 2: successful solvers fixated the number from the relational sentence for a significantly shorter time than the number from the statement sentence (p = 0.038).

In addition to the numerical data, other important information stands out from the context and is essential for a successful solution: the name of the second variable in the case of Items 1 and 3, the reference word (which refers to the value of the first variable) in the case of Items 2 and 4, and the relational term for each item.

As regards the relational terms, two appeared in the problems: more (in Items 1 and 2) and less (in Items 3 and 4). The Wilcoxon signed-rank test indicates that pupils fixated significantly longer (W = 290.000, p = 0.043) on the relational terms of the IL problems (M = 3974.21) than on those of the CL problems (M = 3193.52). In addition, the Wilcoxon signed-rank test indicates that pupils fixated significantly longer on the relational terms of the marked problems (M = 4024.29) than on those of the unmarked problems (M = 3143.45), W = 289.000, p = 0.042. Analyzing separately the fixation durations on the relational terms by unsuccessful and successful problem solvers, the Mann–Whitney U test indicates that the only significant difference is in the case of unsuccessful solvers: they fixated significantly longer on the relational terms of the IL problems (M = 3461.77) than on those of the CL problems (M = 2304.19), W = 92.000, p = 0.033.
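
The paired IL versus CL comparison can be sketched with the rank sums behind the Wilcoxon signed-rank test. The paired durations below are hypothetical, not the study data, and zero differences are dropped, which is one common convention and an assumption here.

```python
def average_ranks(values):
    """Rank ascending from 1; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def signed_rank_sums(first, second):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])  # rank by absolute difference
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus


# Hypothetical fixation durations (ms) on the relational term for 6 pupils;
# each position is the same pupil under both conditions. NOT the study data.
il_durations = [4100, 3900, 3600, 4500, 3000, 4200]
cl_durations = [3200, 3400, 3700, 3300, 3000, 3500]
w_plus, w_minus = signed_rank_sums(il_durations, cl_durations)
print(f"W+ = {w_plus}, W- = {w_minus}")
```

A large W+ relative to W- indicates that most pupils fixated longer in the first condition. Which of the two sums a package reports as W varies between tools.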

The pronominal reference word in the IL problems was fixated significantly longer by unsuccessful solvers (M = 2217.38) than by successful solvers (M = 1756.70), as indicated by the Mann–Whitney U test (W = 286.000, p = 0.045).

5 Discussion and Conclusion

This study investigates the way 10–11-year-old children read and interpret compare word problems using ET technology. However, it is only a pilot study, and further analyses and experiments are needed to gain solid evidence. Our results offer insight into the main difficulties arising in compare word problems, but more studies are needed.

Previous studies have shown that pupils are less successful in solving IL problems than CL ones (Múñez et al., 2013; Orrantia & Múñez, 2013; Pape, 2003; Riley & Greeno, 1988), which is also supported by the results of the present research. While the success rate for the CL problems was 80.95%, the IL problems were solved correctly by half of the pupils. As mentioned before, there are a few differences between successful and unsuccessful pupils. Firstly, as Hegarty et al. (1995) pointed out, the solution time of low-ability pupils is longer than that of high-ability pupils. This is also true for the present research; in addition, a negative correlation between performance and solution time can be observed. Furthermore, low-ability pupils make more errors in solving IL problems than CL problems (Hegarty et al., 1995). However, it should be noted that mathematical ability levels do not necessarily coincide with problem-solving success on the test problems. Our study does not use an external reference for mathematical ability; thus, the indicators of success and of problem-solving ability were collected on the same test items.

From previous studies (De Corte et al., 1990; Hegarty et al., 1995), it is already known that there are longer fixations on IL problems, but it was not clear whether this also holds for the numbers. Our study provides converging evidence for the first hypothesis, that fixation duration on numbers is longer in the case of IL problems than in the case of CL ones.

The second hypothesis is connected to the first one, since it refers to a longer fixation duration on the relational sentence in the case of IL problems compared to CL ones. In this regard, we should take into account that, in order to solve an IL problem successfully, it is necessary to reverse the relational term (Lewis & Mayer, 1987) and to process the meaning of the pronominal reference word, which refers to the first number (Hegarty et al., 1992). This may explain why pupils fixate on the whole relational sentence longer than in the case of CL problems. While in CL problems the relational sentence is built on the same schema as the statement sentence, this is not the case for IL problems, where the order of the information is different. This aspect may influence the success rate, because pupils may expect the same schema in the second sentence as well. This result provides evidence that translating the relational sentence causes difficulties for 10–11-year-old pupils. This is a key step in this problem type: if pupils omit it, they cannot solve the problem correctly.

The third hypothesis is related to the pronominal reference word in the IL problems. The results show a significant difference in the duration of fixations on this word between the groups of successful and unsuccessful problem solvers. This result confirms the third hypothesis, according to which unsuccessful problem solvers have longer fixation durations on the pronominal reference word in IL problems than successful problem solvers. It suggests that unsuccessful solvers realized the importance of this information in the text, but did not associate it with the first variable, to which the pronominal reference word actually refers. Therefore, we can also conclude that poor text comprehension leads to incorrect solutions.

The findings of our study need to be considered in light of a number of important limitations, which are discussed below. First, assessing compare word problems with ET technology is more challenging with young children than with adults. We took the age characteristics of the children into account, which prevented us from using filler tasks in the measurement. Since their attention span is much shorter than that of adults, too many tasks could also have distorted the data through loss of focus. This may explain why previous studies in this field primarily assessed college students or secondary school pupils.

An additional limitation is that there was no warm-up task before the first task and the tasks were not randomly selected, which may have influenced our results, especially regarding Item 1. As described above, the solution duration for Item 1 was unusually long compared to Item 3, although both are consistent tasks. Moreover, the fixation duration on the numbers in Item 1 was similar to that in the inconsistent tasks. Despite this, performance on the first task was better than on Items 2 and 4. Taken together, these observations suggest that the above-mentioned weaknesses may have distorted our results.

Despite these limitations, the findings of the present study show how 10–11-year-old children are likely to misinterpret inconsistent word problems compared to consistent ones, and what the main differences between successful and unsuccessful solvers are. However, the results also suggest that there are substantial individual differences in children’s solution processes. We consider the present study to provide additional evidence that primary school pupils have difficulties solving compare word problems, and we hope that our results will contribute to further studies in this field.

Based on the results of the study, teaching methods for compare problems should be developed. Exercises focusing on the context of the problem and graphical visualizations of the relation between variables may help pupils concentrate more on text elements other than numbers. Training pupils in comprehension skills is essential for solving compare word problems, as the results indicate that the source of the difficulty lies not in the solution phase but in the comprehension phase, which must be fostered.

  1. Funding information: The work of the first author was partially funded by the Collegium Talentum Programme of Hungary.

  2. Author contributions: The authors contributed equally.

  3. Conflict of interest: The authors state no conflict of interest.

  4. Ethical approval: As the participants of the study were 10–11-year-old pupils, parental consent was obtained for inclusion in the research. The data collected were analyzed anonymously and were stored securely and confidentially.

  5. Data availability statement: The dataset is available from the corresponding author on reasonable request.

References

Andersson, U. (2007). The contribution of working memory to children’s mathematical word problem solving. Applied Cognitive Psychology, 21(9), 1201–1216. doi: 10.1002/acp.1317.

Blum, W., & Leiß, D. (2007). How do students and teachers deal with modelling problems? In Mathematical modelling (pp. 222–231). Kassel, Germany: Elsevier. doi: 10.1533/9780857099419.5.221.

Boonen, A. J. H., de Koning, B. B., Jolles, J., & van der Schoot, M. (2016). Word problem solving in contemporary math education: A plea for reading comprehension skills training. Frontiers in Psychology, 7, 1–10. doi: 10.3389/fpsyg.2016.00191.

Boonen, A., & Jolles, J. (2015). Second grade elementary school students’ differing performance on combine, change and compare word problems. International Journal of School and Cognitive Psychology, 2(2), 1–6. doi: 10.4172/2469-9837.1000122.

Csíkos, C., & Steklács, J. (2015). Relationships between student performance on arithmetic word problems eye-fixation duration variables and number notation: Number words vs arabic numerals. Mediterranean Journal For Research In Mathematics Education, 14, 43–57.

Daroczy, G., Meurers, D., Heller, J., Wolska, M., & Nürk, H. C. (2020). The interaction of linguistic and arithmetic factors affects adult performance on arithmetic word problems. Cognitive Processing, 21(1), 105–125. doi: 10.1007/s10339-019-00948-5.

Daroczy, G., Wolska, M., Meurers, W. D., & Nuerk, H. C. (2015). Word problems: A review of linguistic and numerical factors contributing to their difficulty. Frontiers in Psychology, 6, 348. doi: 10.3389/fpsyg.2015.00348.

De Corte, E., Verschaffel, L., & Pauwels, A. (1990). Influence of the semantic structure of word problems on second graders’ eye movements. Journal of Educational Psychology, 82(2), Article 2. doi: 10.1037/0022-0663.82.2.359.

González-Calero, J. A., Berciano, A., & Arnau, D. (2020). The role of language on the reversal error. A study with bilingual Basque-Spanish students. Mathematical Thinking and Learning, 22(3), 214–232. doi: 10.1080/10986065.2020.1681100.

Hegarty, M., Mayer, R. E., & Green, C. E. (1992). Comprehension of arithmetic word problems: Evidence from students’ eye fixations. Journal of Educational Psychology, 84(1), Article 1. doi: 10.1037/0022-0663.84.1.76.

Hegarty, M., Mayer, R. E., & Monk, C. A. (1995). Comprehension of arithmetic word problems: A comparison of successful and unsuccessful problem solvers. Journal of Educational Psychology, 87(1), Article 1. doi: 10.1037/0022-0663.87.1.18.

Lewis, A. B., & Mayer, R. E. (1987). Students’ miscomprehension of relational statements in arithmetic word problems. Journal of Educational Psychology, 79(4), 363–371. doi: 10.1037/0022-0663.79.4.363.

Múñez, D., Orrantia, J., & Rosales, J. (2013). The effect of external representations on compare word problems: Supporting mental model construction. The Journal of Experimental Education, 81(3), 337–355. doi: 10.1080/00220973.2012.715095.

Ng, C. T., Lung, T. C., & Chang, T. T. (2021). Operation-specific lexical consistency effect in fronto-insular-parietal network during word problem solving. Frontiers in Human Neuroscience, 15, 1–11. https://www.frontiersin.org/articles/. doi: 10.3389/fnhum.2021.631438.

Orrantia, J., & Múñez, D. (2013). Arithmetic word problem solving: Evidence for a magnitude-based mental representation. Memory & Cognition, 41(1), 98–108. doi: 10.3758/s13421-012-0241-1.

Pape, S. J. (2003). Compare word problems: Consistency hypothesis revisited. Contemporary Educational Psychology, 28(3), 396–421. doi: 10.1016/S0361-476X(02)00046-2.

Powell, S. R., & Fuchs, L. S. (2014). Does Early Algebraic Reasoning Differ as a Function of Students’ Difficulty with Calculations versus Word Problems?: Algebraic reasoning of struggling students. Learning Disabilities Research & Practice, 29(3), 106–116. doi: 10.1111/ldrp.12037.

Riley, M. S., & Greeno, J. G. (1988). Developmental analysis of understanding language about quantities and of solving problems. Cognition and Instruction, 5(1), 49–101. doi: 10.1207/s1532690xci0501_2.

Riley, M. S., Greeno, J. G., & Heller, J. I. (1983). Development of children’s problem-solving ability in arithmetic. In H. Ginsburg (Ed.), The development of mathematical thinking (pp. 153–196). New York: Academic Press.

Stephany, S. (2021). The influence of reading comprehension on solving mathematical word problems: A situation model approach. In A. Fritz, E. Gürsoy, & M. Herzog (Eds.), Diversity dimensions in mathematics and language learning (pp. 370–395). Berlin, Boston: De Gruyter. doi: 10.1515/9783110661941-019.

van der Schoot, M., Bakker Arkema, A. H., Horsley, T. M., & van Lieshout, E. C. (2009). The consistency effect depends on markedness in less successful but not successful problem solvers: An eye movement study in primary school children. Contemporary Educational Psychology, 34(1), 58–66. doi: 10.1016/j.cedpsych.2008.07.002.

Verschaffel, L., & De Corte, E. (1993). A decade of research on word problem solving in Leuven: Theoretical, methodological, and practical outcomes. Educational Psychology Review, 5(3), 239–256. doi: 10.1007/BF01323046.

Received: 2022-11-18
Revised: 2023-08-22
Accepted: 2023-11-01
Published Online: 2023-11-28

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 19.1.2026 from https://www.degruyterbrill.com/document/doi/10.1515/edu-2022-0207/html