Article Open Access

Analysis of the sports action recognition model based on the LSTM recurrent neural network

  • Ping Chen and Jiangui Peng
Published/Copyright: February 25, 2025

Abstract

With the rapid growth of motion data, traditional motion recognition algorithms lack sufficient processing capacity. To address this problem, a method combining gradient descent optimization (GDO) with long short-term memory (LSTM) is proposed for sports action recognition. Using a sports data set of students at Hainan University, experiments were carried out on skipping rope, swimming, skating, and shot put, with 77, 94, 72, and 85 trials, respectively. The experimental results show that the accuracies of GDO-LSTM in sports action recognition were 98.7, 100, 100, and 94.1%, respectively, which were higher than those of the three-axis gyroscope (80.5, 40.4, 23.6, and 100%) for three of the four actions. These results indicate that the algorithm can effectively improve the accuracy of sports action recognition and has wide application potential.

Notations

CLSTM: convolutional long short-term memory
CNN: convolutional neural network
GDO: gradient descent optimization
KNM: kernel norm minimization
LSTM: long short-term memory
MSE: mean squared error
PSO: particle swarm optimization
SDPA: scaled dot-product attention
SVM: support vector machine
Tanh: hyperbolic tangent

1 Introduction

In recent years, with the widespread adoption of the Internet and smart devices, the rate at which motion data are generated has grown rapidly. People are paying increasing attention to health and fitness, and the demand for motion recognition technology is rising accordingly. Sports action recognition can help individuals monitor their athletic performance, provide a scientific basis for fitness coaches, and play a key role in rehabilitation treatment and sports training [1,2]. Since the epidemic in particular, the rise of home fitness and the popularity of intelligent sports equipment have strengthened the demand for sports data analysis. Although the number of motion recognition algorithms is gradually increasing, several problems remain. Because motion data are diverse and complex, many existing algorithms cannot effectively process newly generated data, which greatly reduces the value of the information. Traditional motion recognition methods often rely on manual feature extraction, which is easily affected by noise and external interference, so recognition accuracy is low. Faced with individual differences and different motion modes, existing models usually generalize poorly and cannot adapt to motion recognition in varied situations [3,4]. In this context, the long short-term memory (LSTM) network is widely used in motion recognition because of its excellent time series modeling ability. However, LSTM still has limitations when dealing with noisy and unbalanced data. Therefore, to meet the demand for sports health management and overcome the bottleneck of current action recognition algorithms, the gradient descent optimization (GDO) method is introduced to improve the algorithm's handling of abnormal data.
A GDO-LSTM algorithm based on GDO and LSTM is constructed, which not only increases the structure of weight processing in LSTM but also deepens the scaled dot-product attention (SDPA) in the time attention module, which can avoid the problems in the process of motion. The accuracy of motion recognition is tested by combining four common movements. The aim is to use large amounts of data to train a model that can accurately judge movements, which can help companies not only meet the needs of users but also meet the needs of people in areas such as fitness. The main contribution and discovery of the research is that the advantages of GDO and LSTM networks are combined to significantly improve the effect of motor recognition. It provides a new idea for the future motion detection and analysis system in wearable devices, which helps to promote the development of intelligent fitness and health management. In future practical applications, the research results will support the development of smarter fitness applications, providing personalized exercise advice and feedback to help users achieve fitness goals more effectively. At the same time, it is helpful to improve the training strategy, reduce the risk of injury, and improve the overall competitive level of athletes. Combining social media functions, the development of an interactive fitness platform based on movement recognition can promote communication and motivation among users, thus enhancing their motivation to keep moving.

2 Related works

A very important branch of video processing technology, namely the recognition of human sports actions, has a very important role in physical health, training needs, etc. Xu and Yan analyzed a sports action recognition system based on clustering regression to improve the depth network in order to improve the recognition rates of athletes in sports. Through literature research, they chose a neural network as the basis of the algorithm. The shortcomings of the traditional neural network were also analyzed, and the traditional neural network was improved by combining the sports recognition needs of sports athletes. The network data collection method was used to build a video library of sports players’ movements, and basketball events were analyzed for recognition by feature judgment. The study shows that the recognition rate of basketball action is greatly improved compared with the traditional algorithmic model, and the results validate the significant effectiveness of their proposed improved deep network in the field of human behavior recognition research [5]. Sarabu and Santra proposed a dual-stream network with two convolutional neural networks (CNNs) and convolutional long short-term memory (CLSTM). They used a pre-trained ImageNet model to extract spatio-temporal features using two CNNs. The results of the two CNNs were then combined and fed to the CLSTM to obtain the total classification score. They explored the effect of the fusion function performance feature mapping of the two CNNs and derived the optimal fusion function and number of layers. To avoid the problem of overfitting, they used a data expansion technique. The experimental results show that their proposed model shows substantial improvement over current dual-stream methods [6]. de Albuquerque et al. proposed a bidirectional LSTM-based attention mechanism with an extended CNN that selectively focuses on the effective features in the input frames to recognize different human actions in the video. 
In this diverse network, they extracted significant distinguishing features and updated features that retain more information than shallow layers by using residual blocks. They fed these features into learning long-term dependencies, followed by attention mechanisms to improve performance and extract additional high-level selective action-related patterns and cues. The experiments were tested at UCF Sports with recognition rates of 98.3, 99.1, and 80.2%, an improvement of 1–3% over previous methods [7].

Hu et al. proposed an attention-based multi-level co-occurrence graph convolution, which is able to utilize body structure information from the skeleton and enhance multi-level co-occurrence feature learning by integrating a graph convolution network into the LSTM. They designed spatial attention modules for feature enhancement of key joints of the skeleton input. They designed multi-level co-occurrence memory units to automatically model the spatial relationships between joints while capturing co-occurrence features from different joints, people, and frames. Experimental results show that their model significantly outperforms mainstream methods on the interactive subset of the data set [8]. Zhu et al. designed a spatio-temporal dual-attention network, which mainly consists of feature extraction, attention, and fusion modules. Unlike the high-level fully connected layer features mainly used in previous studies, this work extracted the convolutional and fully connected layer features of CNNs to enrich the initial level of video representation. In addition, a temporal attention module and a joint temporal attention module were implemented to enhance the spatiotemporal attention capability. The potential was effectively mined and weighted by principal component analysis and feature fusion. The experimental results show that the method has better recognition performance compared with the existing methods [9]. Naveenkumar and Domnic’s recurrent neural network-based approach focused on the temporal evolution of body joints and ignored the geometric relationships between them. This led to the proposal of 11 quadrilaterals to capture the geometric relationships between joints for action recognition. An end-to-end three-layer bidirectional LSTM network was designed as the base net to learn the robust representation. 
Two subnets based on the base network were used to extract differentiated spatiotemporal features, the first one using four spatial features and the second one using two temporal features. Experimental results show that their method achieves state-of-the-art performance compared to recent methods [10]. Furnari and Farinella addressed the problem of self-centered action anticipation, i.e., predicting the actions taken by camera wearers and which objects they will interact with. They contributed to rolling unfolding LSTM, a learning architecture for predicting behavior in self-centered videos. The approach models the subtasks of summarizing the past and inferring the future, sequence completion pre-training techniques, and modal attention mechanisms to effectively blend by processing frames. Multimodal prediction was performed by flow fields and object-based features. The method was validated, and experiments showed that the proposed architecture was state-of-the-art in the field of self-centered video; the method also achieved competitive performance relative to methods that are not based on unsupervised pre-training [11].

In the above research, most algorithms have some limitations in the field of motion recognition, such as strong dependence on feature extraction, poor generalization ability, and limited data processing ability. The method adopted in this study can automatically extract effective features from a large number of data using the LSTM network. The ability of the model to handle unbalanced and noisy data is enhanced by GDO. Therefore, the proposed method can effectively fill the gaps in the field of motion recognition.

3 LSTM recurrent neural network-based sports action information recognition processing system

3.1 Construction of LSTM-based human motion recognition information acquisition and processing system

To recognize human sports actions, the system must provide feedback, and the recognition results must be timely. This distinguishes it from systems that monitor health, falls, and similar events. Those systems typically use a three-layer architecture, which is relatively complete for motion processing [12]. Sports action recognition, by contrast, needs only two layers, because the application of the data is already handled during extraction; the omitted layer is the data application layer, as shown in Figure 1.

Figure 1: Motion processing architecture.

In Figure 1, the data collection layer places motion sensors to collect human motion data, including acceleration and angular velocity. The collected data pass to data pre-processing, where initial cleaning produces the formal data. The data computation layer creates and stores the data in a database, which is the core of the system [13]. Finally, at the computation layer, the data are further processed as required to output meaningful motion-type data and store the labels again. When the model computing layer produces these data, the human body may move outside its typical range of motion, which leads to significant data fluctuations and extreme values. Extreme values not only strain the instrument but also yield inaccurate results. The data therefore need to be normalized as in Eq. (1):

(1) $x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} = \frac{x - \bar{x}}{\sigma}.$
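Eq. (1) writes two forms side by side: min-max scaling and standardization by the mean and standard deviation. In general these give different values, and only the min-max form maps the data to the range 0–1. The following is a minimal sketch of both, assuming NumPy; the function names and the sample values are hypothetical and not taken from the study.

```python
import numpy as np

def min_max_normalize(x):
    """Scale values to [0, 1] (first form of Eq. (1))."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        return np.zeros_like(x)  # constant signal: nothing to scale
    return (x - x.min()) / span

def z_score_normalize(x):
    """Subtract the mean and divide by the standard deviation (second form)."""
    x = np.asarray(x, dtype=float)
    sigma = x.std()
    if sigma == 0:
        return np.zeros_like(x)
    return (x - x.mean()) / sigma

# Hypothetical acceleration samples containing one extreme value.
samples = [0.1, 0.2, 0.15, 0.9, 0.05]
scaled = min_max_normalize(samples)
standardized = z_score_normalize(samples)
```

After min-max scaling, the extreme value becomes 1 and the smallest sample becomes 0; the standardized series has zero mean and unit standard deviation instead.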

After normalization as in Eq. (1), the data are compressed to the range 0–1, which also suppresses outliers; here $\bar{x}$ is the mean and $\sigma$ is the standard deviation. However, to obtain the transformed data, feature capture is required. The captured features include the maximum and minimum values as well as the standard deviation, calculated as in Eq. (2):

(2) $\max = \max(a_i),\ i \in \{1, 2, 3, \ldots\}; \quad \mathrm{std} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(a_i - \mathrm{mean})^2}; \quad \min = \min(a_i),\ i \in \{1, 2, 3, \ldots\}.$
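As a small illustration of the feature capture in Eq. (2), the sketch below computes the maximum, minimum, mean, and population standard deviation (the 1/n form) of one window of samples. The function name and the window values are hypothetical.

```python
import numpy as np

def capture_features(a):
    """Return the statistics of Eq. (2) for one window of sensor samples."""
    a = np.asarray(a, dtype=float)
    mean = a.mean()
    return {
        "max": a.max(),
        "min": a.min(),
        "mean": mean,
        # Population standard deviation: square root of the mean squared deviation.
        "std": np.sqrt(np.mean((a - mean) ** 2)),
    }

# Hypothetical window of five samples.
window = [1.0, 2.0, 3.0, 4.0, 5.0]
features = capture_features(window)
```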

In Eq. (2), “mean” refers to the mean of the data, and the standard deviation “std” measures the dispersion of the data around the mean. The skewness and kurtosis of the data can be derived in a similar way, but the support vector machine (SVM) operates only on Euclidean distances. The core of SVM is to obtain a hyperplane, which not only avoids the curse of dimensionality that often occurs in traditional algorithms but also avoids the local minimum problem [14]. Using Lagrange multipliers, the original problem is converted to its dual, and the dual Lagrangian function is as follows:

(3) $L(w, b, a) = 0.5\|w\|^2 - \sum_{i=1}^{n} a_i \left( y_i (w \cdot x_i + b) - 1 \right),$

where the term $0.5\|w\|^2$ controls the distance between the support plane and the hyperplane (the margin is $1/\|w\|$), and the hyperplane is obtained when $w \cdot x_i + b = 0$. Taking the partial derivatives of this Lagrangian function and setting them to 0 gives the intermediate state function, which can be substituted into Eq. (3) to obtain Eq. (4):

(4) $W(a) = \sum_{i=1}^{n} a_i - 0.5 \sum_{i,j=1}^{n} a_i a_j y_i y_j (x_i \cdot x_j), \quad \text{s.t. } a_i \ge 0,\ i = 1, 2, 3, \ldots, n,\ \sum_{i=1}^{n} a_i y_i = 0.$
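To make Eq. (4) concrete, the sketch below evaluates the dual objective and checks its constraints for a toy problem with two one-dimensional points. The data, the multipliers, and the function names are hypothetical; for this toy case the multipliers 0.5 and 0.5 are the maximizer, which can be verified by hand.

```python
import numpy as np

def dual_objective(alpha, X, y):
    """Evaluate W(a) from Eq. (4) for multipliers alpha, samples X and labels y."""
    alpha = np.asarray(alpha, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = X @ X.T  # inner products x_i . x_j
    return alpha.sum() - 0.5 * np.sum(np.outer(alpha, alpha) * np.outer(y, y) * gram)

def is_feasible(alpha, y, tol=1e-9):
    """Check the constraints a_i >= 0 and sum_i a_i y_i = 0."""
    alpha = np.asarray(alpha, dtype=float)
    y = np.asarray(y, dtype=float)
    return bool(np.all(alpha >= -tol) and abs(alpha @ y) <= tol)

# Toy problem: one positive sample at x = +1 and one negative sample at x = -1.
X = [[1.0], [-1.0]]
y = [1.0, -1.0]
alpha = [0.5, 0.5]

objective = dual_objective(alpha, X, y)
# Weight vector recovered from the stationarity condition w = sum_i a_i y_i x_i.
w = (np.asarray(alpha) * np.asarray(y)) @ np.asarray(X)
```

Here the recovered weight is 1, so the hyperplane sits at x = 0 and both samples lie exactly on the support planes.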

However, Eq. (4) applies only to linearly separable cases. In practice, linearly inseparable cases must also be considered, and kernel norm minimization (KNM) can then be applied. KNM can restore the original matrix from the observation matrix, as shown in Eq. (5):

(5) $\hat{X} = \arg\min_{X}\ 0.5\|X - Y\|_F^2 + \lambda \|X\|_*,$

where $\|X\|_*$ is the kernel norm to be minimized, $Y$ is the observation matrix, $X$ is the original matrix to be recovered, and $\hat{X}$ is its estimated approximation. The threshold value is denoted as $\lambda$, and the diagonal elements are solved as in Eq. (6):

(6) $S_\lambda(\Sigma)_{ii} = \max(\Sigma_{ii} - \lambda, 0),$

where $S_\lambda$ is the soft threshold function and $\Sigma_{ii}$ are the diagonal elements obtained from $Y$ in Eq. (5); the result can be applied to the LSTM. The LSTM network structure has output gates, forget gates, and cell states. The cell structure records the time series and updates it iteratively by summation. This not only avoids large interactions between data but also preserves the state of each time step [15]. Because of the variety of gates, the gradient does not explode when the weights are updated, as shown in Figure 2.
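Returning to Eqs. (5) and (6): a problem of the form in Eq. (5) has a well-known closed-form solution, obtained by soft-thresholding the singular values of the observation matrix. The sketch below assumes NumPy and reads the diagonal elements in Eq. (6) as singular values; the function names and the test matrix are hypothetical.

```python
import numpy as np

def soft_threshold(singular_values, lam):
    """Eq. (6): shrink each diagonal element by lam and clip at zero."""
    return np.maximum(singular_values - lam, 0.0)

def singular_value_thresholding(Y, lam):
    """Minimizer of 0.5 * ||X - Y||_F^2 + lam * ||X||_* (form of Eq. (5))."""
    U, s, Vt = np.linalg.svd(np.asarray(Y, dtype=float), full_matrices=False)
    return (U * soft_threshold(s, lam)) @ Vt

# Hypothetical observation matrix with singular values 3, 1 and 0.2.
Y = np.diag([3.0, 1.0, 0.2])
X_hat = singular_value_thresholding(Y, lam=0.5)
```

With a threshold of 0.5, the smallest singular value is removed entirely, so the estimate has lower rank than the observation.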

Figure 2: LSTM flow chart.
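For readers unfamiliar with the gate structure, the following is a minimal sketch of one step of a standard LSTM cell, assuming NumPy. It is the textbook formulation, not the modified cell proposed in this study; the weight layout, the sizes, and the random inputs are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell.

    W has shape (4 * hidden, n_in + hidden); its row blocks are assumed to be
    ordered as input gate, forget gate, output gate, candidate state.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    i = sigmoid(z[0:hidden])               # input gate
    f = sigmoid(z[hidden:2 * hidden])      # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
    g = np.tanh(z[3 * hidden:])            # candidate cell state
    c_t = f * c_prev + i * g               # cell state updated by summation
    h_t = o * np.tanh(c_t)                 # hidden state passed to the next step
    return h_t, c_t

# Hypothetical sizes: 6 inputs (3-axis acceleration + 3-axis angular velocity).
rng = np.random.default_rng(0)
n_in, n_hidden = 6, 4
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_in + n_hidden))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(10, n_in)):  # ten random time steps
    h, c = lstm_step(x_t, h, c, W, b)
```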

From Figure 2, after the input data enter the input gate, the cell screens them strictly: qualified data are output, and unqualified data are forgotten. Cell screening is the most important process, and its core lies in the decoder. The decoder focuses on the aggregated historical information over time by fading and predicting attention [16]. The decoder is based on machine translation: the end of each decoding step is the beginning of the next, and the decoding principle is as in Eq. (7):

(7) $p(y) = \prod_{t=1}^{n} p\left(y_t \mid \{y_1, y_2, \ldots, y_{t-1}\}, C\right),$

where $y_t$ is the output decoding result at step $t$, conditioned on the previous outputs $\{y_1, y_2, \ldots, y_{t-1}\}$ and on $C$. The form of Eq. (7) is broadly similar to that of Eq. (3), and the two can be combined to obtain a better solution.
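The product in Eq. (7) is usually evaluated in log space to avoid numerical underflow on long sequences. A minimal sketch follows, with hypothetical per-step conditional probabilities.

```python
import math

def sequence_log_probability(step_probabilities):
    """Sum of log conditionals, i.e. the logarithm of the product in Eq. (7)."""
    return sum(math.log(p) for p in step_probabilities)

# Hypothetical conditionals p(y_t | y_1..y_{t-1}, C) for a three-step sequence.
step_probs = [0.9, 0.8, 0.5]
log_p = sequence_log_probability(step_probs)
p_sequence = math.exp(log_p)  # equals 0.9 * 0.8 * 0.5
```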

3.2 Improving LSTM by gradient descent in sports information acquisition system

Although LSTM went through a period of low interest in the last century, it gradually became clear that traditional algorithms required manual feature extraction and could not be applied in certain fields. As the gated recurrent unit (GRU) improved computing efficiency, LSTM became more widely used in image recognition owing to its ability to mine time series data. Like the human nervous system, an LSTM consists of many neurons. This research is based on a variant of LSTM in which a weight processing structure is added within each individual neuron. This avoids the problem of excessive weights during LSTM operation, so that the LSTM can be exploited more fully, as shown in Figure 3.

Figure 3: Improved LSTM process.

From the process in Figure 3, at moment $t$, the hidden layer accepts the node $M_t$ and the input value $X_t$. After the hidden layer calculation, these are passed to the weight layer, where they are merged with the node $M_{t-1}$ and the input value $X_{t-1}$ accepted by the weight layer at moment $t-1$ and then processed. The processed weight information is input to the memory unit, which identifies the elements that should be forgotten, the elements that should be output, and the remaining elements that can be recovered through the gates. One of the most widely used functions in the memory unit is the sigmoid function, as given in Eq. (8):

(8) $\sigma(x) = \frac{1}{1 + e^{-x}},$

where $\sigma(x)$ represents the sigmoid function. When the sigmoid function is used as the activation function, its curve is characterized by small increases and decreases in value. Since the problem studied involves a large number of classes, categorical cross-entropy is the preferred loss function, as in Eq. (9):

(9) $L_i = -\sum_{j} t_{i,j} \log(p_{i,j}),$

where the actual value is denoted as $t$ and the predicted value as $p$; the indices $i$ and $j$ denote the classes in which the actual and predicted data are located, respectively. Alongside categorical cross-entropy as the loss function, the mean squared error (MSE) and cross-entropy (H) are computed as in Eq. (10):

(10) $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2, \quad H(p, q) = \sum_{i} p(i) \times \log\frac{1}{q(i)},$

where $\hat{y}_i$ and $y_i$ are the predicted and actual values of the model, respectively, and the values of MSE and H are always greater than or equal to 0. The model becomes more accurate if it is optimized repeatedly by the GDO method to keep these values close to 0. GDO optimizes MSE and H by iteration, as in Eq. (11):

(11) $\theta = \theta - \alpha \frac{\partial J(\theta)}{\partial \theta},$

where $\alpha$ is the step size and $J(\theta)$ is the objective function. After the GDO process, MSE and H are minimized and the weights become more appropriate. In data pre-processing, sports differ from everyday working life: sports movements are elegant but are always accompanied by considerable risk, so the temporal attention module is optimized for them [17]. The temporal attention module focuses on the SDPA, with the following equation:

(12) $Q = X W_q, \quad K = X W_k, \quad V = X W_v,$

where $Q$ is the query matrix, and $K$ and $V$ store the vectors. From Eq. (12), three feature matrices of the input $X$ are derived and then subjected to a scaling operation. Normalization is then performed as in Eq. (13):

(13) $Y = \mathrm{Attention}(Q, K, V) = \mathrm{Softmax}\left(\frac{Q K^{T}}{\sqrt{d_m}}\right) V, \quad Z = Y W_Z + X,$

where $Y$ is the weighted feature matrix obtained from $V$, and $Z$ is the feature expression of $X$ and $Y$ obtained through the residual connection. The aggregation and expression of historical information is divided into two parts: query and storage. By relating the present moment to some future moment, the two parts of information are fed into the vector for representation, and the two features are no longer independent. Since the input sequence tends to lose part of the information, an attention mechanism with position encoding (PE) is introduced, as in Eq. (14):

(14) $PE(pos, i) = \begin{cases} \sin\left(pos / 10000^{i/d_m}\right), & i \text{ even} \\ \cos\left(pos / 10000^{i/d_m}\right), & \text{otherwise,} \end{cases}$

where the feature mapping is denoted as $pos$, and the position vector generated from $pos$ is $d_{pos}$. The data set generated using the attention mechanism preserves the temporal order and also allows complex action data to be extracted from sports. This makes it possible to improve the LSTM (GDO-LSTM) based on the attention mechanism, as shown in Figure 4.
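The attention computation in Eqs. (12)–(14) can be sketched in a few lines of NumPy. The sketch below is illustrative only: the projection matrices are random, the sizes are hypothetical, the scaling uses the square root of the feature dimension (the usual SDPA convention), and the position encoding follows the form of Eq. (14) as written here.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def positional_encoding(n_steps, d_m):
    """Eq. (14): sine for even feature indices, cosine otherwise."""
    pos = np.arange(n_steps)[:, None]
    i = np.arange(d_m)[None, :]
    angle = pos / np.power(10000.0, i / d_m)
    return np.where(i % 2 == 0, np.sin(angle), np.cos(angle))

def temporal_attention(X, W_q, W_k, W_v, W_z):
    """Eqs. (12)-(13): scaled dot-product attention with a residual connection."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_m = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_m))
    Y = weights @ V
    Z = Y @ W_z + X  # residual connection back to the input
    return Z, weights

# Hypothetical sizes: 5 time steps, 8 features per step.
rng = np.random.default_rng(0)
n_steps, d_m = 5, 8
X = rng.normal(size=(n_steps, d_m)) + positional_encoding(n_steps, d_m)
W_q, W_k, W_v, W_z = (rng.normal(scale=0.1, size=(d_m, d_m)) for _ in range(4))
Z, weights = temporal_attention(X, W_q, W_k, W_v, W_z)
```

Each row of `weights` sums to 1, so every output step is a weighted average of the stored value vectors before the residual term is added.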

Figure 4: Improved sensor diagram.

In Figure 4, the sensor collects a variety of data during operation, and a randomly selected data set is fed into the modified LSTM input gate. After passing through the LSTM process once, the data go through a specially designed cell structure. There, their Lagrangian function is expanded, and a hyperbolic tangent (Tanh) operation is then performed to judge whether the result should be forgotten. If so, the data are temporarily forgotten and the Lagrangian step is applied once more; if not, they are passed to the output gate and output as a qualified result.
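The training ingredients from Eqs. (8)–(11) can be summarized in a short sketch: the sigmoid activation, the categorical cross-entropy, the MSE, and the gradient descent update. The one-parameter linear model, the step size, and the iteration count below are hypothetical choices made only to show the update rule converging; they are not the settings used in the experiments.

```python
import numpy as np

def sigmoid(x):
    """Eq. (8)."""
    return 1.0 / (1.0 + np.exp(-x))

def categorical_cross_entropy(t, p, eps=1e-12):
    """Eq. (9): t is the one-hot target, p the predicted class probabilities."""
    return -np.sum(t * np.log(np.clip(p, eps, 1.0)), axis=-1)

def mse(y_true, y_pred):
    """Eq. (10), first part."""
    return np.mean((y_true - y_pred) ** 2)

def gradient_descent(x, y, alpha=0.1, n_iter=100):
    """Fit y ~ theta * x by repeating the update of Eq. (11) with J = MSE."""
    theta = 0.0
    for _ in range(n_iter):
        grad = np.mean(2.0 * (theta * x - y) * x)  # dJ/dtheta
        theta = theta - alpha * grad
    return theta

# Hypothetical data generated with a true slope of 2.
x = np.array([1.0, 2.0, 3.0])
y = 2.0 * x
theta = gradient_descent(x, y)
final_error = mse(y, theta * x)
loss = categorical_cross_entropy(np.array([0.0, 1.0, 0.0]), np.array([0.1, 0.8, 0.1]))
```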

4 Model creation based on the GDO-LSTM algorithm in sports action recognition

4.1 GDO-LSTM sports monitoring system development and model parameter determination

This study selected a data set from Hainan University and selected 1,000 data items as the training set to train the model. The performance of the model was judged by the indicators of accuracy and error rate. In the recognition of sports movements, a three-axis magnetometer was used to calibrate sports movements. In the experiment, the specific hardware and parameters are shown in Table 1.

Table 1: Experimental parameters

Parameter | Value
Graphics processing unit | NVIDIA Tesla M60
Internal storage | 2T*2
Operating system | 128Ubuntu 21.02.20
Internal storage | 512 G
Flash memory | CUDA Toolkit 23.0
Operator | Python 6
Database | MySQL 5.20.2023
Operating system | Ubuntu 88.64
Web development framework | Django 1.22.3
Language | Easy Chinese
Display card | 6.0 GHz
CPU | Intel penta-core

The collected data set required further processing before the algorithm could learn from it. GDO-LSTM was used for iterative optimization on the data set. To verify its accuracy, it was compared with the traditional particle swarm optimization (PSO) algorithm, yielding two plots: accuracy versus training set and error rate versus training set, as shown in Figure 5 [18].

Figure 5: Comparison of (a) accuracy-training set image and (b) error rate-training set image.

From Figure 5, the accuracy of GDO-LSTM was lower than that of PSO, and its error rate higher, until 200 iterations. Beyond 200 iterations, however, the accuracy of GDO-LSTM reached 1.1–1.5 times that of PSO. Although more iterations increase the cost, accuracy was given priority for the sake of scientific rigor. After training of the GDO-LSTM algorithm was completed, the wearing of the motion sensor during the test had to be considered. The motion sensor measured three sets of data during the test: the acceleration of the object along the three axes of motion. The algorithm was applied with a three-axis magnetometer, which was calibrated using sitting and lying-flat postures; the calibration diagram is shown in Figure 6.

Figure 6: Calibration of the three-axis magnetometer. (a) Sitting correction of three-axis magnetometer and (b) flat-lying correction of three-axis magnetometer.

From Figure 6, after the change from sitting to lying position, the acceleration of the X- and Y-axis would decrease due to the change in position, and the acceleration of the Z-axis would increase instead of decrease because of the change in height. After zeroing, the three-axis magnetometer had to be tested for sensitivity, and it was clear that there was a significant difference between it at rest and during weight lifting, as shown in Figure 7 [19].

Figure 7: Three-axis acceleration of static weight lifting. (a) Static acceleration–time and (b) weight lifting acceleration–time.

From Figure 7, in the stationary case (Figure 7(a)), the X-, Z-, and Y-axis accelerations were ordered from top to bottom. The Z-axis acceleration was very close to 0 because no change in height occurred. The Y-axis acceleration hardly changed from rest to the highest point of the lift, which indicated that the tester did not move forward or backward. However, the X- and Z-axis accelerations changed considerably, with the mean value of the Z-axis above that of the X-axis, because weight lifting changes the up-and-down acceleration more than the left-and-right acceleration. At 97 s, when the lift reached its highest point, the Z-axis acceleration reached its maximum, so curves of this shape were attributed to weight lifting. The data processed by the forget gate could be loaded into the three-axis magnetometer to generate a baseline image for reference, as shown in Figure 8.

Figure 8: The base image of the three-axis magnetometer.

4.2 Experimental validation of the GDO-LSTM algorithm for sports action recognition

The calibration and sensitivity testing of a three-axis gyroscope, which works by measuring angular velocity and thus estimating the state of motion, is similar to that of a three-axis magnetometer. To be able to have objective sensor test results for sensor accuracy comparison, this study also explored the comparison of the accuracy of the three-axis magnetometer and gyroscope tests. It was first evaluated for simple motion, as shown in Figure 9.

Figure 9: Comparison between the three-axis magnetometer and gyroscope in skipping rope.

From Figure 9, for the three-axis magnetometer, the changes in the X- and Y-axis were very regular because jumping rope repeats the same action. The changes in the Z-axis were not obvious because the jumping height varied, which is consistent with how people jump rope. For the three-axis gyroscope, the angular velocity did not change much during jumping rope, so the measured data showed no obvious change in the plot, and recognition of the action was poor. To extend the experiments to three more complex motions, Figure 10 shows, for each test, the curve with the most obvious change from the three-axis magnetometer and gyroscope, which was used to analyze the accuracy of the two instruments.

Figure 10: Extensive accuracy test of the three-axis magnetometer and gyroscope.

From the extensive tests of the accuracy of the three-axis magnetometer and gyroscope in Figure 10, in the skating motion recognition, both instruments performed very well. In contrast, among the recognition of swimming and shot put, the test accuracy of the three-axis magnetometer was obviously inferior to that of the three-axis gyroscope because the angular velocity changes in these two movements were not as obvious as the acceleration. To be able to analyze intuitively, the matrix shown in Figure 11 was established to be able to analyze more clearly the effect of the three-axis magnetometer and gyroscope in practical applications.

Figure 11: Three-axis magnetometer and gyroscope test matrix. (a) Extensive test matrix of three-axis magnetometer and (b) extensive test matrix of three-axis gyroscope.

The matrix in Figure 11 shows the recognition ability of the three-axis magnetometer and gyroscope. The test effect of the three-axis gyroscope was not very good, and there was a lot of confusion in the identification of swimming and skating. The three-axis magnetometer was accurate in the identification of four common sports most times, but there was some confusion between rope skipping and swimming, as well as shot put and skating, but it was enough to meet the equipment accuracy requirements.
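As a worked check on how a per-action accuracy follows from one row of such a test matrix, the sketch below uses the counts quoted in the Conclusion for the 72 skating trials and the 85 shot put trials. The function and variable names are hypothetical, and the count of 80 correct shot put trials is inferred as 85 minus the 5 trials identified as skating.

```python
def row_accuracy(row, true_label):
    """Share of trials in one test-matrix row that were assigned the true label."""
    total = sum(row.values())
    return row[true_label] / total

# Counts quoted in the Conclusion (72 skating trials, 85 shot put trials).
skating_row = {"jump rope": 4, "swimming": 27, "skating": 17, "shot put": 24}
shot_put_row = {"skating": 5, "shot put": 80}  # 80 inferred as 85 - 5

skating_accuracy = round(100 * row_accuracy(skating_row, "skating"), 1)
shot_put_accuracy = round(100 * row_accuracy(shot_put_row, "shot put"), 1)
```

These two rows reproduce the 23.6% and 94.1% figures reported for skating and shot put, respectively.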

5 Conclusion

To identify sports actions quickly and accurately in the face of growing sports action data, this study developed a fusion algorithm, GDO-LSTM, based on GDO and LSTM. The MSE and H of the data were processed using GDO, and Lagrange multipliers and KNM were applied to process linearly separable and non-separable data, respectively. Weight processing was added to the LSTM to prevent weight explosion. Comparison with PSO showed that at least 200 iterations of GDO-LSTM were needed. Zeroing and sensitivity tests were required before using the three-axis magnetometer and gyroscope for detection. In the sensitivity test of the three-axis magnetometer, the Z-axis acceleration was 0 for sitting and standing, the X-axis acceleration was 0.25, and the Y-axis acceleration fluctuated from −0.25 to −0.75 because of gravity. Weight lifting was selected for sensitivity detection: the X-axis acceleration reached a maximum value of 0.47 at 97 s, and the Z-axis acceleration reached a minimum value of −0.25 at 45 s. The times and values may vary, but the curve characteristics were unique. A three-axis gyroscope was used in the experiment to test the recognition accuracy of four common sports actions and was compared with a three-axis magnetometer. The experiments show that the three-axis gyroscope had its worst recognition result for skating: of the 72 trials, the action was recognized correctly only 17 times, as swimming 27 times, as shot put 24 times, and as jumping rope 4 times. Its best recognition result was for shot put, where all 85 recognition results were accurate. Because of the change in angular velocity associated with the drop of the shot, the three-axis magnetometer showed a slight deviation for shot put, with five trials identified as skating.
For the three-axis gyroscope, jumping rope was recognized as skating ten times, but the accuracy rate was still high. However, the recognition of swimming was poor: it was recognized correctly only 38 times, as skipping rope 20 times, and as shot put 31 times. The recognition accuracies of the three-axis magnetometer for the four sports actions of jumping rope, swimming, skating, and shot put were 98.7, 100, 100, and 94.1%, respectively, while the accuracies of the three-axis gyroscope were 80.5, 40.4, 23.6, and 100%, respectively. These results show that GDO-LSTM can effectively identify multiple human movements while maintaining a high level of accuracy. The algorithm handles data fluctuation more robustly and reduces the loss of accuracy caused by environmental factors. It has great potential in practical application scenarios such as intelligent fitness and sports training monitoring. The proposed method can provide users with timely analysis of sports performance, help them adjust their sports strategies and improve training results, and play an important role in the field of sports action recognition. Although there are some deviations in the recognition of jumping rope and shot put, the method already meets user requirements. Research on sports movement data remains limited because such data are private and not suitable for widespread dissemination. With more volunteers, future studies can be improved.

  1. Funding information: The authors state no funding involved.

  2. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and consented to its submission to the journal, reviewed all the results and approved the final version of the manuscript. P.C.: wrote the original draft, participated in the literature search and analyses, evaluations and manuscript preparation, and wrote the paper. J.P.: conceived and designed the manuscript, interpreted the data, and participated in project administration, including resources, software, validation, visualization, conceptualization, investigation, and methodology.

  3. Conflict of interest: The authors state no conflict of interest.

  4. Data availability statement: Data sharing not applicable to this article as no datasets were generated or analysed during the current study.

References

[1] Alaoui B, Bari D, Ghabbar Y. Surface weather parameters forecasting using analog ensemble method over the main airports of Morocco. J Meteorol Res. 2022;36(6):866–81. doi:10.1007/s13351-022-2019-0

[2] Yang J, Yagiz S, Liu YJ, Laouafa F. Comprehensive evaluation of machine learning algorithms applied to TBM performance prediction. Undergr Space. 2022;7(1):37–49. doi:10.1016/j.undsp.2021.04.003

[3] Hasanpour R, Rostami J, Schmitt J, Ozcelik Y, Sohrabian B. Prediction of TBM jamming risk in squeezing grounds using Bayesian and artificial neural networks. J Rock Mech Geotech Eng. 2020;12:21–31. doi:10.1016/j.jrmge.2019.04.006

[4] Liu L, Zhou W, Gutierrez M. Effectiveness of predicting tunneling-induced ground settlements using machine learning methods with small datasets. J Rock Mech Geotech Eng. 2022;14(4):1028–41. doi:10.1016/j.jrmge.2021.08.018

[5] Xu H, Yan R. Research on sports action recognition system based on cluster regression and improved ISA deep network. J Intell Fuzzy Syst: Appl Eng Technol. 2020;39(4Pta2):5871–81. doi:10.3233/JIFS-189062

[6] Sarabu A, Santra AK. Human action recognition in videos using convolution long short-term memory network with spatio-temporal networks. Emerg Sci J. 2021;5(1):25–33. doi:10.28991/esj-2021-01254

[7] Muhammad K, Ullah A, Imran AS, Sajjad M, Kiran MS, Sannino G, et al. Human action recognition using attention-based LSTM network with dilated CNN features. Future Gener Comput Syst. 2021;125:820–30. doi:10.1016/j.future.2021.06.045

[8] Xu S, Rao H, Peng H, Jiang X, Hu B. Attention based multi-level co-occurrence graph convolutional LSTM for 3D action recognition. IEEE Internet Things J. 2020;8(21):15990–6001. doi:10.1109/JIOT.2020.3042986

[9] Zhang Z, Lv Z, Gan C, Zhu Q. Human action recognition using convolutional LSTM and fully-connected LSTM with different attentions. Neurocomputing. 2020;410:304–16. doi:10.1016/j.neucom.2020.06.032

[10] Naveenkumar M, Domnic S. Learning representations from quadrilateral based geometric features for skeleton-based action recognition using LSTM networks. Intell Decis Technol. 2020;14(1):47–54. doi:10.3233/IDT-190078

[11] Furnari A, Farinella G. Rolling-unrolling LSTMs for action anticipation from first-person video. IEEE Trans Pattern Anal Mach Intell. 2020;43(11):4021–36. doi:10.1109/TPAMI.2020.2992889

[12] Yu M, Ning C, Xue Y. Brain medical image fusion scheme based on shuffled frog leaping algorithm and adaptive pulse coupled neural network. Image Process. 2020;6(15):1203–9. doi:10.1049/ipr2.12092

[13] Li J, Zhang W, Diao W, Feng Y, Sun X, Fu K. CSF-Net: Color spectrum fusion network for semantic labeling of airborne laser scanning point cloud. IEEE J Sel Top Appl Earth Obs Remote Sens. 2022;15:339–52. doi:10.1109/JSTARS.2021.3133602

[14] Feng T, Wang C, Zhang J, Wang B, Jin YF. An improved artificial bee colony-random forest (IABC-RF) model for predicting the tunnel deformation due to an adjacent foundation pit excavation. Undergr Space. 2022;7(4):514–27. doi:10.1016/j.undsp.2021.11.004

[15] Singh S, Gupta D. Multistage multimodal medical image fusion model using feature‐adaptive pulse coupled neural network. Int J Imaging Syst Technol. 2020;31(2):981–1001. doi:10.1002/ima.22507

[16] Wang L, Zhang J, Liu Y, Mi J, Zhang J. Multimodal medical image fusion based on Gabor representation combination of multi-CNN and fuzzy neural network. IEEE Access. 2021;9:67634–47. doi:10.1109/ACCESS.2021.3075953

[17] Polinati S, Bavirisetti DP, Rajesh KN, Dhuli R. Multimodal medical image fusion based on content-based and PCA-sigmoid. Curr Med Imaging. 2022;18(5):546–62. doi:10.2174/1573405617666211004114726

[18] Hou S, Liu Y, Yang Q. Real-time prediction of rock mass classification based on TBM operation big data and stacking technique of ensemble learning. J Rock Mech Geotech Eng. 2022;14(1):123–43. doi:10.1016/j.jrmge.2021.05.004

[19] Wang D, Zhao H, Li Q. Medical brain image classification based on multi-feature fusion of convolutional neural network. J Intell Fuzzy Syst. 2020;38(1):127–37. doi:10.3233/IFS-179387

Received: 2024-08-02
Revised: 2024-10-10
Accepted: 2024-10-16
Published Online: 2025-02-25

© 2025 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 24.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/nleng-2024-0050/html