Article Open Access

Cross-modal multi-label image classification modeling and recognition based on nonlinear

  • Shuping Yuan, Yang Chen, Chengqiong Ye, Mohammed Wasim Bhatt, Mhalasakant Saradeshmukh and Md Shamim Hossain
Published/Copyright: January 24, 2023

Abstract

Predicting the labels that co-occur in a picture has recently become a popular strategy in multi-label image recognition. Previous work has concentrated on capturing label correlation but has neglected to fuse picture features and label embeddings correctly, which substantially reduces the model’s convergence efficiency and limits further improvement in multi-label recognition accuracy. To better classify labeled training samples of the corresponding categories in image classification, a nonlinear cross-modal multi-label image classification modeling and recognition method is proposed. Multi-label classification models based on deep convolutional neural networks are constructed for the visual and text modalities, respectively. The visual classification model uses natural images and simple single-label biomedical images to achieve heterogeneous and homogeneous transfer learning, capturing the general features of the general domain and the domain-specific features of the biomedical field, while the text classification model uses the description text of simple biomedical images to achieve homogeneous transfer learning. The experimental results show that the multi-label classification model combining the two modalities obtains a Hamming loss similar to the best performance on the evaluation task, and the macro average F1 value increases from 0.320 to 0.488, which is about 52.5% higher. The cross-modal multi-label image classification algorithm better alleviates the overfitting of majority classes and has better cross-modal retrieval performance. In addition, the effectiveness and rationality of the two cross-modal mapping techniques are verified.

1 Introduction

In a traditional single-label image classification system, each image is annotated with a single category. To identify such images accurately, one must learn a classifier from a known training data set and then use it to classify the test images. The category of a test image must have appeared in the training phase [1]. Image classification has a wide range of uses. It may be used to differentiate between locations according to their land use, and land-use information is frequently used in urban planning. High-resolution imagery is also used to examine the effects and damage caused by natural disasters such as floods, volcanic eruptions, and severe droughts [2,3]. In practice, training data and labeling information are often difficult to obtain. On the one hand, there are many kinds of things in the world, and their number continues to increase. On the other hand, a given category can be further subdivided into many subcategories; for example, dogs can be subdivided into Tibetan mastiffs, pugs, huskies, etc. [4]. Therefore, labeling all objects and then classifying them is inefficient and unrealistic. Because of the unequal information between image and label, traditional single-label classification scales poorly and struggles to meet practical requirements. The emergence of zero-shot learning (ZSL) solves the problem of missing labels to a certain extent. Its aim is to mimic the human ability to identify new categories without having seen actual visual examples. Humans have this ability because they can make semantic connections between categories they have never seen and ones they have seen. For example, a child who can recognize a horse but has never seen a zebra can be told that a zebra is similar to a horse but with black and white stripes [5]. Then, when the child sees a zebra, it has a better chance of identifying it accurately. ZSL recognition in machines is predicated on the availability of a labeled training set of seen classes as well as knowledge of how each unseen class is semantically linked to the seen classes [6,7]. Similarly, a zero-shot image classification system establishes a mapping between the visual space and the semantic space through labeled training data, namely, the defined categories, and assigns category labels to the test data of unseen categories according to the visual and semantic connections between the training data and the test data.

Early multi-label classification algorithms treat each label individually, decomposing the problem into several independent single-label classification problems. Deep convolutional neural networks (CNNs) have been developed for picture classification, and the precision of image recognition techniques has been improved using CNNs and their variants [8,9]. Deep neural networks (DNNs), specifically CNNs, have been widely used in image and text categorization since 2012, with impressive results. Studies of CNN-based medical image classification have reported results equivalent to those of a human expert [10,11]. As in general-domain multi-label classification, we found that a medical image multi-label classification model mainly faces three challenges: unbalanced label distribution in the data set, an annotated data set that is too small, and label dependence. First, when authors of medical literature illustrate a medical problem, they tend to combine semantically related images of various modalities into composite images. Subfigures of different sizes may be distributed in various positions, and the number of composite image instances corresponding to each label is not balanced [12]. Therefore, multi-label medical image classification is more complicated than single-label medical image classification. Second, a small data set combined with a large number of model parameters can easily lead to overfitting. Moreover, data set annotation is very costly, and manually annotating medical image data sets with millions of images would consume a lot of manpower and material resources [13]. In this respect, multi-label classification is quite useful in medical data analysis. It covers topics such as diagnosis, surgery, biology, disease progression, assessment, and teaching. Many patients, including those with eye illnesses, have several diseases that manifest themselves at the same time, sometimes affecting the same function [14,15]. Multi-label classification, on the other hand, is by definition a tough task. This is due to the high dimensionality, flatness, and imbalance of the data. Label difficulties include label dependency, location, interlabel diversity, and familiarity [16,17].

Zhang et al. proposed an unsupervised domain-adaptive image-to-video person re-identification model based on a cross-modal feature generation and target information preserving transfer network (CMGTN). On the one hand, the generator in their model not only transforms the unlabeled sample features of the target domain into the feature space of the source domain but also retains the identity information of the target. On the other hand, the gap between pedestrian images and videos is closed by embedding cross-modal loss terms. To evaluate the approach, they conducted extensive experiments on the PRID-2011, iLIDS-VID, and MARS data sets and compared it with existing state-of-the-art image-to-video person re-identification models, including four unsupervised approaches and three supervised approaches. The experimental results show the effectiveness of the method [18].

Modeling label dependencies to predict the labels that co-occur in a picture has now become a popular strategy in multi-label image recognition. Previous research has focused on capturing the link between labels but has failed to integrate image features and label embeddings effectively, which reduces the model’s convergence efficiency and prevents further gains in multi-label image recognition accuracy. Based on this, this article proposes a nonlinear cross-modal multi-label image classification method. A convolutional neural network based on mixed transfer learning is constructed to carry out multi-label classification. That is to say, heterogeneous transfer learning is used to overcome the overfitting caused by labeled data sets that are too small, and homogeneous transfer learning is used to weaken the negative impact of unbalanced data sets.

The presented work achieves heterogeneous and homogeneous transfer learning by using natural images and simple single-label biomedical images, capturing the general features of the general domain and the domain-specific features of the biomedical field, whereas the text classification model achieves homogeneous transfer learning by using the description text of simple biomedical images. In addition, assuming that the current image may be related to every label, the relevance probabilities of all labels are predicted, and the prediction space is narrowed by the label calibration algorithm to finally determine the set of relevant labels [19]. Experimental results show that the proposed method is suitable for extracting subfigure modality information from composite images in the biomedical literature, better alleviates overfitting, and has good cross-modal retrieval performance. In addition, the effectiveness and rationality of the cross-modal mapping technique are verified. The rest of the article is organized as follows. Section 2 discusses the research methodology. Section 3 analyzes the results. Section 4 concludes the article.

2 Research methods

2.1 Cross-modal multi-label classification algorithm

Modeling label dependencies to predict the labels that co-occur in a picture has become a popular strategy in multi-label image recognition. Previous research has concentrated on capturing label correlation but has failed to fuse picture features and label embeddings adequately, which substantially reduces the model’s convergence efficiency and limits further improvement in the average accuracy of multi-label image recognition. A single semantic label is usually insufficient to characterize multimedia data (images, text, etc.). Cross-modal retrieval systems previously relied on single-label multimodal data sources; however, several multi-label data sets with multiple modalities have recently been introduced. Multi-label data sets introduce a natural many-to-many relationship across modalities; i.e., each data object in one modality is closely linked to several data points in the other modality. Any multi-label cross-modal retrieval system that learns a common cross-modal subspace should be able to capture such correlations. Suppose X represents the instance space and L represents the label set containing q labels. The training set is given by formula (1):

(1) $T = \{(x_1, Y_1), (x_2, Y_2), \ldots, (x_n, Y_n)\} \quad (x_i \in X,\ Y_i \subseteq L)$.

The goal is to learn a multi-label classifier $h : X \rightarrow 2^{L}$. For convenience, however, a real-valued scoring function $f : X \times L \rightarrow \mathbb{R}$ is learned instead. Given an instance $x_i$ and its set of associated labels $Y_i$, $f(x_i, y)$ should output a greater posterior probability for labels $y \in Y_i$. In other words, $f(x_i, y_1) > f(x_i, y_2)$ should hold for any $y_1 \in Y_i$ and $y_2 \notin Y_i$. Using the scoring function $f(\cdot, \cdot)$, the multi-label classifier can be obtained as shown in formula (2):

(2) $h(x_i) = \{\, y \mid f(x_i, y) > t,\ y \in L \,\}$.

Here, t can be a constant (for example, 0.5) or a function that infers the threshold from the training set; it divides the label space into a relevant label set and an irrelevant label set.
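The thresholding rule in formula (2) can be illustrated with a minimal sketch. The function name `predict_labels` and the example scores are hypothetical and chosen only for illustration; the class codes are taken from Table 1.

```python
from typing import Dict, Set


def predict_labels(scores: Dict[str, float], t: float = 0.5) -> Set[str]:
    """Return h(x) = {y in L : f(x, y) > t} for a single instance.

    `scores` maps each label y to its score f(x, y).
    """
    return {label for label, score in scores.items() if score > t}


# Hypothetical scores f(x, y) for one image over three class codes.
scores = {"DRCT": 0.91, "DRXR": 0.62, "GFIG": 0.08}
predicted = predict_labels(scores, t=0.5)
```

With a constant threshold of 0.5, only the labels whose score exceeds 0.5 end up in the predicted set; raising the threshold shrinks the set and may leave it empty.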

2.2 Transfer learning model

The multi-label transfer learning model includes text and visual parts, as shown in Figure 1.

Figure 1: Architecture of cross-modal multi-label classification model.

2.2.1 Image model

At present, the very deep convolutional neural network ResNet-50, a deep residual network with a depth of 50 layers, has achieved excellent results in natural image recognition. The network was originally designed for multi-class classification, so the ResNet structure must be fine-tuned to accommodate multi-label classification. The loss function is replaced with binary cross-entropy, and the SoftMax of the last layer is replaced with the sigmoid activation function:

(3) $\mathrm{sigmoid}(x) = \dfrac{1}{1 + \exp(-x)}$.

It is used to estimate the posterior probability that each label is relevant. X denotes the n samples in the training set, the learning rate is controlled by the Adamax optimizer, the model is trained with mini-batches of 32 random images, and the weight w is updated iteratively to minimize the loss function:

(4) $L(w, X) = \dfrac{1}{n} \sum_{i=1}^{n} l(f(x_i, w), y_i)$,

where $x_i$ is the ith sample in training set X. For weight w, the predicted probability vector over the relevant categories of the sample is denoted $\hat{y}_i = f(x_i, w)$; $y_i$ is the true relevance vector of the ith sample, represented as a binary indicator vector; and $l(\cdot, \cdot)$ is the binary cross-entropy computed element-wise, which serves as the penalty function, as shown in Eq. (5):

(5) $l(\hat{y}_i, y_i) = -\sum_{j=1}^{q} \left( y_{ij} \log \hat{y}_{ij} + (1 - y_{ij}) \log (1 - \hat{y}_{ij}) \right)$,

where $\hat{y}_{ij}$ is the jth element of the vector $\hat{y}_i$, representing the predicted probability that the jth category is relevant, and $y_{ij}$ is the jth element of the vector $y_i$, indicating whether the jth category is irrelevant or relevant to sample $x_i$ (0 or 1, respectively). When training the image model, mixed transfer learning is adopted. First, heterogeneous transfer learning is used to learn from massive natural images, which alleviates overfitting while keeping the model sensitive to general image features such as color, texture, and shape. Specifically, ResNet-50 is built with Keras, and the network weights published by the Keras authors are loaded to obtain a network pre-trained on the natural image data set ImageNet. Second, homogeneous transfer learning is applied to simple single-label biomedical images. A biomedical composite image contains several modalities, and although the training labels are given, they are not tied to a specific subfigure; by learning from single-label images, the model acquires the image content associated with each label, which weakens the negative impact of the unbalanced label distribution of the composite image data set. Specifically, the weights of most layers of the pre-trained ResNet-50 are frozen, and the single-modality medical image data sets of ImageCLEF2013 and ImageCLEF2016 are used to retrain the top fully connected layer. Finally, the network obtained from this two-step transfer learning is trained on the multi-label medical image data, and multi-label classification is carried out to predict the posterior probability that each label is relevant.
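The sigmoid output and the binary cross-entropy loss of Eqs. (3)–(5) can be sketched in plain Python. This is only an illustration of the formulas, not the Keras training code used in the article; the helper names and the clipping constant `eps` are assumptions added to keep the logarithm finite.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Eq. (3), written in a numerically stable form for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def binary_cross_entropy(y_pred: Sequence[float], y_true: Sequence[int],
                         eps: float = 1e-12) -> float:
    """Eq. (5): element-wise binary cross-entropy summed over the q labels.

    `eps` clips probabilities away from 0 and 1 (an assumption of this sketch).
    """
    total = 0.0
    for p, y in zip(y_pred, y_true):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total


def mean_loss(batch_pred: Sequence[Sequence[float]],
              batch_true: Sequence[Sequence[int]]) -> float:
    """Eq. (4): average per-sample loss over the batch."""
    losses = [binary_cross_entropy(p, y) for p, y in zip(batch_pred, batch_true)]
    return sum(losses) / len(losses)


# Hypothetical logits for one sample with q = 3 labels and its true vector.
probs = [sigmoid(v) for v in (2.0, -1.0, 0.0)]
loss = binary_cross_entropy(probs, [1, 0, 1])
```

Because each label has its own sigmoid, the probabilities need not sum to one, which is what allows several labels to be relevant for the same image.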

2.3 Nonlinear cross-modal label calibration algorithm

For a test-set instance $x_i$ $(1 \le i \le m)$, the given classifier outputs the set of posterior probabilities $P = \{\, p_j \mid p_j \in \mathbb{R},\ 1 \le j \le q \,\}$. The threshold calibration function t is then used to obtain the set of predicted labels $Y_i = \{\, y \mid y \in L \,\}$, where the indicator $y_j$ of the jth label is given by Eq. (6):

(6) $y_j = \begin{cases} 1, & p_j \ge t \\ 0, & p_j < t \end{cases}$.

When labeling the current sample, labels whose posterior probability, as output by the image model, reaches the threshold are added to the set of relevant labels. Two methods are usually used to select the threshold value. One is the fixed threshold calibration method, which usually uses the popular constant t = 0.5. The other is the dynamic threshold, which is determined by minimizing the difference between the label cardinality of the training set and that of the labels predicted on the test set, as shown in formula (7):

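(7) $t = \arg\min_{t} \left| \mathrm{LCard}(X) - \dfrac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{q} \mathbb{1}_{p_j > t} \right|$,

where $p_j$ in the inner sum is the posterior probability of the jth label for test instance $x_i$.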

Here, $\mathrm{LCard}(X)$ is the label cardinality, the most natural way to describe a multi-label data set, namely, the average number of labels per instance, as shown in formula (8):

(8) $\mathrm{LCard}(X) = \dfrac{1}{n} \sum_{i=1}^{n} |Y_i|$.
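The dynamic threshold of formulas (7) and (8) can be sketched as a search over candidate thresholds. The candidate grid, the function names, and the toy data are assumptions of this sketch; the article does not state how the minimization is carried out.

```python
from typing import Sequence, Set


def label_cardinality(label_sets: Sequence[Set[str]]) -> float:
    """Formula (8): average number of labels per instance."""
    return sum(len(labels) for labels in label_sets) / len(label_sets)


def minimizing_lcard_threshold(train_label_sets: Sequence[Set[str]],
                               test_probs: Sequence[Sequence[float]],
                               candidates: Sequence[float]) -> float:
    """Formula (7): pick the candidate t whose predicted label cardinality
    on the test set is closest to the training label cardinality."""
    target = label_cardinality(train_label_sets)

    def gap(t: float) -> float:
        predicted = sum(sum(1 for p in row if p > t) for row in test_probs)
        return abs(target - predicted / len(test_probs))

    return min(candidates, key=gap)


# Toy data: training cardinality is 1.5; two test instances with q = 3 labels.
train_sets = [{"DRCT"}, {"DRCT", "DRXR"}]
test_probs = [[0.9, 0.45, 0.1], [0.8, 0.3, 0.2]]
best_t = minimizing_lcard_threshold(train_sets, test_probs, [0.1, 0.3, 0.5, 0.7])
```

In this toy case the threshold 0.3 yields two predicted labels for the first instance and one for the second, matching the training cardinality of 1.5 exactly.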

In this article, the cross-modal model combines the global preference (fixed threshold) method with the mean value method. Labels are first calibrated according to the threshold function (fixed threshold 0.5) and the posterior probabilities output by the image model. If the relevant label set of a sample is empty, the posterior probabilities output by the image model and the text model are averaged for every label, and the K labels with the largest average probability are taken as the relevant labels (e.g., K = 1).
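The combined calibration rule described above can be sketched as follows. The function name `calibrate_labels` and the example probabilities are hypothetical, and the sketch assumes that both models score the same label set.

```python
from typing import Dict, List


def calibrate_labels(image_probs: Dict[str, float],
                     text_probs: Dict[str, float],
                     t: float = 0.5, k: int = 1) -> List[str]:
    """Fixed threshold on the image model, with a mean-probability fallback.

    Assumes `image_probs` and `text_probs` share the same label keys.
    """
    # Step 1: global fixed threshold on the image model's posteriors (Eq. (6)).
    selected = [label for label, p in image_probs.items() if p >= t]
    if selected:
        return sorted(selected)
    # Step 2: the label set is empty, so average both models' posteriors
    # and keep the k labels with the highest mean probability.
    mean_probs = {label: (image_probs[label] + text_probs[label]) / 2.0
                  for label in image_probs}
    ranked = sorted(mean_probs, key=mean_probs.get, reverse=True)
    return ranked[:k]


# Hypothetical posteriors: no image score reaches 0.5, so the fallback applies.
image_probs = {"DRCT": 0.42, "DRXR": 0.31, "GFIG": 0.05}
text_probs = {"DRCT": 0.30, "DRXR": 0.55, "GFIG": 0.10}
labels = calibrate_labels(image_probs, text_probs)
```

The fallback guarantees that every sample receives at least one label, which matters for the F1 measure because an empty prediction contributes no true positives.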

3 Result analyses

3.1 The data set

The multi-label classification models based on deep convolutional neural networks fuse image representations with label co-occurrence embeddings, which improves the model’s convergence efficiency. Furthermore, picture recognition performance improves compared with prior systems. The cross-modal multi-label image classification approach improves cross-modal retrieval efficiency and reduces overfitting in most classes. The efficiency and rationality of the two cross-modal mapping techniques are also verified. In this study, the data set of the ImageCLEF2016 multi-label classification task was used. The training set and test set contain 1,568 and 1,083 images, respectively, each with corresponding explanatory text. The labels of this data set are the category subset of the ImageCLEF2013 modality classification task, namely the 30 classes that remain after the composite image category (COMP) is removed. The class codes and class names are shown in Table 1.

Table 1

30 class codes for multi-label classification

Class code    Class name
DRUS          Ultrasonic image
DRMB          Magnetic resonance imaging
DRCT          Computerized tomography
DRXR          Radiography
DRAN          Angiography
DRPE          Positron emission computed tomography
DRCO          Combined multi-mode image superposition
DVDM          Dermatologic image
DVEN          Endoscopic imaging
DVOR          Images of other organs
DSEE          Electroencephalogram (EEG)
DSEC          Electrocardiogram (ECG)
DSEM          Electromyography
DMLI          Optical microscope imaging
DMEL          Electron microscope imaging
DMTR          Transmission microscope imaging
DMFL          Fluorescence microscope imaging
D3DR          Three-dimensional reconstruction
GTAB          Tables and forms
GPLI          Program listing
GFIG          Statistical charts
GSCR          Screen capture
GFLO          Flow chart
GSYS          System overview
GGEN          Gene sequence map
GGEL          Gel chromatography
GCHE          Chemical structure diagram
GMAT          Mathematical formula
GNCP          Non-clinical photograph
GHDR          Hand-drawn sketches

The single-label medical image data were derived from two other medical image processing tasks, namely the modality classification task of ImageCLEF2013 and the subfigure modality classification task of ImageCLEF2016. The training set and test set of the former contain 1,796 and 1,568 samples (excluding the COMP class), respectively, while those of the latter contain 6,676 and 4,166 samples.

3.2 Data preprocessing

When an image is loaded, it is resized to 224 × 224, and the Keras preprocessing tool is used to convert it into a four-dimensional tensor. The channel-first mode is adopted, that is, (N, 3, 224, 224), where N is the number of instances. To improve the interpretability of the word vectors, as many biomedical image captions as possible are collected [20]. In addition to the captions provided by the ImageCLEF2016 training set and test set, all captions were extracted from the 300,000 medical articles in ImageCLEF2013 [21]. After serialization, the Word2Vec tool is used to train on the text to obtain the word vector dictionary, and a word vector lookup table is constructed from the description text of ImageCLEF2016. This table is used as the weights of the embedding layer of the convolutional neural network and is updated synchronously during network training.

3.3 Evaluation indicators

The evaluation indexes are divided into two types, instance-based and label-based, and two instance-based evaluation indexes are selected. Suppose the test set is as shown in formula (9):

(9) $S = \{(x_1, Y_1), (x_2, Y_2), \ldots, (x_m, Y_m)\} \quad (x_i \in X,\ Y_i \subseteq L)$.

Hamming loss (h-loss for short): this index evaluates how often instance-label pairs are misclassified. The score ranges from 0 to 1, and 0 represents the best result. The calculation formula is shown in Eq. (10):

(10) $\mathrm{hloss}(h) = \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{|h(x_i) \,\Delta\, Y_i|}{|L|}$,

where Δ represents the symmetric difference. Mathematically, the symmetric difference of two sets is the set of elements that belong to exactly one of the two sets [22]. Macro average F1 value: the F1 value is the harmonic mean of precision and recall. The larger the F1 value, the more effective the classification method. For example, when the precision is fixed, the larger the recall, the larger the F1 value, and vice versa [23]. This index can reveal whether an unbalanced label distribution causes overfitting to some labels. F1Macro is the arithmetic average of the F1 values of all labels, and its calculation is shown in Eqs. (11)–(14):

(11) $p(h) = \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{|Y_i \cap h(x_i)|}{|h(x_i)|}$,

(12) $r(h) = \dfrac{1}{m} \sum_{i=1}^{m} \dfrac{|Y_i \cap h(x_i)|}{|Y_i|}$,

(13) $F1(h) = \dfrac{2 \times p(h) \times r(h)}{p(h) + r(h)}$,

(14) $F1_{\mathrm{Macro}} = \dfrac{1}{q} \sum_{i=1}^{q} F1_i, \quad y_i \in L$.
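Both evaluation indexes can be sketched with Python sets. Hamming loss follows Eq. (10) directly. For the macro average, this sketch assumes the common per-label definition F1 = 2TP / (2TP + FP + FN) and assigns 0 to a label with no true or predicted occurrences; both choices are assumptions, since the article does not spell out the per-label computation.

```python
from typing import Sequence, Set


def hamming_loss(pred: Sequence[Set[str]], true: Sequence[Set[str]],
                 num_labels: int) -> float:
    """Eq. (10): mean size of the symmetric difference, normalized by |L|."""
    mismatches = sum(len(p ^ y) for p, y in zip(pred, true))
    return mismatches / (len(true) * num_labels)


def macro_f1(pred: Sequence[Set[str]], true: Sequence[Set[str]],
             labels: Sequence[str]) -> float:
    """Eq. (14): arithmetic mean of the per-label F1 values."""
    f1_values = []
    for label in labels:
        tp = sum(1 for p, y in zip(pred, true) if label in p and label in y)
        fp = sum(1 for p, y in zip(pred, true) if label in p and label not in y)
        fn = sum(1 for p, y in zip(pred, true) if label not in p and label in y)
        denom = 2 * tp + fp + fn
        # Assumed convention: a label that never occurs scores 0.
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(labels)


# Toy example with three labels and three test instances.
labels = ["DRCT", "DRXR", "GFIG"]
true = [{"DRCT"}, {"DRCT", "DRXR"}, {"GFIG"}]
pred = [{"DRCT"}, {"DRCT"}, {"DRXR", "GFIG"}]
h_loss = hamming_loss(pred, true, num_labels=len(labels))
f1_macro = macro_f1(pred, true, labels)
```

In this toy case one rare label is always missed, which barely moves the Hamming loss but pulls the macro average F1 down by a third, illustrating why the macro average is sensitive to unbalanced labels.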

3.4 Experimental results and discussion

3.4.1 Performance comparison of multi-label classification algorithms

The two best results of the ImageCLEF2016 multi-label classification task were selected as the baselines for comparison. Both are multi-label classification models based on image content. AlexNet was pre-trained on ImageNet, and transfer learning was carried out on the current data set. According to the maximum score or maximum posterior probability of the SVM output, a single label is calibrated as the only relevant label [24]. As shown in Table 2, the benchmark algorithms achieve good performance, with the lowest Hamming loss of 0.0131 and the highest macro average F1 value of 0.320.

Table 2

Performance comparison of ImageCLEF2016 multi-label classification algorithms

Methods                  10FCV H-Loss   10FCV F1Macro   Test H-Loss   Test F1Macro
BMET MLC1 [11]           -              -               0.0131        0.295
BMET MLC2 [11]           -              -               0.0135        0.320
Hetero_TL_V              0.0281         0.171           0.0242        0.237
Hybrid_TL_V              0.0224         0.316           0.0160        0.482
No_TL_T                  0.0365         0.082           0.0364        0.024
Homo_TL_T                0.0329         0.117           0.0239        0.185
Hybrid_TL_Cross-Modal    0.0224         0.333           0.0157        0.488

Compared with the most advanced algorithms in this field, the Hybrid_TL_Cross-Modal algorithm based on hybrid transfer learning proposed in this article has a similar Hamming loss and can calibrate label information accurately, with a Hamming loss as low as 0.0157. The macro average F1 value increases by 52.5% to 0.488. In this article, heterogeneous transfer learning is adopted to learn the general characteristics of the general domain from massive natural images, which alleviates the overfitting caused by a data set that is too small. A homogeneous transfer learning mechanism is used to learn more domain-specific features from single-label medical images, which better alleviates the overfitting of some majority labels caused by the unbalanced label distribution.
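The 52.5% figure can be reproduced from Table 2, assuming the reference point is the best benchmark Test F1Macro (BMET MLC2, 0.320):

```latex
\[
\frac{0.488 - 0.320}{0.320} = \frac{0.168}{0.320} = 0.525 = 52.5\%
\]
```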

3.4.2 Cross-modal label calibration

The time complexity of cross-modal retrieval can be divided into two parts, that is, the computation time of the similarity matrix and the sorting time of the retrieved data. Figure 2 compares only the retrieval cost of Minimizing_LCard with that of TH_0.5 on the test set. Figure 2 indicates that the proposed approach is well suited to extracting subfigure modality information from composite images in the biomedical literature, better alleviates overfitting, and has high cross-modal retrieval performance.

Figure 2: Comparison of the retrieval cost of Minimizing_LCard and th_0.5 on the test set.

Traditional methods were considered, and a comparative analysis was carried out to justify the originality of the work. Many methods for exploiting label dependencies have been investigated in order to reduce the label prediction space. Deep convolutional architectures have been used to learn an approximate top-k ranking objective function for multi-label picture recognition. In certain cases, CNNs have been combined with RNNs to represent label dependencies sequentially by embedding semantic labels into vectors. The fixed threshold method selects labels with a fixed threshold value of 0.5. If labels are calibrated according to the mean of the prediction probabilities output by the two modal models, this calibration method obtains promising multi-label classification performance, namely, a Hamming loss of 0.0161 and a macro average F1 value of 0.470 (as shown in Table 3). The label calibration method adopted in this article, TH_0.5, combines the globally preferred fixed threshold method with the highest mean probability method. As can be seen from Table 3, both its Hamming loss and its macro average F1 value on the test set are better than those of the fixed threshold method Threshold_0.5, and its macro average F1 value is higher than that of all other methods.

Table 3

Performance comparison of threshold calibration algorithms

Methods                10FCV H-Loss   10FCV F1Macro   Test H-Loss   Test F1Macro
Minimizing_LCard       0.0267         0.348           0.0206        0.477
Threshold_0.5          0.0226         0.326           0.0161        0.470
Highest_Probability    0.0226         0.287           0.0150        0.438
TH_0.5                 0.0224         0.333           0.0157        0.488

Table 3 compares the proposed calibration method with several alternatives. Across the test settings, the proposed method performs well, and the macro average F1 value increases from 0.320 to 0.488, which is about 52.5% higher. The dynamic threshold method minimizes the difference between the label cardinality of the predicted label set and that of the training label set. The Minimizing_LCard method (Table 3) obtains a dynamic threshold of 0.296. With this threshold, the Hamming loss and macro average F1 are 0.0206 and 0.477, respectively, which are not as good as those of TH_0.5. The dynamic threshold method thus reaches a high macro average F1 value of 0.477, showing certain potential, but its Hamming loss of 0.0206 is 31.2% higher than that of the method in this article. To find the reason, the label cardinalities of the training set and the test set were checked; they are 1.46 and 1.25, respectively, which differ noticeably. Because of this gap, selecting the threshold by minimizing the label cardinality difference cannot give full play to the advantages of the dynamic threshold method on the current data set. The proposed label calibration method, TH_0.5, achieves a lower Hamming loss of 0.0157 and the highest macro average F1 value of 0.488.
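The 31.2% gap in Hamming loss follows from the Test column of Table 3, taking the TH_0.5 value as the reference:

```latex
\[
\frac{0.0206 - 0.0157}{0.0157} = \frac{0.0049}{0.0157} \approx 0.312 = 31.2\%
\]
```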

4 Conclusions

This article studies multi-label image classification of complex natural scenes. Considering that labels are often missing in practice and that manual annotation is costly, it introduces the ZSL mechanism, analyzes the zero-shot multi-label image classification problem in depth, and puts forward a unified learning framework for this class of problems. The corresponding algorithm is designed and improved for each module of the framework. This article proposes a nonlinear cross-modal multi-label image classification method that captures modality features from both the image content and the related description text. After the multi-label classification models of the two modalities are fused, biomedical modality labels are calibrated more effectively than with existing methods. Experimental results show that the proposed method is suitable for extracting subfigure modality information from composite images in the biomedical literature, better alleviates overfitting, and has good cross-modal retrieval performance. In addition, the effectiveness and rationality of the cross-modal mapping technique are verified. Furthermore, picture recognition performance is improved compared with state-of-the-art techniques. In the future, we will add an attention mechanism to our model to extract more accurate image attributes and increase image recognition performance.

  1. Funding information: The authors state no funding involved.

  2. Author contributions: All authors have accepted responsibility for the entire content of this manuscript and approved its submission.

  3. Conflict of interest: The authors state no conflict of interest.

  4. Data availability statement: The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

Reference

[1] Xiao, X, Yang J, Ning X. Research on multimodal emotion analysis algorithm based on deep learning. J Phys Conf Ser. 2021;1802(3):032054.10.1088/1742-6596/1802/3/032054Search in Google Scholar

[2] Chen Z, Cong B, Hua Z, Cengiz K, Shabaz M. Application of clustering algorithm in complex landscape farmland synthetic aperture radar image segmentation. J Intell Syst. 2021;30(1):1014–25. 10.1515/jisys-2021-0096.Search in Google Scholar

[3] Chaudhury S, Shelke N, Sau K, Prasanalakshmi B, Shabaz M. A novel approach to classifying breast cancer histopathology biopsy images using bilateral knowledge distillation and label smoothing regularization. Comput Math Methods Med. 2021;2021:4019358. 10.1155/2021/4019358.Search in Google Scholar PubMed PubMed Central

[4] Wang D, Mao K. Task-generic semantic convolutional neural network for web text-aided image classification. Neurocomputing. 2019;329(FEB.15):103–15.10.1016/j.neucom.2018.09.042Search in Google Scholar

[5] Liu Y, Xie Y, Yang J, Zuo X, Zhou B. Target classification and recognition for high-resolution remote sensing images: using the parallel cross-modal neural cognitive computing algorithm. IEEE Geosci Remote Sens Mag. 2020;8(3):50–62.10.1109/MGRS.2019.2949353Search in Google Scholar

[6] Jagota V, Luthra M, Bhola J, Sharma A, Shabaz M. A secure energy-aware game theory (SEGaT) mechanism for coordination in WSANs. Int J Swarm Intell Res. 2022;13(2):1–16. 10.4018/ijsir.287549.Search in Google Scholar

[7] Tang S, Shabaz M. A new face image recognition algorithm based on cerebellum-basal ganglia mechanism. J Healthc Eng. 2021:2021;3688881.10.1155/2021/3688881Search in Google Scholar PubMed PubMed Central

[8] Wang Y, Xie Y, Liu Y, Zhou K, Li X. Fast graph convolution network based multi-label image recognition via cross-modal fusion. Proceedings of the 29th ACM International Conference on Information & Knowledge Management; 2020 Oct 19–23; Online. ACM International, 2020. p. 1575–84.10.1145/3340531.3411880Search in Google Scholar

[9] Duan Y, Chen N, Zhang P, Kumar N, Chang L, Wen W. MS2GAH: Multi-label semantic supervised graph attention hashing for robust cross-modal retrieval. Pattern Recognit. 2022;128:108676.10.1016/j.patcog.2022.108676Search in Google Scholar

[10] Sharma A, Ansari MD, Kumar R. A comparative study of edge detectors in digital image processing. 2017 4th International Conference on Signal Processing, Computing and Control (ISPCC); 2017 Sep 21–23; Solan, India. IEEE; 2018. p. 246–50. 10.1109/ISPCC.2017.8269683.

[11] Bhola J, Soni S. Information theory-based defense mechanism against DDOS attacks for WSAN. In: Harvey D, Kar H, Verma S, Bhadauria V, editors. Advances in VLSI, Communication, and Signal Processing. Lecture Notes in Electrical Engineering. Vol. 683. Singapore: Springer; 2021. 10.1007/978-981-15-6840-4_55.

[12] Gu J, Liu B, Li X, Wang P, Wang B. Cross-modal representations in early visual and auditory cortices revealed by multi-voxel pattern analysis. Brain Imaging Behav. 2020;14(5):1908–20. 10.1007/s11682-019-00135-2.

[13] Liu L, Zhang H, Zhou D. Clothing generation by multi-modal embedding: a compatibility matrix-regularized GAN model. Image Vis Comput. 2021;107(8):104097. 10.1016/j.imavis.2021.104097.

[14] Wang L, Sharma A. Analysis of sports video using image recognition of sportsmen. Int J Syst Assur Eng Manag. 2022;13:1–7. 10.1007/s13198-021-01539-4.

[15] Zhang S, Srividya K, Kakaravada I, Karras DA, Jagota V, Hasan I, et al. A Global Optimization Algorithm for Intelligent Electromechanical Control System with Improved Filling Function. Sci Program. 2022;2022:3361027. 10.1155/2022/3361027.

[16] Bhola J, Soni S, Cheema GK. Recent trends for security applications in wireless sensor networks – a technical review. 2019 6th International Conference on Computing for Sustainable Global Development (INDIACom); 2019 Mar 13–15; New Delhi, India. IEEE, 2020. p. 707–12.

[17] Chen J, Chen L, Shabaz M. Image Fusion Algorithm at Pixel Level Based on Edge Detection. In: Singh D, editor. Hindawi Limited; 2021. J Healthc Eng. 2021;2021:1–10. 10.1155/2021/5760660.

[18] Zhang X, Li S, Jing XY, Ma F, Zhu C. Unsupervised domain adaption for image-to-video person re-identification. Multimed Tools Appl. 2020;79(45):33793–810. 10.1007/s11042-019-08550-9.

[19] Huddar MG, Sannakki SS, Rajpurohit VS. Multi-level context extraction and attention-based contextual inter-modal fusion for multimodal sentiment analysis and emotion classification. Int J Multimed Inf Retr. 2020;9(2):103–12. 10.1007/s13735-019-00185-8.

[20] Xu X, Li L, Sharma A. Controlling messy errors in virtual reconstruction of random sports image capture points for complex systems. Int J Syst Assur Eng Manag. 2021;1–8. 10.1007/s13198-021-01094-y.

[21] Gala R, Budzillo A, Baftizadeh F, Miller J, Sümbül U. Consistent cross-modal identification of cortical neurons with coupled autoencoders. Nat Comput Sci. 2021;1(2):120–7. 10.1038/s43588-021-00030-1.

[22] Li D, Wei X, Hong X, Gong Y. Infrared-visible cross-modal person re-identification with an X modality. Proceedings of the AAAI Conference on Artificial Intelligence; 2020 Feb 7–12; New York (NY), USA. AAAI, 2020. p. 4610–7. 10.1609/aaai.v34i04.5891.

[23] Chuanxu C, Sharma A. Improved CNN license plate image recognition based on shark odor optimization algorithm. Int J Syst Assur Eng Manag. 2021;1–8. 10.1007/s13198-021-01309-2.

[24] Classen D, Siedt M, Nguyen KT, Ackermann J, Schaeffer A. Formation, classification and identification of non-extractable residues of 14C-labelled ionic compounds in soil. Chemosphere. 2019;232(OCT):164–70. 10.1016/j.chemosphere.2019.05.038.

Received: 2022-03-01
Revised: 2022-04-14
Accepted: 2022-04-26
Published Online: 2023-01-24

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.

Downloaded on 8.9.2025 from https://www.degruyterbrill.com/document/doi/10.1515/nleng-2022-0194/html