Abstract
Big data analytics has recently gained significant attention in the healthcare industry owing to the massive quantities of data generated in various forms, such as electronic health records, sensor readings, medical images, and pharmaceutical details. However, data gathered from such varied sources are intrinsically uncertain owing to noise, incompleteness, and inconsistency. Analyzing data at this scale requires advanced analytical techniques based on machine learning and computational intelligence to support effective decision making. To handle data uncertainty in the healthcare sector, this article presents a novel metaheuristic rough set-based feature selection with rule-based medical data classification (MRSFS-RMDC) technique on the MapReduce framework. The proposed MRSFS-RMDC technique designs a butterfly optimization algorithm for minimal rough set selection. In addition, Hadoop MapReduce is applied to process the massive quantity of data. Moreover, a rule-based classification approach, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), is used to build a set of conditional rules. RIPPER scales linearly with the number of training records and is suitable for building models under data uncertainty. The proposed MRSFS-RMDC technique is validated on a benchmark dataset, and the results are inspected under varying aspects. The experimental results highlight the superiority of the MRSFS-RMDC technique over recent state-of-the-art methods in terms of different performance measures. The proposed methodology achieved a maximum F-score of 96.49%.
1 Introduction
Big data analytics (BDA) has become a hot research area among research communities and finds applicability in several domains, encompassing healthcare, business, and the industrial sector [1]. BDA refers to the application of artificial intelligence (AI) techniques to massive, heterogeneous datasets that comprise structured, semi-structured, and unstructured information from many sources, with sizes ranging from terabytes to zettabytes. The term big data describes data sets that are too large or too complex for typical database systems to gather, maintain, and analyze with low delay. Big data analysis enables investigators, academics, and enterprise customers to make faster and more effective decisions using information that was previously inaccessible or unusable. The large quantity of data generated at high velocity in healthcare poses a challenging issue. It results in redundant data, which makes processing expensive and time consuming. At the same time, the massive quantity of data arising from disease diagnosis, together with meaningful data inspection, prediction, and optimization approaches, offers insights into healthcare applications [2]. Hence, healthcare organizations are seeking effective information technology (IT) artifacts that consolidate authoritative resources to deliver a high-quality patient experience [3]. Recently, MapReduce-based BDA techniques have been developed to handle massive quantities of data [4,5]. The advantages of BDA include enhanced customer satisfaction, quality growth and creativity, complicated source connections, risk administration, and faster and better decision making.
Behavior patterns are sometimes described as chains of activities, a term that emphasizes their origin as a complex linkage of shorter segments of behavior. They can be produced by shaping several component behaviors that are then performed in the proper sequence. In the business sector, the essential value of big data lies in identifying consumers' behavioral patterns in order to design novel business services and solutions [6]. In the medical field, big data feeds prediction and machine learning (ML) models that provide useful support for implementing treatment plans and customized medical care. Owing to the advancements in BDA and associated technologies, the healthcare sector has undergone pragmatic transformation at distinct levels from the perspective of existing stakeholders [7]. The influence of big data in the medical field has led to the identification of new data sources, such as social networking, telematics, and wearables, alongside the analysis of legacy sources comprising patient data, diagnoses, clinical trial data, and so on. When these data sources and their analysis are integrated, they offer the medical community a valuable source of data for achieving effective healthcare solutions [8].
Although BDA using AI is useful, data uncertainty remains a major challenge [9]. For example, each "V" characteristic of big data carries its own sources of uncertainty, such as unstructured, incomplete, or noisy data. In addition, uncertainty can enter at every stage of the analytic process, including the collection, organization, and investigation of big data [10,11]. Moreover, an ML model might not produce optimum outcomes when the training dataset is biased. Previous studies [12,13] presented six major challenging issues in BDA, including data uncertainty. They concentrated mainly on how uncertainty affects the efficiency of learning from big data, whereas a distinct concern is mitigating the uncertainty inherent in huge datasets. Such uncertainty generally arises in data mining as well as in ML techniques. Thus, resolving uncertainty in BDA should be at the forefront of any automated solution, since uncertainty has a major impact on overall accuracy.
This study focuses on the design of a metaheuristic rough set-based feature selection with rule-based medical data classification (MRSFS-RMDC) technique on the MapReduce framework. The technique combines minimal rough set selection with RIPPER-based classification and is deployed in a MapReduce environment to manage large amounts of information. The proposed MRSFS-RMDC technique designs a butterfly optimization algorithm (BOA) for minimal rough set selection. BOA is a population-based swarm intelligence method that mimics the foraging behavior of butterflies (BFs) and has been used in a variety of disciplines. Biologically, every BF has sensory receptors all over its body. These receptors are known as chemoreceptors, and they are responsible for sensing the fragrance of food and flowers. Hadoop MapReduce is applied to process the massive quantity of data. Furthermore, a rule-based classification approach, Repeated Incremental Pruning to Produce Error Reduction (RIPPER), is used to build a set of conditional rules. To inspect the performance of the MRSFS-RMDC technique, a wide range of simulations is carried out on the benchmark PIMA Indians diabetes dataset, and the results are examined under varying aspects.
In this article, Section 1 introduces BDA and related techniques. Section 2 reviews related work on BDA and neural networks (NNs). The proposed methodology is explained in Section 3 and experimentally validated in Section 4. Finally, Section 5 summarizes the findings and concludes the article.
2 Related works
In the studies of Wang and He [14] and Ali et al. [15], an intelligent healthcare model is presented to predict heart disease using ensemble deep learning (DL) and feature fusion models. First, the feature fusion model integrates the attributes derived from sensors and electronic health records to produce meaningful healthcare data. Second, the information gain (IG) approach removes irrelevant and redundant attributes and retains the essential ones, which reduces the computational complexity and improves system efficiency. Moreover, a conditional probability method determines a specific feature weight for every class, which further enhances the efficiency of the system. Probability density functions are used, for example, to compute likelihood functions and critical regions for hypothesis testing. Finally, an ensemble DL model is trained to predict heart disease.
Ramani et al. [16] presented an improved artificial neural network (ANN) classification model on MapReduce to predict diabetes. Initially, min–max normalization is used to preprocess the healthcare data, and MapReduce provides an effective programming model in which the prediction algorithm is expressed as map and reduce functions. It offers a simple programming interface that helps solve the prediction problem. Chrimes et al. [17] established a novel BDA platform built on Hadoop/MapReduce technology with HBase and generated hospital-oriented metadata at high volume. Generating the HBase data files took a week for the 1 billion (10 TB) case and a month for the 3 billion (30 TB) case. Furthermore, the platform was evaluated on patient data using Apache tools from the Hadoop ecosystem.
Selvi and Muthulakshmi [18] developed a MapReduce-based optimal data classification technique for diagnosing diabetes. It encompasses the Hadoop tool, data collection, and classification using a gradient boosting tree (GBT). To enhance the classification performance of the GBT, an improved k-means clustering method is incorporated. After the training dataset is obtained, k-means clustering is applied, which is a form of unsupervised learning (that is, learning from data without predefined categories or groups). The purpose of this method is to find groups in the data, with the parameter k representing the number of clusters. Sample points are clustered according to feature similarity. AlZubi [19] introduced a big data classification model based on MapReduce technology to identify diabetes. First, the data are gathered from a massive dataset, and the MapReduce concept is employed to handle the data in smaller chunks. Then, data normalization is carried out and the features are chosen by the artificial bee colony (ABC) algorithm. The ABC method is an optimization approach that imitates the foraging behavior of honey bees and has been applied effectively to a variety of real problems. The number of employed or onlooker bees in the swarm equals the number of solutions in the colony. Finally, the selected features are processed using a support vector machine (SVM) with a multi-layer NN.
SVMs are a family of learning algorithms used for classification, regression, and outlier detection. An SVM is memory efficient because its decision function uses only a subset of the training points (the support vectors). It works best when there is a clear margin of separation between classes. It is effective in high-dimensional spaces, and it remains effective when the number of dimensions exceeds the number of samples.
Syed et al. [20] proposed a smart healthcare model for ambient-assisted living (AAL) that monitors the physical activities of elderly people using sensors and ML models for rapid examination and recommendation. AAL is described as the application of information and communication technology (ICT) in a person's everyday living environment to help individuals stay active longer, remain socially connected, and live independently into old age. First, wearables collect the data, which are sent to the cloud and the data analytics layer. To manage the large quantity of data in parallel, the Hadoop MapReduce tool is employed. Finally, a multinomial Naïve Bayes classifier fitted into the MapReduce tool is used for the motion classification process.
3 The proposed model
In this study, a new MRSFS-RMDC technique has been developed for medical data classification under data uncertainty. The proposed technique encompasses preprocessing, BOA-based minimal rough set selection, and RIPPER-based classification. In addition, the technique is executed in the MapReduce environment to handle big data. Figure 1 shows the overall block diagram of the proposed MRSFS-RMDC model. The detailed working of these three modules is described in the succeeding sections.

Figure 1: Overall process of the MRSFS-RMDC model.
3.1 Preprocessing
At the initial stage, data preprocessing transforms the data into a compatible format. The preprocessing step converts non-traditional datasets into traditional ones, which improves the accuracy of the presented approach. Here, min–max normalization, one of the most prevalent normalization methods, is applied. For each attribute, the minimum value is converted to 0, the maximum value is converted to 1, and every other value is converted to a fraction between 0 and 1. In this approach, each feature is rescaled to the range $[0, 1]$ as

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the original attribute value, $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of that attribute, and $x'$ is the normalized value.
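The rescaling described in this subsection can be illustrated with a minimal Python sketch. The function name `min_max_normalize`, the handling of constant columns, and the sample glucose values are assumptions made for illustration only; they are not taken from the original implementation.

```python
from typing import List


def min_max_normalize(column: List[float]) -> List[float]:
    """Rescale one attribute so its minimum maps to 0 and its maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant attribute carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical glucose readings, used only to exercise the function.
glucose = [85.0, 148.0, 183.0, 89.0, 137.0]
scaled = min_max_normalize(glucose)
print(scaled)
```

Applying the function column by column keeps attributes with large numeric ranges from dominating attributes with small ranges in later stages.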
3.2 Design of BOA-based minimal rough set selection
Once the medical data are preprocessed, feature selection is carried out by BOA-based minimal rough set selection. The BOA is a metaheuristic technique that simulates the foraging and mating behavior of BFs. A feature that distinguishes BOA from other metaheuristics is that every BF emits its own scent. The fragrance is represented using the following equation:

$$f = c I^{a}$$

where $f$ is the perceived magnitude of the fragrance, $c$ is the sensory modality, $I$ is the stimulus intensity, and $a$ is the power exponent.
where
where
where
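As a small illustration of the fragrance model, the sketch below assumes the form commonly used in the BOA literature, in which the fragrance is the sensory modality multiplied by the stimulus intensity raised to a power exponent. The function name and the default parameter values (0.01 and 0.1) are assumptions chosen for illustration, not settings reported in this article.

```python
def fragrance(stimulus_intensity: float,
              sensory_modality: float = 0.01,
              power_exponent: float = 0.1) -> float:
    """Perceived fragrance of one butterfly: c * I ** a.

    stimulus_intensity (I) is assumed to be derived from the fitness of the
    butterfly's current position; c and a are assumed to lie in [0, 1].
    """
    if stimulus_intensity < 0:
        raise ValueError("stimulus intensity must be non-negative")
    return sensory_modality * stimulus_intensity ** power_exponent


# A fitter position (larger I) emits a stronger fragrance.
print(fragrance(0.5), fragrance(0.9))
```

Because the exponent is below 1, the fragrance grows more slowly than the stimulus intensity, which dampens the influence of very fit butterflies on the rest of the swarm.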
The concept of BOA can be used for the minimal rough set selection issue. Assume a huge feature space with entire feature subsets. Every feature subset can be considered as a point or location in the space. When a total of
Initially, the BF position is considered as the binary bit string of length
The Hamming distance is a metric for comparing two binary strings. When two binary strings of equal length are compared, the Hamming distance is the number of bit positions in which they differ. The Hamming distance between two strings a and b is denoted d(a,b). For every binary bit string
where
where
Every BF begins at an arbitrary position in each round [22]. At each step, every BF moves through the search space according to the searching, swarming, and following behaviors. A fitness value is computed for each of the three behaviors, and the behavior with the highest fitness value is chosen to update the succeeding position. It can be represented as follows:
where
Once a BF reaches the maximum fitness value, it perishes after yielding a rough set reduct, which indicates that the BF has constructed a locally optimal solution. The succeeding round begins when every BF has perished. The search terminates when the maximum number of iterations is reached or when an identical selected feature subset is obtained in three successive rounds [23].
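The binary position encoding and the Hamming distance used in this subsection can be sketched in Python as follows. The sketch assumes one bit per attribute, with 1 meaning that the attribute is selected; the attribute names follow the column headers of the Kaggle PIMA file, and the two positions are hypothetical.

```python
from typing import List, Sequence

# Column names as they appear in the Kaggle PIMA Indians diabetes file.
ATTRIBUTES = ["Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
              "Insulin", "BMI", "DiabetesPedigreeFunction", "Age"]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def selected_features(position: Sequence[int]) -> List[str]:
    """Decode a butterfly position into the attribute subset it represents."""
    return [name for bit, name in zip(position, ATTRIBUTES) if bit == 1]


# Two hypothetical butterfly positions (feature subsets).
bf_a = [1, 1, 0, 0, 1, 1, 0, 1]
bf_b = [1, 0, 0, 1, 1, 1, 0, 0]
print(hamming_distance(bf_a, bf_b))  # 3
print(selected_features(bf_a))
```

A small distance means that two butterflies represent nearly the same feature subset, so the distance can serve as a notion of neighborhood in the search space.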
3.3 Design of RIPPER in MapReduce environment
During the classification process, the RIPPER technique is executed on the MapReduce platform to classify the healthcare data. MapReduce provides two essential capabilities: it splits and distributes work across the nodes of a cluster (the map step, performed by mappers), and it organizes and combines the results from every node into a coherent answer to a query (the reduce step, performed by reducers). RIPPER is a widely used rule induction technique. It scales linearly with the number of training samples and is suitable for building models under data uncertainty. Also, it uses a validation set (VS) to prevent the model from overfitting. RIPPER orders the classes by frequency. When
To generate rules, RIPPER starts from an empty rule and then adds conjuncts to it one at a time. It uses FOIL's information gain (IG) to choose the conjunct to add. Assume that rule
Conjuncts are added until the rule begins covering negative instances. The rule is then pruned based on its performance on the VS using the following metric
After a rule is created, every record it covers is removed, and the technique continues by constructing a new rule. The minimum description length (MDL) principle is a powerful method of inductive inference that serves as a foundation for statistical modeling, pattern recognition, and machine learning. It asserts that, given a limited set of observed data, the best explanation is the one that permits the greatest compression of the data. A new rule is added provided that the rule set does not violate the MDL-based condition and the error of the rule on the VS is less than 50%.
RIPPER can be implemented in Java using the Hadoop Java library. The dataset is partitioned horizontally to fit the Hadoop MapReduce structure and to ensure parallel execution of the code. The mapper–reducer is used for three purposes, one each for rule generation, rule pruning, and accuracy computation [20]. Each mapper therefore runs its code on one portion of the dataset, and the reducer aggregates the mapper outputs to produce one overall output. Figure 2 illustrates the framework of MapReduce.
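The two rule-quality measures referred to above can be sketched in Python. The sketch assumes the definitions commonly given in the RIPPER literature: FOIL's information gain compares a rule before and after a conjunct is added, and the pruning metric is (p - n) / (p + n) on the validation set. The function names and the example counts are hypothetical.

```python
import math


def foil_gain(p0: int, n0: int, p1: int, n1: int) -> float:
    """FOIL's information gain for extending a rule with one conjunct.

    p0, n0: positive / negative records covered before adding the conjunct.
    p1, n1: positive / negative records covered after adding the conjunct.
    """
    if p0 == 0 or p1 == 0:
        return 0.0  # assumption: no gain when no positives are covered
    before = math.log2(p0 / (p0 + n0))
    after = math.log2(p1 / (p1 + n1))
    return p1 * (after - before)


def prune_metric(p: int, n: int) -> float:
    """Pruning score (p - n) / (p + n) on the validation set."""
    if p + n == 0:
        return 0.0  # assumption: a rule covering nothing scores zero
    return (p - n) / (p + n)


# Hypothetical counts: the conjunct keeps 90 of 100 positives and
# removes 390 of 400 negatives, so the gain is large and positive.
print(foil_gain(100, 400, 90, 10))
print(prune_metric(30, 10))  # 0.5
```

During growing, the candidate conjunct with the largest gain would be appended; during pruning, a trailing conjunct would be dropped whenever doing so raises the pruning score.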

Figure 2: Structure of MapReduce.
For rule generation, the mapper–reducer function computes the values of
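The division of work between mappers and the reducer can be illustrated with a single-process Python simulation. This is only a sketch of the map and reduce steps and does not use the Hadoop API; the record fields, the candidate rule, and the partition contents are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, float]


def mapper(partition: List[Record],
           rule: Callable[[Record], bool]) -> Iterator[Tuple[str, int]]:
    """Emit ("pos", 1) or ("neg", 1) for every record the rule covers."""
    for record in partition:
        if rule(record):
            yield ("pos" if record["Outcome"] == 1 else "neg"), 1


def reducer(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum the emitted values per key into one overall result."""
    totals: Dict[str, int] = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(partitions: List[List[Record]],
            rule: Callable[[Record], bool]) -> Dict[str, int]:
    """Run the mapper on each horizontal partition, then reduce once."""
    emitted: List[Tuple[str, int]] = []
    for partition in partitions:  # on a cluster these would run in parallel
        emitted.extend(mapper(partition, rule))
    return reducer(emitted)


# Two hypothetical horizontal partitions of the dataset.
partitions = [
    [{"Glucose": 150, "Outcome": 1}, {"Glucose": 90, "Outcome": 0}],
    [{"Glucose": 160, "Outcome": 0}, {"Glucose": 170, "Outcome": 1},
     {"Glucose": 100, "Outcome": 1}],
]
print(run_job(partitions, lambda r: r["Glucose"] > 140))
```

The positive and negative coverage counts returned by the reducer are the quantities that rule-quality measures such as FOIL's IG take as input.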
3.4 Rule growing phase
The rule is initialized as an empty rule, that is, one that covers every record. Conjuncts are then added to the rule one by one, and the conjunct with the highest value of FOIL's IG measure is chosen. The measure is computed using MapReduce functions, and the <key, value> pairs hold the values of
3.5 Rule pruning phase
The rule created in the growing phase is then pruned using
3.6 Model evaluation phase
Once the rule set is created, the rules are used to classify the test records. A MapReduce function is called to classify the records and compute the accuracy of the technique. The <key, value> pairs hold the counts of positive and negative records supporting the model. The rule set is returned, and the correctness of the individual rules and of the whole rule set on the test records is assessed.
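The evaluation step can be sketched in Python as an ordered rule list with a default class. The sketch assumes that the first matching rule decides the label and that records matched by no rule receive the default (most frequent) class; the rule, the records, and the field names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
Rule = Tuple[Callable[[Record], bool], int]  # (condition, predicted label)


def classify(record: Record, rules: List[Rule], default_label: int) -> int:
    """Return the label of the first rule that fires, else the default."""
    for condition, label in rules:
        if condition(record):
            return label
    return default_label


def accuracy(records: List[Record], rules: List[Rule],
             default_label: int) -> float:
    """Fraction of test records whose predicted label equals 'Outcome'."""
    correct = sum(classify(r, rules, default_label) == r["Outcome"]
                  for r in records)
    return correct / len(records)


# One hypothetical rule for the positive class; negative is the default.
rules: List[Rule] = [(lambda r: r["Glucose"] > 140 and r["BMI"] > 30, 1)]
test_records = [
    {"Glucose": 150, "BMI": 33, "Outcome": 1},
    {"Glucose": 150, "BMI": 25, "Outcome": 0},
    {"Glucose": 120, "BMI": 35, "Outcome": 1},
    {"Glucose": 100, "BMI": 22, "Outcome": 0},
]
print(accuracy(test_records, rules, default_label=0))  # 0.75
```

In a MapReduce setting, each mapper would classify its own partition and emit correct and incorrect counts, which the reducer would sum before the final ratio is taken.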
4 Experimental validation
The performance of the MRSFS-RMDC technique is validated using the PIMA Indians diabetes dataset from the Kaggle repository [24,25,26,27]. Kaggle lets users find and publish datasets, explore and build models in a web-based data science environment, collaborate with other data scientists and ML engineers, and enter competitions to solve ML tasks. The dataset comprises 768 samples with 8 attributes and 2 classes, namely positive and negative. The results are examined under varying data sizes. Table 1 provides the sensitivity and specificity analysis of the MRSFS-RMDC technique under varying data sizes in GB.
Table 1: Result analysis of MRSFS-RMDC model with different data sizes
Data size (GB) | DNN | SVM | DCD-ANN | MRSFS-RMDC |
---|---|---|---|---|
Sensitivity (%) | ||||
2 | 89.03 | 91.81 | 93.34 | 94.92 |
4 | 89.77 | 92.27 | 93.71 | 95.49 |
6 | 90.00 | 92.70 | 94.77 | 96.18 |
8 | 90.95 | 93.02 | 95.95 | 97.27 |
10 | 91.81 | 94.37 | 96.93 | 97.96 |
Specificity (%) | ||||
2 | 81.00 | 80.46 | 82.68 | 84.91 |
4 | 81.72 | 80.76 | 83.40 | 85.21 |
6 | 82.56 | 80.94 | 84.06 | 86.17 |
8 | 82.98 | 80.88 | 84.73 | 87.13 |
10 | 84.49 | 81.18 | 86.89 | 90.01 |
Figure 3 illustrates the sensitivity analysis of the MRSFS-RMDC technique against the existing techniques under distinct dataset sizes. The results show that the MRSFS-RMDC technique attains the highest sensitivity at every data size. For instance, with 2 GB of data, the MRSFS-RMDC technique attained a sensitivity of 94.92%, whereas the DNN, SVM, and DCD-ANN techniques obtained lower sensitivities of 89.03, 91.81, and 93.34%, respectively. Likewise, with 10 GB of data, the MRSFS-RMDC technique achieved a sensitivity of 97.96%, whereas the DNN, SVM, and DCD-ANN techniques reached lower sensitivities of 91.81, 94.37, and 96.93%, respectively.

Figure 3: Sensitivity analysis of MRSFS-RMDC model with distinct data size.
Figure 4 shows the specificity analysis of the MRSFS-RMDC technique against the existing techniques under distinct dataset sizes. The results show that the MRSFS-RMDC technique attains the highest specificity at every data size. For example, with 2 GB of data, the MRSFS-RMDC technique attained a specificity of 84.91%, whereas the DNN, SVM, and DCD-ANN techniques obtained lower specificities of 81.00, 80.46, and 82.68%, respectively. Also, with 10 GB of data, the MRSFS-RMDC technique reached a specificity of 90.01%, whereas the DNN, SVM, and DCD-ANN techniques reached lower specificities of 84.49, 81.18, and 86.89%, respectively.

Figure 4: Specificity analysis of MRSFS-RMDC model with distinct data size.
Table 2 and Figure 5 present the precision analysis of the MRSFS-RMDC technique against the existing techniques under varying dataset sizes. The results show that the MRSFS-RMDC technique attains the highest precision at every data size. For example, with 2 GB of data, the MRSFS-RMDC technique attained a precision of 85.09%, whereas the DNN, SVM, and DCD-ANN techniques obtained lower precisions of 82.04, 81.37, and 83.71%, respectively. Similarly, with 10 GB of data, the MRSFS-RMDC technique obtained a precision of 90.81%, whereas the DNN, SVM, and DCD-ANN techniques reached lower precisions of 86.62, 84.19, and 88.90%, respectively.
Table 2: Precision (%) analysis of MRSFS-RMDC model with varying data size
Data size (GB) | DNN | SVM | DCD-ANN | MRSFS-RMDC |
---|---|---|---|---|
2 | 82.04 | 81.37 | 83.71 | 85.09 |
4 | 83.38 | 81.80 | 83.95 | 85.66 |
6 | 84.47 | 82.09 | 85.47 | 86.85 |
8 | 84.71 | 83.38 | 86.76 | 88.24 |
10 | 86.62 | 84.19 | 88.90 | 90.81 |

Figure 5: Precision analysis of MRSFS-RMDC model with distinct data size.
Table 3 reports the accuracy and F-score analysis of the MRSFS-RMDC technique under varying data sizes in GB. Figure 6 shows the accuracy analysis of the MRSFS-RMDC technique against the existing techniques under varying dataset sizes. The results show that the MRSFS-RMDC technique attains the highest accuracy at every data size. For example, with 2 GB of data, the MRSFS-RMDC technique attained an accuracy of 99.12%, whereas the DNN, SVM, and DCD-ANN techniques achieved lower accuracies of 95.55, 92.22, and 97.92%, respectively. Likewise, with 10 GB of data, the MRSFS-RMDC technique achieved an accuracy of 99.89%, whereas the DNN, SVM, and DCD-ANN techniques reached lower accuracies of 97.00, 94.64, and 99.75%, respectively.
Table 3: Comparative analysis of MRSFS-RMDC model in terms of accuracy and F-score
Data size (GB) | DNN | SVM | DCD-ANN | MRSFS-RMDC |
---|---|---|---|---|
Accuracy (%) | ||||
2 | 95.55 | 92.22 | 97.92 | 99.12 |
4 | 95.84 | 92.32 | 97.00 | 98.45 |
6 | 96.18 | 93.96 | 97.58 | 98.83 |
8 | 96.47 | 94.11 | 98.01 | 99.36 |
10 | 97.00 | 94.64 | 99.75 | 99.89 |
F-score (%) | ||||
2 | 83.77 | 82.86 | 93.66 | 95.29 |
4 | 84.16 | 83.20 | 93.75 | 95.14 |
6 | 85.55 | 84.01 | 93.90 | 95.58 |
8 | 86.37 | 84.59 | 94.14 | 95.77 |
10 | 88.24 | 85.74 | 94.62 | 96.49 |

Figure 6: Accuracy analysis of MRSFS-RMDC model with distinct data size.
Figure 7 shows the F-score analysis of the MRSFS-RMDC technique against the existing techniques under different dataset sizes. The results show that the MRSFS-RMDC technique attains the highest F-score at every data size. In the statistical analysis of binary classification, the F-score, or F-measure, is a measure of a test's accuracy. An F-score has a maximum value of 1.0, indicating perfect precision and recall, and a minimum value of 0 if either the precision or the recall is 0. For example, with 2 GB of data, the MRSFS-RMDC technique reached an F-score of 95.29%, whereas the DNN, SVM, and DCD-ANN techniques obtained lower F-scores of 83.77, 82.86, and 93.66%, respectively. Likewise, with 10 GB of data, the MRSFS-RMDC technique achieved an F-score of 96.49%, whereas the DNN, SVM, and DCD-ANN techniques reached lower F-scores of 88.24, 85.74, and 94.62%, respectively.

Figure 7: F-score analysis of MRSFS-RMDC model with distinct data size.
The aforementioned tables and figures indicate that the MRSFS-RMDC technique is an effective tool for medical data classification.
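For reference, the five measures reported in this section can be computed from a binary confusion matrix as in the following Python sketch. It assumes the usual definitions, with the F-score taken as the harmonic mean of precision and sensitivity (recall). The function name and the confusion counts are hypothetical and are not results from this study.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Standard measures derived from binary confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)            # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Hypothetical counts for 200 test records.
for name, value in binary_metrics(tp=90, fp=10, tn=80, fn=20).items():
    print(f"{name}: {100 * value:.2f}%")
```

Because the F-score combines precision and sensitivity, it is high only when both are high, which makes it a useful single summary for imbalanced medical datasets.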
5 Conclusion
In this research, a new MRSFS-RMDC technique has been developed for medical data classification under data uncertainty. The proposed technique encompasses preprocessing, BOA-based minimal rough set selection, and RIPPER-based classification. In addition, the technique is executed in the MapReduce environment to handle big data. The combination of BOA and RIPPER helps to handle data uncertainty in medical data classification. To examine the performance of the MRSFS-RMDC technique, a wide range of simulations was carried out on the benchmark PIMA Indians diabetes dataset, and the results were inspected under varying aspects. The experimental results show that the MRSFS-RMDC technique outperforms other recent approaches in terms of different performance measures. In the future, the classification performance of the MRSFS-RMDC technique can be improved by including clustering and outlier detection approaches.
Conflict of interest: The author declares no conflict of interest.
References
[1] Hariri RH, Fredericks EM, Bowers KM. Uncertainty in big data analytics: survey, opportunities, and challenges. J Big Data. 2019;6(1):1–16. doi:10.1186/s40537-019-0206-3
[2] Rahini S. Large scale optimization to minimize network traffic using MapReduce in big data applications. International Conference on Computation of Power, Energy Information and Communication (ICCPEIC); April 2016. p. 193–9. doi:10.1109/ICCPEIC.2016.7557196
[3] Ma C, Zhang HH, Wang X. Machine learning for big data analytics in plants. Trends Plant Sci. 2014;19(12):798–808. doi:10.1016/j.tplants.2014.08.004
[4] Kumar S, Kumar-Solanki V, Choudhary SK, Selamat A, Gonzalez-Crespo R. Comparative study on ant colony optimization (ACO) and K-means clustering approaches for jobs scheduling and energy optimization model in internet of things (IoT). Int J Interact Multimed Artif Intell. 2020;6(1):107. doi:10.9781/ijimai.2020.01.003
[5] Zhou L, Pan S, Wang J, Vasilakos AV. Machine learning on big data: opportunities and challenges. Neurocomputing. 2017;237:350–61. doi:10.1016/j.neucom.2017.01.026
[6] Wang L, Alexander CA. Big data in medical applications and health care. Am Med J. 2015;6:1–8. doi:10.3844/amjsp.2015.1.8
[7] Paulraj D. An automated exploring and learning model for data prediction using balanced CA-Svm. J Ambient Intell Humanized Comput. 2020;Springer 1–12. ISSN 1868-5137 (online). Published online: April 2020. doi:10.1007/s12652-020-01937-9
[8] Tsai CW, Lai CF, Chao HC, Vasilakos AV. Big data analytics: a survey. J Big Data. 2015;2(1):21. doi:10.1186/s40537-015-0030-3
[9] Neelakandan S, Berlin MA, Tripathi S, Devi VB, Bhardwaj I, Arulkumar N. IoT-based traffic prediction and traffic signal control system for smart city. Soft Comput. 2021;25:12241–48. doi:10.1007/s00500-021-05896-x
[10] Palanisamy V, Thirunavukarasu R. Implications of big data analytics in developing healthcare frameworks–A review. J King Saud Univ-Computer Inf Sci. 2019;31(4):415–25. doi:10.1016/j.jksuci.2017.12.007
[11] Slagter K, Hsu CH, Chung YC, Zhang D. An improved partitioning mechanism for optimizing massive data analysis using MapReduce. J Supercomputing. 2013;66(1):539–55. doi:10.1007/s11227-013-0924-9
[12] Dineshkumar M. Decentralized access control of data in cloud services using key policy attribute based encryption. Int J Sci Res Dev. April 2015;3(2):2016–20. ISSN 2321-0613.
[13] Chen M, Li Y, Zhang Z, Hsu CH, Wang S. Real-time, large-scale duplicate image detection method based on multi-feature fusion. J Real-Time Image Process. 2016;13(3):557–70. doi:10.1007/s11554-016-0632-9
[14] Wang X, He Y. Learning from uncertainty for big data: future analytical challenges and strategies. IEEE Syst Man Cybern Mag. 2016;2(2):26–31. doi:10.1109/MSMC.2016.2557479
[15] Ali F, El-Sappagh S, Islam SR, Kwak D, Ali A, Imran M, et al. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Inf Fusion. 2020;63:208–22. doi:10.1016/j.inffus.2020.06.008
[16] Ramani R, Devi KV, Soundar KR. MapReduce-based big data framework using modified artificial neural network classifier for diabetic chronic disease prediction. Soft Comput. 2020;24(21):16335–45. doi:10.1007/s00500-020-04943-3
[17] Chrimes D, Zamani H, Moa B, Kuo A. Simulations of Hadoop/MapReduce-based platform to support its usability of big data analytics in healthcare.
[18] Selvi RT, Muthulakshmi I. Modelling the map reduce based optimal gradient boosted tree classification algorithm for diabetes mellitus diagnosis system. J Ambient Intell Humanized Comput. 2021;12(2):1717–30. doi:10.1007/s12652-020-02242-1
[19] AlZubi AA. Big data analytic diabetics using map reduce and classification techniques. J Supercomputing. 2020;76(6):4328–37. doi:10.1007/s11227-018-2362-1
[20] Syed L, Jabeen S, Manimala S, Alsaeedi A. Smart healthcare framework for ambient assisted living using IoMT and big data analytics techniques. Future Gener Computer Syst. 2019;101:136–51. doi:10.1016/j.future.2019.06.004
[21] Wang L, Wu Y, Xie J, Wu S, Wu Z. Energy-efficient Hadoop for big data analytics and computing: A systematic review and research insights. Future Gener Computer Syst. 2018;86:1351–67. doi:10.1016/j.future.2017.11.010
[22] Reshma G, Al-Atroshi C, Nassa VK, Geetha B, Sunitha G, Galety MG, et al. Deep learning-based skin lesion diagnosis model using dermoscopic images. Intell Autom Soft Comput. 2022;31(1):621–34. doi:10.32604/iasc.2022.019117
[23] Kamalraj R, Neelakandan S, Kumar MR, Rao VC, Anand R, Singh H. Interpretable filter based convolutional neural network (IF-CNN) for glucose prediction and classification using PD-SS algorithm. Measurement. 2021;183:109804. doi:10.1016/j.measurement.2021.109804
[24] Zhang M, Long D, Qin T, Yang J. A chaotic hybrid butterfly optimization algorithm with particle swarm optimization for high-dimensional optimization problems. Symmetry. 2020;12(11):1800. doi:10.3390/sym12111800
[25] Chen Y, Zhu Q, Xu H. Finding rough set reducts with fish swarm algorithm. Knowl Syst. 2015;81:22–9. doi:10.1016/j.knosys.2015.02.002
[26] Gugnani S, Khanolkar D, Bihany T, Khadilkar N. Rule based classification on a multi node scalable Hadoop cluster. In International Conference on Internet and Distributed Computing Systems. Cham: Springer; September 2014. p. 174–83. doi:10.1007/978-3-319-11692-1_15
[27] https://www.kaggle.com/uciml/pima-indians-diabetes-database
© 2022 Hanumanthu Bhukya and Sadanandam Manchala, published by De Gruyter
This work is licensed under the Creative Commons Attribution 4.0 International License.