Epileptic EEG patterns recognition through machine learning techniques and relevant time–frequency features

Sahbi Chaibi; Chahira Mahjoub; Wadhah Ayadi; Abdennaceur Kachouri

doi:10.1515/bmt-2023-0332

Published/Copyright: October 30, 2023

From the journal Biomedical Engineering / Biomedizinische Technik Volume 69 Issue 2

Abstract

Objectives

The present study explores the automatic detection of epileptic patterns, specifically epileptic spikes and high-frequency oscillations (HFOs), using a selection of machine learning (ML) techniques. The primary motivation is that the visual examination of long-term electroencephalography (EEG) recordings is time-consuming and potentially error-prone, and demands sustained concentration from highly experienced neurologists. To address this challenge, a number of state-of-the-art ML algorithms have been evaluated and compared in terms of performance, in order to pinpoint the most effective algorithm for accurately extracting epileptic EEG patterns.

Content

Based on intracranial as well as simulated EEG data, the findings reveal that the random forest (RF) method was the most consistently effective approach, significantly outperforming all the other examined methods in identifying epileptic patterns in EEG recordings. The RF classifier achieved an average balanced classification rate (BCR) of 92.38 % for spike recognition and 78.77 % for HFO detection.

Summary

Compared with other approaches, our results provide valuable insights into the effectiveness of the RF classifier as a powerful ML technique for detecting epileptic bursts in EEG signals.

Outlook

As future work, we intend to further validate our main findings using a larger EEG dataset. We also aim to explore the application of generative adversarial networks (GANs) to generate synthetic EEG signals, or to combine signal generation techniques with deep learning approaches. Along these lines, we expect to further improve the performance of automated detection methods and thereby advance the field of epileptic EEG pattern recognition.

Keywords: epilepsy; iEEG signal; HFO; spike; machine learning; random forest

Introduction

Since its inception in the 1950s, artificial intelligence (AI) has been defined as a collection of technologies that aim to replicate the fundamental cognitive processes of human intelligence, including perception, interpretation, and decision-making [1]. It is applicable in a wide range of domains, particularly healthcare, where it shows remarkable potential for improving the quality of medical services. AI is expected to play a central role in the future of medicine, from computer-aided surgery, intelligent prosthetics, and remote patient monitoring to early disease diagnosis and therapy monitoring [2]. More recently, advances in data availability, computing power, and statistical and signal processing theory have significantly propelled the field, particularly the application of machine learning (ML) to the diagnosis of neurological disorders such as Alzheimer’s disease, Parkinson’s disease, sleep disorders, and epilepsy [3], [4], [5], [6], [7]. Neurological disorders often exhibit distinctive patterns that are observable in electroencephalographic (EEG) and magnetoencephalographic (MEG) recordings [8]. In this respect, the present study focuses on identifying epileptic bursts, which comprise two primary abnormalities: epileptic spikes (or sharp waves) and epileptic high-frequency oscillations (HFOs). Epileptic spikes (or sharp waves) are brief, high-amplitude electrical discharges visible in EEG or MEG recordings. More specifically, they last from 20 to 200 ms, as defined by the International Federation of Societies for Electroencephalography and Clinical Neurophysiology [9], [10], [11]. HFOs are defined as spontaneous, rapidly oscillating patterns in the 80 to 500 Hz frequency range that persist for at least three cycles [12], [13], [14]. 
More recently, very-high-frequency oscillations (VHFOs, 500–1,000 Hz) and ultra-high-frequency oscillations (UHFOs, 1,000–2,000 Hz) have also been reported [15]. It is also worth noting that seizure segments, rather than spikes and HFOs, are often used to analyze epilepsy-related mechanisms [16, 17]; these segments primarily contain patterns resembling epileptic spikes and HFOs [18], which offer manifold potential clinical utility. Firstly, these patterns have recently emerged as promising biomarkers and strong candidates for the pre-surgical diagnosis of epilepsy [19]. Secondly, they have contributed markedly to the accurate localization of the seizure onset zone (SOZ) in patients with refractory epilepsy. The removal of tissue generating pathological spikes and HFOs has been shown to correlate with improved surgical outcomes and a higher rate of seizure freedom. In addition, epileptic patterns are considered promising biomarkers for predicting seizure occurrence, which could improve the quality of life of individuals with pharmaco-resistant epilepsy. Similarly, epileptic bursts have been used to assess the effectiveness of medication. Epileptic patterns have also contributed substantially to the understanding of the pathophysiological and cerebral mechanisms involved in generating epileptic seizures [20, 21]. In early studies of epileptic patterns, visual assessment was frequently used to identify the abnormalities recorded by the EEG electrodes [22]. However, the visual identification and marking of epileptic bursts in EEG signals is now considered a highly challenging process, particularly for HFOs, owing mainly to their low amplitude. Such a visual procedure has three major limitations. 
It shows poor inter-rater agreement, owing to the subjectivity of the reviewer’s perception; it is time-consuming; and it depends on the interpreter’s experience and skill [17, 23]. Despite these persistent challenges, visual inspection is still widely regarded as the “ground truth” or “gold standard” against which the performance of most automated detection algorithms is evaluated. To address the shortcomings of visual inspection, there is a strong demand for effective automated detection of these patterns, particularly to enable thorough studies of epileptic patterns. To this end, a wide range of seizure, HFO, and spike detection approaches have been developed and validated, with varying accuracy levels and predominantly tailored to the requirements of individual research centers. Table 1 summarizes the ML-based state-of-the-art approaches developed for epileptic pattern detection.

Table 1:

Overview of the ML based state-of-the-art epileptic pattern detection methods.

Authors, publication year, reference	ML classifier	Epileptic pattern
Elham Bagheri et al., 2018, [24]	Support vector machines (SVM)	Spikes
Nesrine Jrad et al., 2016, [25]	Support vector machines (SVM)	HFOs
Rekha Sahu et al., 2020, [26]	Support vector machines (SVM)	Seizures
Fatma Krikid et al., 2022, [27]	Support vector machines (SVM)	HFOs
Payal Khanwani et al., 2010, [28]	Multi-layer perceptron (MLP)	Spikes
Dümpelmann M et al., 2012, [29]	Multi-layer perceptron (MLP)	HFOs
Ummara Ayman et al., 2023, [30]	Multi-layer perceptron (MLP)	Seizures
Ilakiyaselvan N et al., 2020, [31]	Bayesian neural networks (BNN)	Seizures
Ummara Ayman et al., 2023, [30]	Bayesian neural networks (BNN)	Seizures
A. Quintero-Rincón et al., 2017, [32]	K-nearest neighbor (KNN)	Spikes
Ilakiyaselvan N et al., 2020, [31]	K-nearest neighbor (KNN)	Seizures
Ummara Ayman et al., 2023, [30]	K-nearest neighbor (KNN)	Seizures
Sahbi Chaibi et al., 2014, [23]	Decision tree (DT)	HFOs
Sadeem Nabeel et al., 2022, [33]	Decision tree (DT)	Seizures
Ummara Ayman et al., 2023, [30]	Decision tree (DT)	Seizures
Xiashuang Wang et al., 2019, [22]	Random forest (RF)	Seizures
Sadeem Nabeel et al., 2022, [33]	Random forest (RF)	Seizures
Rekha Sahu et al., 2020, [34]	AdaBoost (ABs)	Seizures
Sadeem Nabeel et al., 2022, [33]	Logistic regression (LR)	Seizures
A. Quintero-Rincón et al., 2017, [32]	Quadratic discriminant analysis (QDA)	Seizures
Ummara Ayman et al., 2023, [30]	Gradient boosting (GB)	Seizures
Rekha Sahu et al., 2020, [34]	Extra tree (ET)	Seizures

In the present study, a joint detection architecture for spikes and high-frequency oscillations (HFOs) is put forward, using a number of machine learning approaches already documented in previously published healthcare-related studies. Moreover, we aim to establish an experimental comparison between these approaches, using intracranial EEG data as a benchmark. In addition, we incorporate noisy intracranial EEG data with signal-to-noise ratios (SNR) ranging from −10 to 20 dB, for a thorough evaluation and validation of the performance of the detection methods. Our main objective is to identify the most robust machine learning technique capable of extracting epileptic anomalies from both intracranial and noisy EEG data. The paper is organized as follows. Following the introduction, Section 2 describes the clinical database and gold standard, and presents the machine learning (ML) techniques applied and compared. The section also covers the performance metrics used to evaluate the examined methods. Sections 3 and 4 provide a comprehensive summary of the main findings of the comparison and their practical implications. Finally, the last section presents the main conclusions, along with potential avenues for future research.

Materials and methods

Epileptic patterns: database and visualization

In the present study, we seek to evaluate and compare a selection of currently available ML approaches. For this purpose, we use the same clinical database applied in our previous studies [23, 35, 36]. More specifically, the intracranial EEG (iEEG) database was recorded at the Montreal Neurological Institute and Hospital (MNI) in Canada. The database contains EEG signals from a patient with intractable epilepsy. The iEEG signals were sampled at 2 kHz and comprise 24 bipolar iEEG channels recorded during inter-ictal periods from the deepest contacts targeting the bilateral mesial temporal lobe (MTL) structures. During recording, a low-pass anti-aliasing filter with a cutoff frequency of 500 Hz was applied. The experiments and simulations were implemented using the Anaconda-Python environment and the Google Colab cloud platform. In our study, both spikes and HFOs were identified through visual annotation by an experienced neurologist. Visual inspection of the intracranial EEG recording revealed a total of 416 distinguishable epileptic HFOs and 269 clearly identifiable epileptic spikes. These visually marked events serve as the benchmark, or gold standard, for evaluating the performance of the compared ML models. In addition to the expert-marked intracranial EEG data, we have also incorporated simulated scalp EEG recordings to evaluate the robustness of the tested ML approaches to Gaussian noise. Real scalp EEG recordings, however, often suffer from various artefacts and inconsistencies, which are likely to reduce their reliability as gold standards. 
To address this problem and establish a dependable benchmark that replicates the output of scalp EEG, we added Gaussian noise to our intracranial EEG data at signal-to-noise ratios (SNR) ranging from 20 to −10 dB. This methodology provides a consistent and reliable benchmark for validating the performance of the investigated ML models [36, 37]. To mitigate the potential bias associated with examining specific data, we applied the following criteria to select the channels used in the present research. Firstly, we ensured that the channels displayed distinct interictal HFOs and spikes upon initial review. Secondly, channels with both frequent and infrequent epileptic events were included, so as to capture a comprehensive range of activity. Moreover, channels with varying background levels were selected to incorporate diversity. Finally, the patient provided informed consent in compliance with the MNI research ethics board.
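The noise-injection step described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `add_gaussian_noise` and `measured_snr_db` are hypothetical, and a 10 Hz sine wave stands in for an iEEG channel. The noise power is derived from the definition SNR_dB = 10·log10(P_signal / P_noise).

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise scaled to the requested SNR in dB.

    Hypothetical helper: the source only states that Gaussian noise was added
    at SNRs from 20 to -10 dB, not how it was implemented.
    """
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10*log10(P_signal / P_noise)  =>  P_noise = P_signal / 10**(SNR_dB/10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

def measured_snr_db(clean, noisy):
    """Empirical SNR (dB) of `noisy` with respect to the known `clean` signal."""
    noise = noisy - clean
    return 10.0 * np.log10(np.mean(clean ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(0)
fs = 2000                               # sampling rate in Hz, as stated for the recordings
t = np.arange(0, 10, 1 / fs)
clean = np.sin(2 * np.pi * 10 * t)      # toy stand-in for one iEEG channel
noisy_versions = {snr: add_gaussian_noise(clean, snr, rng) for snr in (20, 10, 0, -10)}

for snr, noisy in noisy_versions.items():
    print(f"target {snr:>4} dB, measured {measured_snr_db(clean, noisy):6.2f} dB")
```

Because the clean signal is kept alongside each noisy copy, the expert annotations made on the clean iEEG remain valid labels at every noise level.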

Main parts of epileptic patterns’ recognition via machine learning techniques

Overall, the proposed methodology for the automatic detection of epileptic patterns by means of machine learning techniques involves the following stages: data pre-processing, data splitting, feature extraction, model training, hyperparameter optimization, and model evaluation using testing data. It is worth recalling that supervised ML involves a training phase followed by a testing phase. The methodology breaks down as follows. The first step (Step 1) consists of pre-processing the EEG data. The second step (Step 2) is the data splitting process, in which the available data are divided into training and testing subsets using k-fold cross-validation. This procedure ensures that the model is trained and evaluated on separate data. Step 3 is the feature extraction stage. Once the data are split, relevant features are computed for each labeled EEG event. These features capture significant characteristics of epileptic patterns and may include statistical measures, spectral analysis, time–frequency analysis, or any other temporal measures that help extract important information from the EEG signals. After feature extraction, the dataset is transformed into a format suitable for ML algorithms: a matrix, or set of vectors, in which each row represents an EEG event and each column a single feature. Step 4 is the training phase, in which the ML model is trained on the labeled EEG data, comprising both the extracted features and the corresponding class labels indicating whether an event is epileptic or not. The model learns to recognize patterns and make predictions based on the input features. 
The ML algorithm applied in this methodology aims to build the most effective model for two binary classification tasks: (HFO, background) and (spike, background). Step 5 is the hyperparameter optimization process. By carefully tuning these hyperparameters, researchers and practitioners strive to improve the accuracy and robustness of ML models, and their ability to generalize to unseen data, for the sake of effectively detecting epileptic patterns. Step 6, the model evaluation stage, follows training and optimization and evaluates the model’s performance on previously unseen data, often referred to as test data. This step serves to verify the model’s generalization capacity and assess its capability to detect epileptic patterns. The proposed ML-based methodology is illustrated in Figure 1, which depicts a flowchart of the steps involved in the automatic detection of epileptic patterns via machine learning techniques. As a final step of the ML process, it is necessary to carefully evaluate the model’s suitability and effectiveness, along with its applicability to specific datasets.

Figure 1:

Flowchart illustrating the advanced ML methodology.

Step 1: EEG data pre-processing step

The EEG data preprocessing stage is critical for the automatic detection of epileptic patterns. It involves preparing, organizing, cleaning, and enhancing (spectrum equalization) the EEG data, to facilitate the extraction of meaningful insights. In our study, several preprocessing tasks were performed on the raw EEG data, including denoising and filtering. To extract relevant features, four signal processing domains were used: the temporal domain, the Fourier-frequency domain, the Hilbert domain, and the time–frequency map, each offering distinct insights into the EEG data. In the temporal domain, denoising techniques are applied to remove unwanted artifacts and noise from the EEG signals, so that the subsequent analysis focuses on the relevant patterns. In the Fourier-frequency domain, the analysis focuses on the frequency components of the EEG signals, to help identify the specific frequency ranges associated with epileptic bursts. For the detection of high-frequency oscillations (HFOs), an 80–500 Hz bandpass filter was used, as suggested by a number of published studies [14, 35, 37]; for spike recognition, the bandpass filter was set to 30–70 Hz, as recommended in [35]. In the Hilbert domain, the Hilbert transform was applied to the filtered EEG data to extract instantaneous frequency, phase, and amplitude information, which can provide valuable insights into the mechanisms underlying epileptic patterns. In the time–frequency map, the analysis focuses on the temporal changes in the frequency content of the EEG signals, to help detect time-varying frequency patterns associated with epileptic activity. 
By applying these preprocessing operations in multiple domains, we aim to enhance the quality of the EEG data and to collect information that is critically useful for the automatic detection and analysis of epileptic patterns.
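The band-pass filtering and Hilbert steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the source specifies only the pass bands (80–500 Hz for HFOs, 30–70 Hz for spikes) and the 2 kHz sampling rate, so the zero-phase fourth-order Butterworth design and the helper names `bandpass` and `hilbert_descriptors` are assumptions, and the input is a synthetic signal.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

FS = 2000  # Hz, sampling rate stated for the iEEG recordings

def bandpass(x, low_hz, high_hz, fs=FS, order=4):
    """Zero-phase Butterworth band-pass filter (filter type and order are assumptions)."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def hilbert_descriptors(x, fs=FS):
    """Instantaneous amplitude, unwrapped phase (rad) and frequency (Hz) of x."""
    analytic = hilbert(x)                      # x_a(n) = x(n) + j*H[x(n)]
    amplitude = np.abs(analytic)
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)
    return amplitude, phase, inst_freq

# Synthetic example: slow 5 Hz background plus a weak 200 Hz oscillation.
t = np.arange(0, 1, 1 / FS)
raw = np.sin(2 * np.pi * 5 * t) + 0.2 * np.sin(2 * np.pi * 200 * t)

hfo_band = bandpass(raw, 80, 500)     # HFO band
spike_band = bandpass(raw, 30, 70)    # spike band
amplitude, phase, inst_freq = hilbert_descriptors(hfo_band)

print("median instantaneous frequency (Hz):", np.median(inst_freq))
print("median instantaneous amplitude:", np.median(amplitude))
```

In this toy case the 5 Hz background is removed by the HFO-band filter, so the Hilbert descriptors reflect only the 200 Hz component.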

Step 2: Features’ selection and extraction

Observing the behavior of epileptic patterns in EEG data requires an accurate feature extraction procedure that captures their relevant characteristics. In our methodology, a comprehensive set of features was compiled from the domains detailed earlier, so as to represent the EEG signals as a feature matrix. These features are widely used in EEG classification tasks and clinical contexts, and have shown promise in recognizing distinctive EEG patterns. Table 2 lists the category, name, and mathematical equation of each feature computed for every labeled pattern in our EEG dataset, including spikes, HFOs, and background activity. More specifically, a total of 13 features were extracted from the four domains. In the temporal domain, the mean, standard deviation, power, zero-crossing rate, and Shannon entropy of the filtered EEG signal were considered. These temporal features provide information on the time-domain behavior of the EEG patterns. As spectral features, we used the mean of the Fast Fourier Transform (FFT) spectrum, the number of peaks in the FFT spectrum, and the mean amplitude of these peaks. These features capture the frequency-domain properties of the EEG signals. From the Hilbert domain, we extracted the means of the Hilbert instantaneous amplitude, phase, and frequency, which describe the variations in amplitude, phase, and frequency of the EEG patterns over time. In addition, features were derived from the time–frequency map, including its normalized energy, the normalized number of non-zero pixels, and the number of maxima, or “peaks”, in the time–frequency representation. 
Note that peak detection in the time–frequency representation was performed using the complex Morlet wavelet [14, 35], which allows the time–frequency characteristics of epileptic EEG activity to be analyzed effectively. By incorporating these features, our aim has been to capture the various aspects of epileptic patterns present in the EEG signals, across their temporal, spectral, Hilbert, and time–frequency properties. To further refine the feature set, we applied the recursive feature elimination (RFE) technique to the final feature matrix, to retain the most informative features and eliminate the irrelevant ones [38, 39]. This process ensures that the selected features are the most relevant for classifying epileptic patterns. Hence, by implementing the RFE approach, we aim to optimize the feature set for accurate and efficient recognition of the epileptic patterns in the EEG data.
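As an illustration of the RFE step, the sketch below applies scikit-learn's `RFE` to a synthetic 13-column feature matrix. Several choices here are assumptions not stated in the source: the base estimator (a random forest), the number of features retained (5), and the toy data, in which only the first two columns carry class information.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the feature matrix: one row per EEG event, 13 feature columns.
rng = np.random.default_rng(0)
n_events, n_features = 200, 13
X = rng.normal(size=(n_events, n_features))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # only columns 0 and 1 are informative

# RFE repeatedly fits the estimator and drops the least important feature (step=1)
# until the requested number of features remains.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),  # assumed estimator
    n_features_to_select=5,                                             # assumed target size
    step=1,
)
selector.fit(X, y)

selected = np.flatnonzero(selector.support_)   # indices of the retained columns
print("retained feature indices:", selected)
print("elimination ranking (1 = kept):", selector.ranking_)
```

The `ranking_` attribute gives the order in which features were discarded, which can help judge how close an eliminated feature came to being retained.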

Table 2:

A depiction of the implemented EEG patterns’ highlighting features.

	Feature	Name	Mathematical equation
Temporal features	F1	Mean of filtered EEG signal	$\frac{1}{N}\sum_{i=1}^{N} s[i]$, where $s[i]$ denotes an individual value in the filtered EEG signal and $N$ is its length in samples.
	F2	Standard deviation of filtered EEG signal	$\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(s[i]-\bar{s}\right)^{2}}$, where $s[i]$ is an individual value in the filtered EEG signal, $N$ is its length in samples, and $\bar{s}$ denotes the signal mean.
	F3	Power of filtered EEG signal	$\frac{1}{N}\sum_{i=1}^{N} s[i]^{2}$, where $s[i]$ is an individual value in the filtered EEG signal and $N$ is its length in samples.
	F4	Zero-crossing rate (ZCR)	$\frac{1}{N-1}\sum_{n=1}^{N-1}\left|\frac{\operatorname{sign}[s(n)]-\operatorname{sign}[s(n+1)]}{2}\right|$, where $s$ designates the filtered EEG signal and $N$ its length in samples; $\operatorname{sign}(a)=+1$ if $a\geq 0$, and $-1$ if $a<0$.
	F5	Shannon entropy	$-\sum_{i=1}^{N} P(\mathrm{sym}[i])\,\log_{2} P(\mathrm{sym}[i])$, where $P(\mathrm{sym}[i])$ is the probability of the outcome or symbol $\mathrm{sym}[i]$ and $N$ is the number of distinct symbols in the filtered EEG signal.
Frequency features	F6	Mean of FFT spectrum	$M_{\mathrm{FFT}}=\frac{1}{N}\left(\sum_{n=0}^{N-1} s(n)\,e^{-j2\pi nk/N}\right)$, where $s(n)$ is an individual temporal value in the filtered EEG signal and $N$ is the number of FFT spectrum components.
	F7	Peaks number in FFT spectrum	$\mathrm{Peaks}=\operatorname{maxi}\left(\sum_{n=0}^{N-1} s(n)\,e^{-j2\pi nk/N}\right)$
	F8	Mean amplitude of peaks number	$\frac{1}{N}\sum_{i=1}^{N}\mathrm{Peaks}_{i}$, where $N$ is the number of detected peaks.
Hilbert features	F9	Mean of Hilbert instantaneous amplitude (MHA)	$x_{a}(n)=x(n)+j\,H[x(n)]$, where $H[\cdot]$ represents the Hilbert operator and $x(n)$ is the filtered EEG signal; $\mathrm{MHA}=\frac{1}{N}\sum_{n=1}^{N}\left|x_{a}(n)\right|$
	F10	Mean of Hilbert instantaneous phase (MHP)	$\mathrm{MHP}=\frac{1}{N}\sum_{n=1}^{N}\arg[x_{a}(n)]$
	F11	Mean of Hilbert instantaneous frequency (MHF)	$\mathrm{MHF}=\frac{1}{2\pi(N-1)}\sum_{n=1}^{N-1}\left(\arg[x_{a}(n+1)]-\arg[x_{a}(n)]\right)$
Time–frequency based features	F12	Normalized energy of time–frequency map	$\frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M}\left|S(i,j)\right|$, where $M$ and $N$ are the numbers of rows and columns of the map and $S$ is the T–F map.
	F13	Normalized number of local maxima (peaks)	$\frac{1}{NM}\sum_{i=1}^{V}\left|\mathrm{Peaks}_{i}\right|$, where $M$ and $N$ are the numbers of rows and columns of the TF map, while $V$ is the total number of 2-D peaks detected in the TF map.
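To make the temporal features F1 to F5 of Table 2 concrete, the following sketch computes them with NumPy for a single event. It is an illustration only: the function name `temporal_features` is hypothetical, and since the source does not state how signal amplitudes are mapped to symbols for the Shannon entropy, a 16-bin amplitude histogram is assumed here.

```python
import numpy as np

def temporal_features(s, n_bins=16):
    """Return [F1 mean, F2 std, F3 power, F4 zero-crossing rate, F5 Shannon entropy].

    `n_bins` (amplitude histogram used to define the entropy symbols) is an
    assumption; the source does not specify the symbolization.
    """
    s = np.asarray(s, dtype=float)
    mean = s.mean()                                       # F1
    std = np.sqrt(np.mean((s - mean) ** 2))               # F2 (population form, divides by N)
    power = np.mean(s ** 2)                               # F3
    sign = np.where(s >= 0, 1.0, -1.0)                    # sign(a) = +1 if a >= 0, else -1
    zcr = np.mean(np.abs(sign[:-1] - sign[1:]) / 2.0)     # F4, averaged over N-1 pairs
    counts, _ = np.histogram(s, bins=n_bins)
    p = counts[counts > 0] / counts.sum()                 # probabilities of occupied bins
    entropy = -np.sum(p * np.log2(p))                     # F5, in bits
    return np.array([mean, std, power, zcr, entropy])

# Alternating +1/-1 signal: zero mean, unit std and power, a sign change at every
# step, and two equally likely symbols (1 bit of entropy).
example = temporal_features(np.array([1.0, -1.0, 1.0, -1.0]))
print(example)
```

Stacking one such vector per labeled event, together with the spectral, Hilbert, and time–frequency features, yields the feature matrix with one row per event and one column per feature.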

Step 3: Training-test split with cross-validation

Cross-validation is a statistical technique widely applied to assess the performance of machine learning models. It is particularly useful when the collected dataset is small or limited, which poses challenges for accurate modeling. In our study, k-fold cross-validation was used, with k set to 10. More specifically, we split our data into ten equal-sized subsets. In each iteration, the model was trained on nine subsets and tested on the remaining one; that is, 90 % of the data were used in each iteration to train the ML model, while the remaining 10 % were used to evaluate the prediction accuracy for epileptic spikes and HFOs. Once training is complete, the optimized ML model is applied to automatically recognize HFOs and spikes. This recognition process uses a sliding window of 150 ms that scans the entire EEG dataset. This window size was chosen to match the average duration of the HFO and spike events in the database. To determine whether a segment belongs to the (HFO, spike) class (labeled 1) or to the background class (labeled 0), the sliding window advances one sample at a time. Windows classified as spike or HFO are assigned the value one, while the remaining segments are assigned the value zero. After the entire EEG signal has been scanned, the final step consists of combining and storing all the segments labeled one, which indicate a probable epileptic EEG burst.
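The 10-fold protocol described above can be sketched with scikit-learn as follows. This is an illustrative setup rather than the authors' code: the data are synthetic, the folds are stratified and shuffled (an assumption; the source only says k-fold with k = 10), and scikit-learn's `balanced_accuracy` scorer is used as a stand-in for the balanced classification rate.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the labeled feature matrix (1 = epileptic event, 0 = background).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 13))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# Ten folds: each iteration trains on 90 % of the events and tests on the other 10 %.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)  # stratification is an assumption
model = RandomForestClassifier(n_estimators=50, random_state=0)

scores = cross_val_score(model, X, y, cv=cv, scoring="balanced_accuracy")
print("per-fold balanced accuracy:", np.round(scores, 3))
print("mean over 10 folds:", scores.mean())
```

Reporting the mean over the ten folds, as in the last line, gives a single figure per classifier that can be compared across methods.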

Machine-learning algorithms

ML is a data-driven process capable of modeling more complex EEG data patterns than conventional methods. ML algorithms have been applied to extract significant information from EEG data, with the aim of distinguishing the various brain states associated with epilepsy and accurately identifying the epileptogenic zone. These algorithms use computed features to train ML classifiers, enabling them to identify the EEG patterns present in the data and make predictions on newly observed measurements. In recent years, several machine learning methods have been widely used for this purpose. These include support vector machines (SVM), multi-layer perceptron (MLP), Gaussian Naive Bayes (GNB), K nearest neighbors (KNN), decision tree (DT), AdaBoost (ABt), random forest (RF), logistic regression (LR), quadratic discriminant analysis (QDA), gradient boosting (GB), and extra trees (ET).

Support vector machines (SVM): the SVM is a popular supervised learning algorithm suited to both classification (SVC) and regression (SVR) problems. It seeks an optimal hyperplane in a multidimensional feature space that separates the classes effectively. Two types of SVC kernel are commonly used: the linear kernel and the radial basis function (RBF) kernel. In our study, a preliminary test showed that the RBF kernel performs better than the linear kernel. The RBF kernel involves two crucial parameters: the regularization coefficient C and gamma. Gamma controls the spread of the kernel and thereby the shape of the decision region, while C sets the penalty for misclassifying a data point [40, 41]. In our case, gamma is fixed at 0.01 and C at 1,000. These two values were chosen on the basis of our previous research on EEG pattern detection.
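A minimal configuration matching the stated hyperparameters (RBF kernel, C = 1,000, gamma = 0.01) could look as follows with scikit-learn. The data are synthetic and well separated, so the resulting accuracy says nothing about real iEEG performance; the split sizes are likewise assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two synthetic, well-separated classes in a 13-dimensional feature space.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 13)),    # class 0: background
               rng.normal(2.0, 1.0, size=(100, 13))])   # class 1: epileptic pattern
y = np.repeat([0, 1], 100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# RBF-kernel SVC with the hyperparameters reported in the text.
svm = SVC(kernel="rbf", C=1000, gamma=0.01)
svm.fit(X_train, y_train)

test_accuracy = svm.score(X_test, y_test)
print("test accuracy on toy data:", test_accuracy)
```

A large C such as 1,000 penalizes training errors heavily, while a small gamma such as 0.01 keeps the kernel broad, so the decision boundary stays comparatively smooth.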

Multi-layer perceptron (MLP): the MLP has a wide range of parameters, the most significant of which are the sizes of the hidden layers, the activation function, the solver used for weight optimization, and the learning rate. The main challenge lies in determining the number of hidden layers, through which information flows from the input layer to the output layer, and the number of neurons in each hidden layer. While it might seem that more hidden layers would provide more features and yield better classification results, there is a practical limit. Increasing the number of hidden layers can lead to serious overfitting, and hence to errors such as false positives. Conversely, using too few hidden layers results in underfitting, in which the MLP is unable to model complex data, leading to poor performance. Unfortunately, there is no established theory or straightforward method for determining the optimal architecture in terms of the number of hidden layers and of neurons per layer. It is therefore essential to conduct a validation process to select the most appropriate MLP architecture [42, 43]. Accordingly, our preliminary test revealed that the optimal architecture comprises a single input layer, a single output layer, and two hidden layers of 10 and 5 neurons, respectively. The activation function is the ReLU, which is frequently used in the relevant literature. For weight optimization, we opted for the Adam solver with a learning rate of 0.0001.
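The reported architecture (two hidden layers of 10 and 5 neurons, ReLU activation, Adam solver, learning rate 0.0001) maps onto scikit-learn's `MLPClassifier` roughly as sketched below. The synthetic data and the `max_iter` value are assumptions; with such a small learning rate, the optimizer may stop before full convergence on real data, so the iteration budget would need tuning.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a 13-feature matrix with binary labels.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 13)),
               rng.normal(2.0, 1.0, size=(100, 13))])
y = np.repeat([0, 1], 100)
order = rng.permutation(len(y))
X, y = X[order], y[order]

mlp = MLPClassifier(
    hidden_layer_sizes=(10, 5),   # two hidden layers: 10 then 5 neurons
    activation="relu",
    solver="adam",
    learning_rate_init=1e-4,      # learning rate reported in the text
    max_iter=500,                 # assumed iteration budget
    random_state=0,
)
mlp.fit(X[:150], y[:150])
predictions = mlp.predict(X[150:])

# Weight matrices: input->hidden1, hidden1->hidden2, hidden2->output.
layer_shapes = [w.shape for w in mlp.coefs_]
print("weight matrix shapes:", layer_shapes)
```

The printed shapes confirm the layout: 13 inputs, hidden layers of 10 and 5 units, and a single logistic output unit for the binary decision.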

Gaussian Naïve Bayes (GNB): here, the applied features are denoted by the set F = {F₁, F₂, …, F_N}, while the corresponding classes are denoted by C = {C₁, C₂, …, C_M}. Bayes’ theorem [44] is then expressed through the following equation:

(1) $\text{Posterior probability} = P(C_j \mid F_i) = \dfrac{P(F_i \mid C_j)\,P(C_j)}{P(F_i)} = \dfrac{\text{likelihood} \times \text{prior}}{\text{evidence}}$

As the classification is binary (0, 1), the prior probabilities of the two classes are equal and are computed as follows:

(2) $P(C_1) = P(C_2) = \dfrac{\text{Number of epileptic patterns (spike or HFO)}}{\text{Total number of events (spike + background or HFO + background)}} = 0.5$

As the evidence P(F_i) is used exclusively for normalizing the probability, it is considered constant and dropped. Subsequently, for each pair (F_i, C_j), the sample mean µ_ij and standard deviation σ_ij are computed. In the testing phase, a new test point with features F_i is assigned to the class C_j that maximizes the product of the likelihoods:

(3) $\prod_{i} P(F_i \mid C_j) = \prod_{i} \dfrac{1}{\sqrt{2\pi\sigma_{ij}^{2}}}\,\exp\!\left(-\dfrac{(F_i-\mu_{ij})^{2}}{2\sigma_{ij}^{2}}\right)$
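The classification rule of Eqs. (1) to (3) can be sketched directly in NumPy. This is a minimal illustration with hypothetical function names (`fit_gnb`, `predict_gnb`) and synthetic data. It works with the logarithm of the likelihood product, which has the same maximizer and avoids numerical underflow, and it omits the priors because they are equal for both classes.

```python
import numpy as np

def fit_gnb(X, y):
    """Estimate the per-class, per-feature mean mu_ij and standard deviation sigma_ij."""
    classes = np.unique(y)
    mu = np.array([X[y == c].mean(axis=0) for c in classes])
    sigma = np.array([X[y == c].std(axis=0) for c in classes])
    return classes, mu, sigma

def predict_gnb(X, classes, mu, sigma):
    """Assign each row to the class maximizing the (log) product of Gaussian likelihoods."""
    # z has shape (n_samples, n_classes, n_features)
    z = (X[:, None, :] - mu[None, :, :]) / sigma[None, :, :]
    # log of Eq. (3): sum over features of the log Gaussian density
    log_likelihood = -0.5 * np.sum(z ** 2 + np.log(2 * np.pi * sigma[None, :, :] ** 2), axis=2)
    return classes[np.argmax(log_likelihood, axis=1)]

# Synthetic balanced data: P(C_1) = P(C_2) = 0.5, as in Eq. (2).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(100, 3)),
               rng.normal(3.0, 1.0, size=(100, 3))])
y = np.repeat([0, 1], 100)

classes, mu, sigma = fit_gnb(X, y)
accuracy = np.mean(predict_gnb(X, classes, mu, sigma) == y)
print("training accuracy on toy data:", accuracy)
```

With unequal class frequencies, the log prior of each class would simply be added to `log_likelihood` before taking the maximum.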

K nearest neighbors (KNN): one of the simplest ML techniques, KNN assumes that similar samples lie close to each other. The value of K and the distance metric are the two major considerations when using it. The technique proceeds in the following steps. First, select the number of neighbors K. No predefined statistical method exists for determining the optimal K; however, plotting training and validation error curves for various K values helps identify it. In our study, the optimal K value was set to three. Then, compute the Euclidean distance between the new test point and the training points, as given by the following equation:

(4) $\text{Euclidean dist}\ (d) = \sqrt{(x_2 - x_1)^{2} + (y_2 - y_1)^{2}}$

Next, take the K nearest neighbors according to the computed Euclidean distances and, among these K neighbors, count the number of data points in each category. Finally, assign the new test point to the category with the largest number of neighbors [44], [45], [46].
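The steps above can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function name `knn_predict` and the toy data are hypothetical, and Eq. (4) is generalised to any number of features through the standard-library `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point, paired with that point's label
    distances = sorted(
        (math.dist(point, x), label) for point, label in zip(train_X, train_y)
    )
    # Count the categories of the k closest points and keep the most frequent one
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 1 = epileptic pattern, 0 = background
X_train = [[1.0, 2.0], [1.5, 1.8], [0.8, 2.2], [5.0, 8.0], [5.5, 7.5], [4.8, 8.3]]
y_train = [0, 0, 0, 1, 1, 1]
print(knn_predict(X_train, y_train, [1.2, 2.1], k=3))
```

An odd K, such as the value of three used in the study, avoids ties in a two-class problem.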

Decision tree (DT): A decision tree is a well-established machine learning technique used in a wide range of applications, especially classification problems. The tree starts with a root node, followed by a series of branches whose intersections are called nodes, and ends with leaves, each corresponding to a class to predict. The tree depth is the maximum number of nodes traversed before reaching a leaf node. In binary classification, the hierarchical structure of a decision tree rests on computing the information gain between the sample set S and a feature A:

(5) $\text{Gain}(S, A) = \text{Entropy}(\text{dataset}) - \text{Entropy}(\text{feature}) = \text{Entropy}(S) - \sum_{i=1}^{\text{symbols}} P(S_i)\, \text{Entropy}(S_i)$

The Shannon entropy is computed via the following equation:

(6) $\text{Entropy}(S) = -p_{+} \log_2(p_{+}) - p_{-} \log_2(p_{-})$

where S designates the set of samples, p₊ denotes the proportion of positive samples, p₋ the proportion of negative samples, and P(S_i) the probability of the i-th symbol (value) of feature A. The Gini index can also be used instead of the entropy. It is obtained by subtracting the sum of the squared class probabilities from one:

(7) $\text{Gini}(n) = 1 - \sum_{i=1}^{n} p_i^{2}$

where p_i denotes the probability of an element being classified under class i. Because the entropy requires evaluating logarithms, it is the more costly criterion to compute; the Gini index is therefore faster to evaluate [23, 47].
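Eqs. (5) to (7) can be checked numerically with a short sketch. The function names and the toy labels below are hypothetical and only illustrate the formulas; they are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy of a list of class labels, Eq. (6)."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index: one minus the sum of squared class proportions, Eq. (7)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy of the whole set minus the weighted entropy of each feature value, Eq. (5)."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy example: a feature that separates the two classes perfectly
labels = [1, 1, 0, 0]
feature = ["high", "high", "low", "low"]
print(entropy(labels), gini(labels), information_gain(labels, feature))
```

For a balanced binary set the entropy is 1 bit and the Gini index is 0.5; a feature that splits the classes perfectly recovers the full entropy as gain.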

The AdaBoost (ABt) classifier: it builds a number N of decision trees sequentially during training. The algorithm first constructs a decision tree; the instances that this tree misclassifies are identified as errors and passed as inputs to the next tree. This process is repeated until the errors are minimized and the data are accurately predicted. Hence, N models (decision trees) are established on the basis of the identified errors. This error-correcting principle is common to all types of boosting algorithms [48]. In our study, the optimal number of estimators (decision trees) was set to twenty.

Random forest (RF): RF is a supervised ML algorithm widely used for classification and regression problems. It constructs multiple decision trees on different samples of the dataset. Rather than relying on a single decision tree, RF combines the predictions of all trees into a final prediction based on majority voting. For a new test point, the algorithm obtains the prediction of each decision tree and assigns the point to the category receiving the majority of votes. Increasing the number of trees generally improves accuracy and mitigates overfitting [44, 49, 50]. In our study, the optimal number of estimators (decision trees) was set to twenty.

Logistic regression (LR): whereas a linear regression model is suited to regression tasks but inappropriate for classification, LR is a powerful statistical algorithm specifically designed for predicting binary classes [51]. It models the probability associated with two possible outcomes and is commonly used for binary classification problems. It can also be extended to multi-class classification tasks using the “one vs. all” framework. Rather than predicting classes directly, LR computes the probability that an event occurs, thus providing a valuable framework for classification tasks.
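The way LR turns a weighted sum of features into a probability, and then into a class, can be sketched as follows. The logistic (sigmoid) function is the standard form of the model; the weights, bias, and 0.5 threshold below are hypothetical values chosen for illustration, not parameters fitted in the study.

```python
import math


def predict_proba(x, weights, bias):
    """Probability of the positive class: sigmoid of the weighted feature sum."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, weights, bias, threshold=0.5):
    """Assign class 1 when the estimated probability reaches the threshold."""
    return int(predict_proba(x, weights, bias) >= threshold)


# Hypothetical parameters for a two-feature model
weights, bias = [1.0, -2.0], 0.0
print(predict_proba([0.0, 0.0], weights, bias), predict([3.0, 0.0], weights, bias))
```

Moving the threshold away from 0.5 trades sensitivity against specificity without retraining the model.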

Discriminant analysis (DA): DA generally comes in three variants [52]: linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and regularized discriminant analysis (RDA). In our study, preliminary tests revealed that QDA is the best suited to detecting epileptic patterns. QDA can be derived from simple probabilistic models of the class-conditional distribution of the data, P(X|y=k), for each class k. Predictions are then obtained for each feature vector x by means of the Bayes rule:

(8) $P(y = k \mid x) = \dfrac{P(x \mid y = k)\, P(y = k)}{P(x)}$

Gradient boosting (GB): Gradient boosting classifiers are a family of machine learning algorithms that combine multiple weak learners into a strong predictive model. Each model in the sequence learns to correct the errors made by the previous one, and the predictions of all models are then combined using simple averaging. GB models owe their popularity mainly to their effectiveness in classifying complex datasets [53], [54], [55]; regression trees serve as the base estimators even in classification tasks. By comparison, AdaBoost was the first boosting algorithm specifically designed with a proper loss function. Gradient boosting is a more generic algorithm that finds approximate solutions to the additive modeling problem, which makes it more flexible than the AdaBoost framework. In our study, the optimal number of estimators (decision trees) was set to 50.

Extra trees (ET) algorithm: ET is a widely used random-forest-type machine learning algorithm that combines the predictions of multiple decision trees. Although ET builds its decision trees in a simpler way than random forest, it often achieves similar or even better performance by creating many unpruned decision trees from a single training dataset [56]. For regression tasks, predictions are made by averaging the predictions of the decision trees, whereas majority voting is used for classification tasks. In our study, the number of estimators (decision trees) was set to 80.
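For reference, the hyperparameters reported in this section could be collected as in the sketch below. It assumes scikit-learn, which the text does not name; only the values stated above (hidden layers of 10 and 5 neurons, ReLU, Adam, learning rate 0.0001, K = 3, and 20, 20, 50, and 80 estimators) come from the study. Every other setting is left at the library default, which is an assumption, and the SVM is omitted because its configuration is not described in this section.

```python
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Keys follow the abbreviations used in Tables 3 and 4
classifiers = {
    # Two hidden layers (10 and 5 neurons), ReLU activation, Adam solver, learning rate 1e-4
    "MLP": MLPClassifier(
        hidden_layer_sizes=(10, 5),
        activation="relu",
        solver="adam",
        learning_rate_init=1e-4,
    ),
    "BNN": GaussianNB(),  # Gaussian naive Bayes
    "KNN": KNeighborsClassifier(n_neighbors=3, metric="euclidean"),
    "DT": DecisionTreeClassifier(),  # split criterion left at the default (assumption)
    "ABt": AdaBoostClassifier(n_estimators=20),
    "RF": RandomForestClassifier(n_estimators=20),
    "LR": LogisticRegression(),
    "QDA": QuadraticDiscriminantAnalysis(),
    "GB": GradientBoostingClassifier(n_estimators=50),
    "ET": ExtraTreesClassifier(n_estimators=80),
}

for name, model in classifiers.items():
    print(name, type(model).__name__)
```

Each entry exposes the usual `fit(X, y)` and `predict(X)` methods, so the same training and evaluation loop can be applied to all of them.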

The ML models’ performance evaluation

Several metrics are available for assessing the performance of ML approaches. In EEG analysis, the gold standard is typically the set of spike and high-frequency oscillation (HFO) annotations made on the EEG data by an experienced neurologist or epileptologist. To assess an automated detection algorithm, its detections are compared with the visual inspection results, which entails counting the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). True positives are the cases in which both the algorithm and the neurologist detect the epileptic pattern. False positives are the cases in which the algorithm wrongly labels as an epileptic pattern an event that the expert correctly identified as background activity. True negatives are the cases in which both the expert and the algorithm correctly identify background activity. False negatives are the cases in which the algorithm mistakenly classifies as background activity an epileptic pattern that the expert correctly identified. Based on these four counts, various evaluation metrics [57] can be applied to assess the performance of the ML classifiers, such as:

(9) $\text{Sensitivity} = \dfrac{TP}{TP + FN}$

The sensitivity measures the algorithm's capacity to detect the spikes or high-frequency oscillations (HFOs) that are actually present, that is, the proportion of true events that are captured rather than missed.

(10) $\text{Specificity} = \dfrac{TN}{TN + FP}$

The specificity measures the algorithm's capability to correctly identify true negative events, that is, to reject background activity. The balanced classification rate (BCR), or balanced accuracy, summarizes the algorithm's ability both to detect epileptic patterns and to reject non-epileptic patterns, which is particularly useful when dealing with unbalanced data. It is the average of sensitivity and specificity [57], computed as:

(11) $\text{BCR} = \dfrac{1}{2}\left[\dfrac{TP}{TP + FN} + \dfrac{TN}{TN + FP}\right]$

The BCR ranges from 0 to 100 %. A balanced accuracy of 100 % indicates perfect classification, in which both sensitivity and specificity are maximized.
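The three metrics of Eqs. (9) to (11) translate directly into code. The sketch below is illustrative only: the function names and the event counts are hypothetical and do not come from the study's data.

```python
def sensitivity(tp, fn):
    """Fraction of true epileptic patterns that the algorithm detected, Eq. (9)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of background events that the algorithm correctly rejected, Eq. (10)."""
    return tn / (tn + fp)


def bcr(tp, tn, fp, fn):
    """Balanced classification rate: mean of sensitivity and specificity, Eq. (11)."""
    return 0.5 * (sensitivity(tp, fn) + specificity(tn, fp))


# Hypothetical counts for one classifier evaluated against the expert's annotations
tp, fn, tn, fp = 90, 10, 80, 20
print(f"sensitivity = {100 * sensitivity(tp, fn):.2f} %")
print(f"specificity = {100 * specificity(tn, fp):.2f} %")
print(f"BCR         = {100 * bcr(tp, tn, fp, fn):.2f} %")
```

With these counts the sensitivity is 90 %, the specificity 80 %, and the BCR 85 %.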

Results

Recently published studies [14, 36, 37] have demonstrated that visual inspection of spikes and high-frequency oscillations (HFOs) in EEG signals offers valuable insights into the mechanisms underlying epileptic seizures, including the localization of the seizure onset zone (SOZ). Nevertheless, manual (visual) marking is time-consuming and error-prone, which makes an effective automated detector of HFOs and spikes crucial for the systematic analysis of epileptic patterns as potentially useful and reliable clinical biomarkers of epilepsy. The present research focuses on identifying epileptic patterns, namely spikes and HFOs, by means of various existing machine learning approaches designed to cope with diverse healthcare scenarios. As an initial step, we applied recursive feature elimination (RFE) to select the most appropriate features for HFO detection, which identified the power of the filtered EEG signal, its standard deviation, and the average Hilbert instantaneous amplitude as useful features. For spike detection, we selected the mean of the filtered EEG signal and the mean of the Hilbert instantaneous amplitude as discriminative features. Table 3 reports the performance of the various ML classifiers for spike and HFO recognition using these selected features, measured through the sensitivity, specificity, and BCR metrics.

Table 3:

Established comparison between the different ML classifiers’ recorded performance.

Classifier	Sensitivity, % (Spikes)	Sensitivity, % (HFOs)	Specificity, % (Spikes)	Specificity, % (HFOs)	BCR, % (Spikes)	BCR, % (HFOs)
SVM	90.69	96.35	89.45	91.68	90.07	94.01
MLP	87.98	85.88	94.53	92.90	91.25	89.39
BNN	95.73	98.54	64.45	80.92	80.09	89.73
KNN	87.98	94.40	93.75	91.44	90.86	92.92
DT	93.18	90.69	91.93	95.31	92.55	93.00
RF	88.75	96.35	95.31	92.66	92.03	94.50
ABt	88.37	95.86	95.70	91.68	92.03	93.77
LR	94.18	94.40	85.15	90.46	89.66	92.43
QDA	94.57	98.05	74.21	77.75	84.39	87.90
GB	89.14	96.35	95.31	92.42	92.22	94.38
ET	89.53	93.43	90.23	93.39	89.88	93.41

Bold values represent the best statistically significant results.

As Table 3 shows, comparing the results of the various ML approaches leads to the following remarks. First, the Gaussian Naïve Bayes classifier records the highest sensitivity for both spike and HFO identification. Second, the decision tree (DT) classifier scores the highest specificity for HFO identification, while the AdaBoost classifier records the highest specificity for spike recognition (95.70 %). In terms of BCR, the random forest (RF) classifier achieves the highest HFO detection performance, with a BCR of 94.50 %. As to spike detection, the DT classifier outperforms all the other examined methods, with a BCR of 92.55 %. The BCR metric plays a crucial role in our study, as it serves to compare the classifiers' overall performance in detecting epileptic EEG patterns while accounting for sensitivity and specificity simultaneously. To check the validity of these findings, we re-examined the performance of the investigated machine learning (ML) techniques using simulated scalp EEG data with different signal-to-noise ratios (SNR). This methodology evaluates the resilience of the ML methods when Gaussian noise is added to intracranial EEG data, and it is widely recognized as an efficient tool for comparing algorithms in EEG data analysis [25, 36, 37]. The SNR in human scalp EEG typically ranges between 20 and −10 dB [25, 36, 37]; the SNR of the simulated EEG data is therefore varied from 20 to −10 dB.
Although scalp EEG data are extremely important in evaluating the effectiveness of ML techniques, they are susceptible to artifacts, such as eye movement and muscular artifacts, which might compromise their reliability as a gold standard. Intracranial EEG data mixed with Gaussian noise therefore offer a useful alternative benchmark for assessing the performance of the various ML techniques. Accordingly, we examined the behavior of the ML classifiers at each SNR level by computing the corresponding BCR values. A sample illustration of simulated scalp EEG signals with varying SNR levels is depicted in Figure 2.
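Mixing a clean signal with Gaussian noise at a prescribed SNR can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes the usual power-ratio definition SNR(dB) = 10 log10(P_signal / P_noise), and the sine-wave stand-in for an iEEG trace, the sampling rate, and the function names are all hypothetical.

```python
import math
import random


def signal_power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def add_gaussian_noise(signal, snr_db, seed=0):
    """Add zero-mean Gaussian noise scaled so that the result has the requested SNR."""
    rng = random.Random(seed)
    noise_power = signal_power(signal) / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB, computed from the clean signal and the added noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10 * math.log10(signal_power(clean) / signal_power(noise))


fs = 2000  # hypothetical sampling rate in Hz
clean = [math.sin(2 * math.pi * 10 * t / fs) for t in range(20 * fs)]  # stand-in signal
for snr_db in (20, 10, 0, -10):
    noisy = add_gaussian_noise(clean, snr_db)
    print(snr_db, round(measured_snr_db(clean, noisy), 2))
```

The measured SNR differs slightly from the target because the noise power is estimated from a finite number of samples.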

Figure 2:

Distinct EEG signals synchronization with varying noise levels.

We next investigated the performance of the ML methods in the presence of Gaussian noise. For each signal-to-noise ratio (SNR), the BCR for epileptic pattern (spike, HFO) detection was obtained by averaging the BCR results over 10 trials. Figure 3 and Table 4 display the results of the investigated methods for automated detection at various SNR levels. The BCR values were compared to determine the impact of noise on the detection accuracy of each ML method. For spike detection, the RF-based method records the highest average BCR across SNR levels, highlighting its reliability. Table 4 likewise shows that the RF-based approach scores the highest average BCR across SNR levels for HFO identification, making it the most efficient method for that task. Combining the iEEG and noisy data, the results show that the RF method performs best among the examined methods for both HFO and spike detection.

Figure 3:

BCR values for spike and HFO detection at various SNR levels for each ML technique. Left: BCR curves for spikes; right: BCR curves for HFOs.

Table 4:

BCR performances with and without noise.

ML method		SVM	MLP	BNN	KNN	DT	ABt	RF	LR	QDA	GB	ET
BCR value using noisy EEG data	Spikes	73.49	76.70	90.45	84.90	91.62	88.49	92.73	81.745	90.49	81.74	89.72
BCR value using noisy EEG data	HFOs	57.67	56.19	58.05	58.9	59.55	62.20	63.77	57.00	60.13	60.65	61.45
BCR value using intracranial iEEG data	Spikes	90.07	91.25	80.09	90.86	92.55	92.03	92.03	89.66	84.39	92.22	89.88
BCR value using intracranial iEEG data	HFOs	94.01	89.39	89.73	92.92	93.00	94.50	93.77	92.43	87.90	94.38	93.41
Average BCR value using iEEG + noisy data	Spikes	81.78	83.97	85.27	87.88	92.08	90.26	*92.38*	85.70	87.44	86.98	89.8
Average BCR value using iEEG + noisy data	HFOs	75.84	72.79	73.89	75.91	76.27	78.35	*78.77*	74.71	74.01	77.51	77.43

Bold values represent the best statistically significant results. Values in italics represent the average results between BCR value using noisy EEG data and BCR value using intracranial iEEG data.
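The averages in the last two rows of Table 4 can be reproduced from the first four rows. The sketch below copies the values exactly as printed in Table 4; the variable names are hypothetical.

```python
methods = ["SVM", "MLP", "BNN", "KNN", "DT", "ABt", "RF", "LR", "QDA", "GB", "ET"]

# BCR (%) values as printed in Table 4
noisy = {
    "Spikes": [73.49, 76.70, 90.45, 84.90, 91.62, 88.49, 92.73, 81.745, 90.49, 81.74, 89.72],
    "HFOs": [57.67, 56.19, 58.05, 58.9, 59.55, 62.20, 63.77, 57.00, 60.13, 60.65, 61.45],
}
ieeg = {
    "Spikes": [90.07, 91.25, 80.09, 90.86, 92.55, 92.03, 92.03, 89.66, 84.39, 92.22, 89.88],
    "HFOs": [94.01, 89.39, 89.73, 92.92, 93.00, 94.50, 93.77, 92.43, 87.90, 94.38, 93.41],
}

best = {}
for pattern in ("Spikes", "HFOs"):
    # Average of the noisy-data BCR and the iEEG BCR for every method
    averages = {
        m: (n + i) / 2 for m, n, i in zip(methods, noisy[pattern], ieeg[pattern])
    }
    top = max(averages, key=averages.get)
    best[pattern] = (top, averages[top])
    print(pattern, top, f"{averages[top]:.2f} %")
```

RF obtains the largest average in both rows, at 92.38 % for spikes and 78.77 % for HFOs.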

Figure 4 illustrates the RF classifier's performance in detecting spike events in iEEG signals. Three true positive spikes are correctly identified, while one false positive detection remains; it could be eliminated by duration thresholding in a post-processing step. Figure 4 also illustrates the RF classifier's accuracy in identifying high-frequency oscillations (HFOs), with all HFO events present in the EEG segment successfully detected.

Figure 4:

A screenshot of the spike and HFO detection results achieved by the RF classifier (left: spike detection; right: HFO detection).

Discussion

In sum, the ML models do not contribute equally to the detection of epileptic patterns. Overall, the RF classifier stands out as one of the most efficient models for recognizing both spikes and HFOs in EEG signals: it consistently achieved high average BCR values across the intracranial and noisy EEG data. However, the BCR achieved for HFO detection (78.77 %) remains fairly poor. Two main drawbacks related to HFO detection have been identified. The first is the low amplitude of HFOs, which makes them harder for algorithms to detect. The second is the false detection of HFOs, addressed in detail in [35], a study examining the outcomes of three commonly used filtering methods: spurious frequency components, resulting primarily from the filtering of transient activities such as sharp spikes without HFOs and artifacts, can lead to spurious HFO detections. For spike detection as well, the achieved BCR (92.38 %) leaves room for improvement. False spike detections may be due to rhythmic EMG and eye-blink artifacts [58]. Still, achieving a 100 % BCR, by detecting all true HFOs and spikes in the EEG signals while rejecting all background activity, is extremely challenging, as a trade-off between sensitivity and specificity is typically unavoidable.
In conclusion, given that a single method might not suffice to eliminate all false detections, integrating multiple machine learning (ML) approaches into a single hybrid model [59, 60] could provide an effective way to address false HFO and false spike detections. Such an architecture would determine, through a voting process, whether the outcomes of the various methods agree or disagree. Moreover, applying morphological filtering as an EEG pre-processing stage could further improve the current results [27].
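One simple form that such a voting process could take is hard majority voting over the per-event decisions of several classifiers. The sketch below is purely illustrative: the study does not specify the fusion rule, and the function name, the classifier subset, and the decision vectors are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse per-event 0/1 decisions of several classifiers by majority vote.

    predictions maps a classifier name to its list of labels, one per candidate event.
    """
    names = list(predictions)
    n_events = len(predictions[names[0]])
    fused = []
    for i in range(n_events):
        votes = Counter(predictions[name][i] for name in names)
        fused.append(votes.most_common(1)[0][0])
    return fused


def agreement(predictions):
    """Fraction of events on which all classifiers give the same label."""
    columns = list(zip(*predictions.values()))
    return sum(len(set(col)) == 1 for col in columns) / len(columns)


# Hypothetical decisions of three classifiers on four candidate events
decisions = {
    "RF": [1, 0, 1, 1],
    "DT": [1, 0, 0, 1],
    "GB": [0, 0, 1, 1],
}
print(majority_vote(decisions), agreement(decisions))
```

Using an odd number of classifiers avoids ties in the binary case; events on which the classifiers disagree could be flagged for expert review.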

Conclusions & perspectives

Visual monitoring of epileptic EEG patterns in epilepsy recordings is time-consuming, subjective, and potentially error-prone, and it poses several challenges, particularly for scalp EEG data. The present study aims to support neuroscientists and biomedical engineers studying epileptic biomarkers, in a context of rising demand for sensitive, specific, accurate, and effective techniques for identifying epileptic patterns in EEG signals. Advanced machine learning (ML) techniques have emerged as valuable tools for swiftly and accurately detecting epileptic patterns in EEG recordings, overcoming the limitations of visual screening by neurologists. Indeed, ML-based methods have proven beneficial not only in identifying the epileptogenic zone (EZ) and detecting epilepsy, but also in improving the understanding of the functional relevance of high-frequency oscillations (HFOs) and spikes in both normal and pathological brain processes. The present research assessed 11 commonly used ML methods for epileptic pattern recognition and evaluated their robustness. The RF classifier was found to be the most effective method for detecting epileptic patterns in EEG signals. Yet, even though it outperforms the other ML methods, with average BCR scores of 92.38 % for spike recognition and 78.77 % for HFO classification, these performances still need to be improved in order to reduce false detections of spurious HFOs and false spikes in EEG.
To cope with these challenges, we envision integrating the relevant ML algorithms into a single hybrid model in which agreement and disagreement among the methods are resolved through a voting process. As future work, we intend to further validate our main findings on a larger EEG dataset. We also aim to explore generative adversarial networks (GANs) to generate synthetic EEG signals, or to combine signal generation techniques with deep learning approaches. Through this line of work, we expect to further improve the performance of automated detection methods and thereby advance epileptic EEG pattern recognition.

Corresponding author: Sahbi Chaibi, AFD2E Laboratory, National Engineering School, Sfax University, Sfax, Tunisia; and Faculty of Sciences of Monastir, Monastir University, Monastir, Tunisia, Phone: +21655385405; E-mail: sahbi.chaibi@fsm.rnu.tn

Acknowledgments

The authors of this study would like to express their gratitude to the staff at the Montreal Neurological Institute and Hospital of Canada for kindly providing the database used in the present study to assess and compare the performance metrics of all tried-and-true machine learning techniques. Additionally, we would like to express our sincere gratitude to Pr. Mohamed Dogui, who served as the previous Chief of Service for Functional Exploration of the Nervous System at CHU Sousse for assisting us in visual marking process of both HFOs and spikes patterns in intracranial EEG signals.

Research ethics: Not applicable.
Informed consent: Not applicable.
Author contributions: Mr. Sahbi Chaibi is the corresponding author, who collected the EEG data and wrote the paper. Ms. Chahira Mahjoub checked the English of different parts of the paper. Mr. Abdennaceur Kachouri supervised the present work.
Competing interests: The authors state no conflict of interest.
Research funding: None declared.
Data availability: The raw data can be obtained on request from the corresponding author.

References

1. Mason, J. Perception, interpretation and decision-making understanding gaps between competence and performance. Int J Math Educ 2016;48. https://doi.org/10.1007/s11858-016-0764-1.

2. Briganti, G, Le Moine, O. Artificial intelligence in medicine: today and tomorrow. Front Med 2020;7:27. https://doi.org/10.3389/fmed.2020.00027.

3. Abdulhay, E, Arunkumar, N, Narasimhan, K, Vellaiappan, E, Venkatraman, V. Gait and tremor investigation using machine learning techniques for the diagnosis of Parkinson disease. Future Generat Comput Syst 2018;83:366–73. https://doi.org/10.1016/j.future.2018.02.009.

4. Mirzaei, G, Adeli, H. Machine learning techniques for diagnosis of alzheimer disease, mild cognitive disorder, and other types of dementia. Biomed Signal Process Control 2022;72. https://doi.org/10.1016/j.bspc.2021.103293.

5. Rahman, MJ, Mahajan, R, Morshed, B. Exacerbation in obstructive sleep apnea: early detection and monitoring using a single channel EEG with quadratic discriminant analysis. In: 9th International IEEE/EMBS conference on neural engineering; 2019:85–8 pp. https://doi.org/10.1109/NER.2019.8717054.

6. Siddiqui, MK, Morales-Menendez, R, Huang, X, Hussain, N. A review of epileptic seizure detection using machine learning classifiers. Brain Inform 2020;7:5. https://doi.org/10.1186/s40708-020-00105-1.

7. Abdulhamit, S, Jasmin, K, Canbaz, MA. Epileptic seizure detection using hybrid machine learning methods. Neural Comput Appl 2019;31:317–25. https://doi.org/10.1007/s00521-017-3003-y.

8. Guo, J, Xiao, N, Li, H, He, L, Li, Q, Wu, T, et al. Transformer-based high-frequency oscillation signal detection on magnetoencephalography from epileptic patients. Front Mol Biosci 2022;9:822810. https://doi.org/10.3389/fmolb.2022.822810.

9. Puspita, J, Soemarno, G, Jaya, AI, Soewono, E. Interictal Epileptiform Discharges (IEDs) classification in EEG data of epilepsy patients. J Phys Conf 2017;943:012030. https://doi.org/10.1088/1742-6596/943/1/012030.

10. Zacharaki, EI, Mporas, I, Garganis, K, Megalooikonomou, V. Spike pattern recognition by supervised classification in low dimensional embedding space. Brain Inform 2016;3:73–83. https://doi.org/10.1007/s40708-016-0044-4.

11. Abd El-samie, FE, Alotaiby, TN, Khalid, MI, Alshebeili, SA, Aldosari, SA. A review of EEG and MEG epileptic spike detection algorithms. IEEE Access 2018;6:60673–88. https://doi.org/10.1109/ACCESS.2018.2875487.

12. Boran, E, Stieglitz, L, Sarnthein, J. Epileptic high-frequency oscillations in intracranial EEG are not confounded by cognitive tasks. Front Hum Neurosci 2021;15:613125. https://doi.org/10.3389/fnhum.2021.613125.

13. Pail, M, Cimbalnik, J, Roman, R, Daniel, P, Shaw, DJ, Chrastina, J, et al. High frequency oscillations in epileptic and non-epileptic human hippocampus during a cognitive task. Sci Rep 2020;10:18147. https://doi.org/10.1038/s41598-020-74306-3.

14. Khalilov, I, Le Van Quyen, M, Gozlan, H, Ben-Ari, Y. Epileptogenic actions of GABA and fast oscillations in the developing hippocampus. Neuron 2005;48:787–96. https://doi.org/10.1016/j.neuron.2005.09.026.

15. Wang, Y, Zhou, D, Yang, X, Xu, X, Ren, L, Yu, T, et al. Expert consensus on clinical applications of high-frequency oscillations in epilepsy. Acta Epileptologica 2020;2:8. https://doi.org/10.1186/s42494-020-00018-w.

16. Fiaidhi, J, Wadiwala, T, Trikha, V. Analyzing brain signals to predict seizure events using machine learning technique. Int J Biosci Biotechnol 2020;12:35–46. https://doi.org/10.21742/IJBSBT.2020.12.1.05.

17. Slimen, IB, Boubchir, L, Mbarki, Z, Seddik, H. EEG epileptic seizure detection and classification based on dual-tree complex wavelet transform and machine learning algorithms. J Biomed Res 2020;34:151–61. https://doi.org/10.7555/JBR.34.20190026.

18. Sagi, V, Steven Evans, M. Relationship between high-frequency oscillations and spikes in a case of temporal lobe epilepsy. Epilepsy Behav Case Rep 2016;6:10–12. https://doi.org/10.1016/j.ebcr.2016.04.006.

19. Ahmad, MA, Ayaz, Y, Jamil, M, Omer Gillani, S, Rasheed, MB, Imran, M, et al. Comparative analysis of classifiers for developing an adaptive computer-assisted EEG analysis system for diagnosing epilepsy 2015;2015:638036. https://doi.org/10.1155/2015/638036.

20. Zijlmans, M, Jacobs, J, Zelmann, R, Dubeau, F, Gotman, J. High-frequency oscillations mirror disease activity in patients with epilepsy. Neurology 2009;72:979–86. https://doi.org/10.1212/01.wnl.0000344402.20334.81.

21. Roehri, N, Pizzo, F, Lagarde, S, Lambert, I, Nica, A, McGonigal, A, et al. High-frequency oscillations are not better biomarkers of epileptogenic tissues than spikes. Ann Neurol 2018;83:84–97. https://doi.org/10.1002/ana.25124.

22. Wang, X, Gong, G, Li, N, Qiu, S. Detection analysis of epileptic EEG using a novel random forest model combined with grid search optimization. Front Hum Neurosci 2019;13:52. https://doi.org/10.3389/fnhum.2019.00052.

23. Chaibi, S, Lajnef, T, Samet, M, Jerbi, K, Kachouri, A. Detection of High Frequency Oscillations (HFOs) in the 80–500 Hz range in epilepsy recordings using decision tree analysis. In: International image processing, applications and systems conference; 2014. https://doi.org/10.1109/IPAS.2014.7043321.

24. Bagheri, E, Jin, J, Dauwels, J, Cash, S, Westover, MB. Classifier cascade to aid in detection of epileptiform transients in interictal EEG. In: Proc IEEE Int Conf Acoust Speech Signal Process; 2018:970–4 pp. https://doi.org/10.1109/ICASSP.2018.8461992.

25. Jrad, N, Kachenoura, A, Merlet, I, Wendling, F. Automatic detection and classification of high-frequency oscillations in depth-EEG signals. In: IEEE transactions on bio-medical engineering; 2016:1 p. https://doi.org/10.1109/EMBC.2015.7318427.

26. Sahu, R, Dash, SR, Cacha, LA, Poznanski, RR, Parida, S. Epileptic seizure detection: a comparative study between deep and traditional machine learning techniques. J Integr Neurosci 2020;19:1–9. https://doi.org/10.31083/j.jin.2020.01.24.

27. Chaibi, S, Lajnef, T, Sakka, Z, Samet, M, Kachouri, A. A reliable approach to distinguish between transient with and without HFOs using TQWT and MCA. J Neurosci Methods 2014;232:36–46. https://doi.org/10.1016/j.jneumeth.2014.04.025.

28. Khanwani, P, Sridhar, S, Vijaylakshmi, K. Automated event detection of epileptic spikes using neural networks. Int J Comput Appl 2010;2. https://doi.org/10.5120/660-928.

29. Dümpelmann, M, Jacobs, J, Kerber, K, Schulze-Bonhage, A. Automatic 80–250 Hz “ripple” high frequency oscillation detection in invasive subdural grid and strip recordings in epilepsy by a radial basis function neural network. Clin Neurophysiol 2012;123:1721–31. https://doi.org/10.1016/j.clinph.2012.02.072.

30. Ayman, U, Zia, MS, Okon, OD, Rehman, N-U, Meraj, T, Ragab, AE, et al. Epileptic patient activity recognition system using extreme learning machine method. Biomedicines 2023;11:816. https://doi.org/10.3390/biomedicines11030816.

31. Ilakiyaselvan, N, Nayeemulla Khan, A, Shahina, A. Deep learning approach to detect seizure using reconstructed phase space images. J Biomed Res 2020;34:240–50. https://doi.org/10.7555/JBR.34.20190043.

32. Quintero-Rincón, A, Prendes, J, Muro, V, D’Giano, C. Study on spike-and-wave detection in epileptic signals using T-location-scale distribution and the K-nearest neighbors’ classifier. In: 2017 IEEE URUCON. Montevideo, Uruguay; 2017:1–4 pp. https://doi.org/10.1109/URUCON.2017.8171869.

33. Kbah, SNS, Al-Qazzaz, NK, Jaafer, SH, Sabir, MK. Epileptic EEG activity detection for children using entropy-based biomarkers. Neurosci Inform 2022;2:100101. https://doi.org/10.1016/j.neuri.2022.100101.Search in Google Scholar

34. Sahu, R, Ranjan Dash, S, Cacha, LA, Poznanski, RR, Parida, S. Epileptic seizure detection: a comparative study between deep and traditional machine learning techniques. J Integr Neurosci 2020;19:1–9. https://doi.org/10.31083/j.jin.2020.01.24.Search in Google Scholar PubMed

35. Chaibi, S, Mahjoub, C, Krikid, F, Karfoul, A, Jeannes, R, Kachouri, A. Pitfalls of spikes filtering for detecting high frequency oscillations (HFOs). In: 18th international multi-conference on systems, signals & devices; 2021:427–32 pp.10.1109/SSD52085.2021.9429411Search in Google Scholar

36. Chaibi, S, Lajnef, T, Ghrob, A, Samet, M, Kachouri, A. A robustness comparison of two algorithms used for EEG spike detection. Open Biomed Eng J 2015;9:151–6. https://doi.org/10.2174/1874120701509010151.Search in Google Scholar PubMed PubMed Central

37. Krikid, F, Karfoul, A, Chaibi, S, Kachenoura, A, Nica, A, Kachouri, A, et al.. Classification of high frequency oscillations in intracranial EEG signals based on coupled time-frequency and image-related features. Biomed Signal Process Control 2022;73:103418. https://doi.org/10.1016/j.bspc.2021.103418.Search in Google Scholar

38. Brownlee, J. Recursive feature elimination (RFE) feature selection in Python; 2020. Tutorial in machine learning mastery.Search in Google Scholar

39. Yin, Z, Wang, Y, Liu, L, Zhang, W, Zhang, J. Cross-subject EEG feature selection for emotion recognition using transfer recursive feature elimination. Front Neurorob 2017;11:19. https://doi.org/103389/fnbot.2017.00019.10.3389/fnbot.2017.00019Search in Google Scholar PubMed PubMed Central

40. Shi, M, Wang, C, Li, XZ, Li, M-Q, Wang, L, Xie, N-G. EEG signal classification based on SVM with improved squirrel search algorithm. Biomed Eng/Biomedizinische Technik 2021;66:137–52. https://doi.org/10.1515/bmt-2020-0038.Search in Google Scholar PubMed

41. Abenna, S, Nahid, M, Bouyghf, H. An enhanced EEG prediction system for motor cortex-imagery tasks using SVM. In: 10th international conference on innovation, modern applied science & environmental studies; 2022.10.1051/e3sconf/202235101026Search in Google Scholar

42. Sridhar, GV, Mallikarjuna Rao, P. A neural network approach for EEG classification in BCI. Int J Comput Sci Telecommun 2012;3. https://api.semanticscholar.org/CorpusID:44915049.Search in Google Scholar

43. Bird, JJ, Kobylarz, J, Faria, DR, Ekárt, A, Ribeiro, EP. Cross-domain MLP and CNN transfer learning for biological signal processing: EEG and EMG. IEEE Access 2020;54789–801. https://doi.org/10.1109/ACCESS.2020.2979074.Search in Google Scholar

44. Lestari, F, Haekal, M, Edison, RE, Fauzy, FR, Khotimah, SN, Haryanto, F. Epileptic seizure detection in EEGs by using random tree forest, Naïve Bayes and KNN classification. J Phys Conf 2020;1505:012055. https://doi.org/10.1088/1742-6596/1505/1/012055.Search in Google Scholar

45. Bablani, A, Edla, DR, Dodia, S. Classification of EEG data using k-nearest neighbor approach for concealed information test. Procedia Comput Sci 2018;143:242–9. https://doi.org/10.1016/j.procs.2018.10.392.Search in Google Scholar

46. Li, M, Xu, H, Liu, X, Lu, S. Emotion recognition from multichannel EEG signals using K-nearest neighbor classification. Technol Health Care 2018;26:1–11. https://doi.org/10.3233/THC-174836.Search in Google Scholar PubMed PubMed Central

47. Bastos, NS, Marques, BP, Adamatti, D, Billa, C. Analyzing EEG signals using decision trees: a study of modulation of amplitude. Comput Intell Neurosci 2020;4:1–11. https://doi.org/10.1155/2020/3598416.Search in Google Scholar PubMed PubMed Central

48. Hu, J. Automated detection of driver fatigue based on AdaBoost classifier with EEG signals. Front Comput Neurosci 2017;11:72. https://doi.org/10.3389/fncom.2017.00072.Search in Google Scholar PubMed PubMed Central

49. Wang, X, Gong, G, Li, N, Qiu, S. Detection analysis of epileptic EEG using a novel random forest model combined with grid search optimization. Front Hum Neurosci 2019;13:52. https://doi.org/10.3389/fnhum.2019.00052.Search in Google Scholar PubMed PubMed Central

50. Sugianela, Y, Sutino, QL, Herumurti, D. EEG classification for epilepsy based on wavelet packet decomposition and random forest. Jurnal Ilmu Komputer dan Informasi 2018;11:27. https://doi.org/10.21609/jiki.v11i1.549.Search in Google Scholar

51. Pan, C, Shi, C, Mu, H, Li, J, Gao, X. EEG-based emotion recognition using logistic regression with Gaussian kernel and Laplacian prior and investigation of critical frequency bands. Appl Sci 2020;10:1619. https://doi.org/10.3390/app10051619.Search in Google Scholar

52. Ghojogh, B, Crowley, M. Linear and quadratic discriminant analysis. Tutorials; 2019. arXiv preprint arXiv:1906.02590.Search in Google Scholar

53. Sunaryono, D, Sarno, R, Siswantoro, J. Gradient boosting machines fusion for automatic epilepsy detection from EEG signals based on wavelet features. J King Saud Univ – Comput Inf Sci 2022;34:9591–607. https://doi.org/10.1016/j.jksuci.2021.11.015.Search in Google Scholar

54. Asjid Tanveer, M, Salman, A. Epileptic seizure classification using gradient tree boosting classifier. In: Proceedings of the 9th international conference on biomedical engineering and technology 2019.10.1145/3326172.3326182Search in Google Scholar

55. Wang, X, Gong, G, Li, N. Automated recognition of epileptic EEG states using a combination of symlet wavelet processing, gradient boosting machine, and grid search optimizer. Sensors 2019;19:219. https://doi.org/10.3390/s19020219.Search in Google Scholar PubMed PubMed Central

56. Anuragi, A, Sisodia, DS, Pachori, RB. Epileptic-seizure classification using phase-space representation of FBSE-EWT based EEG sub-band signals and ensemble learners. Biomed Signal Process Control 2022:71. https://doi.org/10.1016/j.bspc.2021.103138.Search in Google Scholar

57. Zolghadr, Z, Amirhossein Batouli, S, Tehrani-Doost, M, Shafaghi, L, Hadjighassem, M, Majd, HA, et al.. High-dimension low-sample-size modeling by sparse functional connectivity states in subjects with attention deficit-hyperactivity disorder and healthy controls. Arch Neurosci 2023;10:e134329. https://doi.org/10.5812/ans-134329.Search in Google Scholar

58. Ji, Z, Sugi, T, Goto, S, Wang, X, Ikeda, A, Nagamine, T, et al.. An automatic spike detection system based on elimination of false positives using the large-area context in the scalp EEG. IEEE Trans Biomed Eng 2011;58:2478–88. https://doi.org/10.1109/TBME.2011.2157917.Search in Google Scholar PubMed

59. Asad, R, Altaf, S, Ahmad, S, Mahmoud, H, Huda, S, Iqbal, S. Machine learning-based hybrid ensemble model achieving precision education for online education amid the lockdown period of COVID-19 pandemic in Pakistan. Sustainability 2023;15:5431. https://doi.org/10.3390/su15065431.Search in Google Scholar

60. Kumari, S, Kumar, D, Mittal, M. An ensemble approach for classification and prediction of diabetes mellitus using soft voting classifier. Int J Cognit Comput Eng 2021;2:40–6. https://doi.org/10.1016/j.ijcce.2021.01.001.Search in Google Scholar

Received: 2022-12-13

Accepted: 2023-10-09

Published Online: 2023-10-30

Published in Print: 2024-04-25

https://doi.org/10.1515/bmt-2023-0332

Keywords for this article

epilepsy; iEEG signal; HFO; spike; machine learning; random forest