Date of Award
Doctor of Philosophy in Analytic and Data Science
Statistics and Analytical Sciences
Jing (Selena) He
Epilepsy is a chronic neurological disorder caused by abnormal activity of neurons in the brain, and it affects approximately 50 million people worldwide. It can harm patients’ health and lead to life-threatening emergencies. Early detection allows timely treatment and is highly effective in preventing seizures. The electroencephalogram (EEG) signal, which carries valuable information about the electrical activity of the brain, is a standard neuroimaging tool that clinicians use to monitor and diagnose epilepsy. Visual inspection of EEG signals is expensive, tedious, and error-prone, and different neurophysiologists may interpret an identical reading differently. Automatically classifying EEG recordings into epileptic states with high accuracy is therefore an urgent requirement and has long been investigated. This PhD thesis contributes to the epileptic seizure detection problem using Machine Learning (ML) techniques.
Machine learning algorithms have been used to automatically classify epilepsy from EEG data. Imbalanced class distributions and effective feature extraction from EEG signals are the two major concerns in applying these algorithms effectively and efficiently to epilepsy classification. When classes are imbalanced, the algorithms produce results biased towards the majority class, whereas effective feature extraction can improve classification performance.
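To make the class imbalance concern concrete, the sketch below computes an imbalance ratio and rebalances a toy dataset by random oversampling. It is only an illustration: random oversampling is one traditional sampling technique chosen here for simplicity, and the function names, the toy data, and the 8-to-2 class split are assumptions, not the thesis's data or implementation.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class until
    every class is as large as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy data (assumed): 8 non-seizure segments (0) and 2 seizure segments (1).
X = [[float(i)] for i in range(10)]
y = [0] * 8 + [1] * 2

X_bal, y_bal = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_bal))
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies of the same segment into the test set.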
In this thesis, we present three novel frameworks that classify epileptic states effectively while addressing the above issues. First, we propose a deep neural network-based framework in which both traditional and state-of-the-art sampling techniques are evaluated for their ability to improve the imbalance ratio and classification performance. Second, we propose an integrated machine learning-based framework that learns from imbalanced EEG data by using Principal Component Analysis to extract high- and low-variance principal components, which are empirically customized for imbalanced data classification. This study shows that principal components associated with low variances can capture implicit patterns of the minority class of a dataset. Third, we propose a framework that classifies epilepsy using summary statistics of window-based features of EEG signals. The framework first denoises the signals using power spectral density analysis and replaces outliers using a k-NN imputer. Window-level features are then extracted from the statistical, temporal, and spectral domains, and basic summary statistics are computed from the extracted features. An optimal set of features is then selected by variance thresholding and by dropping correlated features, and the selected features are fed into different machine learning classifiers.
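The window-based steps of the third framework can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the thesis's implementation: denoising, k-NN imputation, and spectral features are omitted, and the chosen features, the non-overlapping windows, the thresholds, and all function names are hypothetical.

```python
import statistics


def window_features(signal, window_size):
    """Split a signal into non-overlapping windows and compute a few
    simple statistical and temporal features per window."""
    rows = []
    for start in range(0, len(signal) - window_size + 1, window_size):
        w = signal[start:start + window_size]
        rows.append({
            "mean": statistics.fmean(w),
            "std": statistics.pstdev(w),
            # Temporal feature: total absolute first difference.
            "abs_diff": sum(abs(b - a) for a, b in zip(w, w[1:])),
        })
    return rows


def summarize(rows):
    """Collapse window-level features into one vector of summary statistics."""
    out = {}
    for name in rows[0]:
        vals = [r[name] for r in rows]
        out[f"{name}_mean"] = statistics.fmean(vals)
        out[f"{name}_min"] = min(vals)
        out[f"{name}_max"] = max(vals)
    return out


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def select_features(table, var_threshold=1e-8, corr_threshold=0.95):
    """Drop near-constant columns, then drop any column that is highly
    correlated with a column already kept. `table` is a list of dicts."""
    names = list(table[0])
    cols = {n: [row[n] for row in table] for n in names}
    kept = []
    for n in names:
        if statistics.pvariance(cols[n]) <= var_threshold:
            continue
        if any(abs(pearson(cols[n], cols[k])) > corr_threshold for k in kept):
            continue
        kept.append(n)
    return kept


# Toy signal (assumed): 8 samples split into two windows of 4.
signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
rows = window_features(signal, 4)
summary = summarize(rows)

# Toy feature table (assumed): "b" duplicates "a" up to scale, "c" is constant.
table = [
    {"a": 1.0, "b": 2.0, "c": 5.0, "d": 1.0},
    {"a": 2.0, "b": 4.0, "c": 5.0, "d": 0.0},
    {"a": 3.0, "b": 6.0, "c": 5.0, "d": 1.0},
]
kept = select_features(table)
```

In practice, one summary vector would be computed per EEG recording, the vectors stacked into a feature table, and the selection step fitted on the training split only.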
To classify epilepsy, we apply traditional machine learning classifiers such as Support Vector Machine, Decision Tree, Random Forest, and k-Nearest Neighbors, along with Deep Neural Networks. We evaluate the frameworks on a benchmark dataset under rigorous experimental settings and demonstrate their effectiveness in terms of accuracy, precision, recall, and F-beta score.
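For reference, the evaluation metrics named above can be computed as in the following minimal sketch. The labels and predictions are made-up toy values, not results from the thesis, and the function name is hypothetical. The F-beta score uses the standard definition (1 + beta^2) * P * R / (beta^2 * P + R), where beta > 1 weights recall more heavily than precision.

```python
def precision_recall_fbeta(y_true, y_pred, beta=1.0, positive=1):
    """Precision, recall, and F-beta for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    b2 = beta ** 2
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy imbalanced example (assumed): 4 seizure (1) and 6 non-seizure (0) labels.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

acc = accuracy(y_true, y_pred)
p, r, f1 = precision_recall_fbeta(y_true, y_pred, beta=1.0)
_, _, f2 = precision_recall_fbeta(y_true, y_pred, beta=2.0)
print(acc, p, r, f1, f2)
```

On imbalanced data, accuracy alone can look acceptable while recall on the minority class stays low, which is why precision, recall, and F-beta are reported alongside it.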
Masum, Mohammad, "Integrated Machine Learning Approaches to Improve Classification performance and Feature Extraction Process for EEG Dataset" (2021). Analytics and Data Science Dissertations. 10.