Multi-objective evolution of oblique decision trees for imbalanced data binary classification
Imbalanced data classification is one of the most challenging problems in data mining. In this kind of problems, we have two types of classes: the majority class and the minority one. The former has a relatively high number of instances while the latter contains a much less number of instances. As most traditional classifiers usually assume that data is evenly distributed for all classes, they may considerably fail in recognizing instances in the minority class due to the imbalance problem. Several interesting approaches have been proposed to handle the class imbalance issue in the literature and the Oblique Decision Tree (ODT) is one of them. Nevertheless, most standard ODT construction algorithms use a greedy search process; while only very few works have addressed this induction problem using an evolutionary approach and this is done without really considering the class imbalance issue. To cope with this limitation, we propose in this paper a multi-objective evolutionary approach to find optimized ODTs for imbalanced binary classification. Our approach, called ODT-Θ-NSGA-III (ODT-based-Θ-Nondominated Sorting Genetic Algorithm-III), is motivated by its abilities: (a) to escape local optima in the ODT search space and (b) to maximize simultaneously both Precision and Recall. Thanks to these two features, ODT-Θ-NSGA-III provides competitive and better results when compared to many state-of-the-art classification algorithms on commonly used imbalanced benchmark data sets.
Swarm and Evolutionary Computation
Digital Object Identifier (DOI)