Faculty Articles

Deep convolutional neural network architecture design as a bi-level optimization problem

Hassen Louati, Institut Supérieur de Gestion de Tunis
Slim Bechikh, Kennesaw State University
Ali Louati, Prince Sattam Bin Abdulaziz University
Chih Cheng Hung, Kennesaw State University
Lamjed Ben Said, Institut Supérieur de Gestion de Tunis

Department

Computer Science

Document Type

Article

Publication Date

6-7-2021

Abstract

Over the last decade, deep neural networks have shown strong performance in many machine learning tasks such as classification and clustering. One of the most successful networks is the Convolutional Neural Network (CNN), which has been applied in many domains such as pattern recognition, medical diagnosis, and signal processing. Despite the strong performance of CNNs, their architecture design remains a major challenge for researchers and practitioners. Several works in the literature have aimed to find optimized architectures, such as ResNet and VGGNet. Unfortunately, most of these architectures are either manually defined by experts or automatically designed by greedy induction algorithms. Recent works suggest the use of Evolutionary Algorithms (EAs) thanks to their ability to escape locally optimal architectures. Although EAs have shown interesting performance, researchers in this direction have treated the design task as a single-level optimization problem, which is the main research gap we tackle in this paper. Our main contribution rests on the observation that CNN architecture design has a hierarchical nature and thus can be seen as a Bi-Level Optimization Problem (BLOP) where: (1) the upper level minimizes the network complexity, defined by the number of blocks and the number of nodes per block; and (2) the lower level optimizes the topologies of the convolution block graphs by maximizing the classification accuracy. Motivated by the novelty of this observation with respect to the state of the art, we frame, for the first time, the CNN architecture design problem as a BLOP and then solve it using an adapted version of an existing efficient bi-level EA, by defining the solution encoding, the fitness function, and the variation operators at each level.
The adapted EA, named BLOP-CNN, is assessed on the image classification task using the commonly employed CIFAR-10 and CIFAR-100 benchmark data sets. The analysis of our experimental results shows the merits of the proposed method in providing the user with optimized architectures that outperform many recent and prominent architectures from three different approaches: manual design, reinforcement learning-based generation, and evolutionary optimization. Moreover, to show the applicability of our approach, we have conducted a case study on the detection of COVID-19 using a set of benchmark chest X-ray and Computed Tomography (CT) images.
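
The nested structure described in the abstract can be illustrated with a small toy sketch. This is not the BLOP-CNN algorithm: the evolutionary operators are replaced here by plain enumeration at the upper level and random search at the lower level, and no network is trained. The function names, the `surrogate_accuracy` score, and the `min_accuracy` threshold are all hypothetical and introduced only for illustration; in particular, the threshold is an assumption added so that minimizing complexity does not trivially select the smallest network.

```python
import itertools
import random


def random_topology(n_nodes, rng):
    """Sample a block graph as a set of forward edges (i, j) with i < j."""
    return frozenset(
        (i, j)
        for i, j in itertools.combinations(range(n_nodes), 2)
        if rng.random() < 0.5
    )


def surrogate_accuracy(n_blocks, n_nodes, edges):
    """Hypothetical stand-in for validation accuracy; nothing is trained."""
    max_edges = n_nodes * (n_nodes - 1) // 2
    density = len(edges) / max_edges if max_edges else 0.0
    # Assumed toy behavior: larger networks score higher, and the score
    # peaks when half of the possible edges are present.
    capacity = 1.0 - 1.0 / (1.0 + 0.5 * n_blocks * n_nodes)
    return capacity * (1.0 - abs(density - 0.5))


def lower_level(n_blocks, n_nodes, rng, trials=50):
    """Lower level: search block topologies to maximize the accuracy score."""
    best_edges, best_acc = None, -1.0
    for _ in range(trials):
        edges = random_topology(n_nodes, rng)
        acc = surrogate_accuracy(n_blocks, n_nodes, edges)
        if acc > best_acc:
            best_edges, best_acc = edges, acc
    return best_edges, best_acc


def complexity(n_blocks, n_nodes):
    """Upper-level objective: a simple size measure (assumed form)."""
    return n_blocks * n_nodes


def upper_level(block_range, node_range, min_accuracy, seed=0):
    """Upper level: pick the least complex (blocks, nodes) pair whose
    lower-level optimum reaches min_accuracy. Returns None if none does."""
    rng = random.Random(seed)
    best = None
    for n_blocks, n_nodes in itertools.product(block_range, node_range):
        # Each upper-level candidate is evaluated by solving its own
        # lower-level problem first; this nesting is what makes it bi-level.
        edges, acc = lower_level(n_blocks, n_nodes, rng)
        if acc < min_accuracy:
            continue
        cand = (complexity(n_blocks, n_nodes), -acc, n_blocks, n_nodes, edges)
        if best is None or cand[:2] < best[:2]:
            best = cand
    return best


if __name__ == "__main__":
    result = upper_level(range(1, 4), range(2, 5), min_accuracy=0.6)
    if result is not None:
        cost, neg_acc, n_blocks, n_nodes, edges = result
        print(f"blocks={n_blocks} nodes={n_nodes} cost={cost} acc={-neg_acc:.3f}")
        print("edges:", sorted(edges))
```

The point of the sketch is the evaluation order: an upper-level candidate has no fitness until its lower-level search has finished, which is why bi-level methods are considerably more expensive than single-level ones.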

Journal Title

Neurocomputing

Journal ISSN

0925-2312

Volume

439

First Page

Last Page

Digital Object Identifier (DOI)

10.1016/j.neucom.2021.01.094
