Segmentation is a critical step in medical image analysis. Fully Convolutional Networks (FCNs) have emerged as powerful segmentation models, achieving state-of-the-art results on various medical image datasets. Network architectures are usually designed manually for a specific segmentation task, so applying them to other medical datasets requires extensive experience and time. Moreover, segmentation requires handling large volumetric data, which leads to large and complex architectures. Recently, methods that automatically design neural networks for medical image segmentation have been presented; however, most approaches either do not fully consider volumetric information or do not optimize the size of the network. In this paper, we propose a novel self-adaptive 2D-3D ensemble of FCNs for medical image segmentation that incorporates volumetric information and optimizes both the model's performance and size. The model is an ensemble of a 2D FCN that extracts intra-slice information and a 3D FCN that exploits inter-slice information. The architectures of the 2D and 3D FCNs are automatically adapted to a medical image dataset using a multiobjective evolutionary algorithm that minimizes both the segmentation error and the number of parameters in the network. The proposed 2D-3D FCN ensemble was tested on the task of prostate segmentation using the image dataset from the PROMISE12 Grand Challenge. The resulting network ranks in the top 10 submissions, surpassing the performance of other automatically designed architectures while being considerably smaller in size.
Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation
Accurate segmentation is crucial to various medical tasks such as studying anatomical structures, measuring tissue volume, and assisting in treatment planning before radiation therapy [1]. Fully convolutional networks (FCNs) have been shown to provide state-of-the-art results for medical image segmentation. However, FCN architectures are normally designed manually for a specific medical segmentation task. Given the high complexity and depth of current architectures, manually adapting the architectures to a new dataset resembles a black-box optimization process that requires extensive experience, time, and computational resources.
Two main types of FCNs have been proposed for handling volumetric medical image data. The first type consists of 2D networks that segment images slice by slice and then concatenate the results to produce the 3D segmentation [2,3,4]. Although these methods capture rich in-plane information, they do not fully exploit the spatial correlation along the z-axis. The second type consists of 3D FCNs that replace 2D convolutions with 3D convolutions and process volumetric information directly [5,6,7]. However, 3D FCNs need a substantial number of parameters to capture representative features, and require high computational time and GPU memory. Recent work has focused on hybrid 2D-3D FCNs that combine the strengths of both approaches [8,9]. Nevertheless, these 2D-3D architectures remain considerably large, comparable in size and memory consumption to other 3D FCNs.
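The parameter gap between 2D and 3D convolutions can be illustrated with a back-of-envelope count (function names and the 64-to-128 channel layer below are illustrative, not taken from any particular architecture in the paper):

```python
def conv2d_params(c_in, c_out, k=3):
    """Weights of a 2D convolution: one k*k kernel per input-output channel pair (bias omitted)."""
    return k * k * c_in * c_out

def conv3d_params(c_in, c_out, k=3):
    """Weights of a 3D convolution: one k*k*k kernel per input-output channel pair (bias omitted)."""
    return k * k * k * c_in * c_out

# A single 64 -> 128 channel layer with 3x3(x3) kernels:
p2d = conv2d_params(64, 128)  # 73,728 weights
p3d = conv3d_params(64, 128)  # 221,184 weights, i.e. 3x more for k = 3
```

Every layer of a 3D FCN thus multiplies its weight count by the kernel depth, which is one reason 3D models grow so quickly in size and memory footprint.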
To address these challenges, there has been an increasing focus on methods that automatically design neural networks through optimization algorithms, a field known as neural architecture search (NAS). NAS can be considered a subfield of automated machine learning (AutoML) and has a significant overlap with hyperparameter optimization and meta-learning [10]. NAS algorithms have employed reinforcement learning [11,12,13], evolutionary algorithms [14,15,16], surrogate model-based optimization [17], and one-shot architecture search [18,19]. However, recent NAS methods have been proposed mainly for image classification and language modeling [10,20].
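To make the evolutionary branch of NAS concrete, the following is a minimal, generic sketch of an evolutionary hyperparameter search, not the specific algorithm used in this paper: the search space, mutation rule, and fitness function are all hypothetical placeholders.

```python
import random

# Hypothetical search space: an architecture is a dict of hyperparameters.
SPACE = {
    "num_filters": [16, 32, 64],
    "depth": [3, 4, 5],
    "kernel_size": [3, 5],
}

def random_architecture(rng):
    """Sample one architecture uniformly from the search space."""
    return {k: rng.choice(v) for k, v in SPACE.items()}

def mutate(arch, rng):
    """Re-sample one randomly chosen hyperparameter of a parent architecture."""
    child = dict(arch)
    key = rng.choice(list(SPACE))
    child[key] = rng.choice(SPACE[key])
    return child

def evolve(fitness, generations=20, pop_size=8, seed=0):
    """Elitist loop: keep the better half of the population, refill with mutants.

    `fitness` maps an architecture to a score where lower is better
    (e.g. validation segmentation error).
    """
    rng = random.Random(seed)
    pop = [random_architecture(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]
        pop = survivors + [mutate(rng.choice(survivors), rng) for _ in survivors]
    return min(pop, key=fitness)
```

Because the best individual always survives each generation, the final result is never worse than the best of the initial random population; real NAS systems replace the toy fitness here with an expensive training-and-validation run.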
For medical image segmentation, the use of NAS has been limited. Isensee et al. [21] presented a self-adapting framework that uses a rule-based approach to determine the pre-processing operations and training parameters of a pool of U-Net architectures (2D U-Net, 3D U-Net, and cascaded U-Net). In [22], Mortazi and Bagci proposed a policy gradient reinforcement learning method to find the hyperparameters of a 2D densely connected encoder-decoder baseline CNN. In [23], Weng et al. proposed three types of primitive operation sets to construct down-sampling and up-sampling cells for a 2D U-Net backbone network, where the cell configurations are updated using the DARTS [18] differentiable search strategy. In our previous work [24], we proposed an adaptive 2D U-Net inspired architecture called AdaResU-Net, which applies a multiobjective evolutionary algorithm to search for the hyperparameters of a semi-fixed architecture, yielding adaptive models that maximize the segmentation accuracy and minimize the model's size for 2D segmentation. In [25], Zhu et al. presented a differentiable NAS that selects between 2D, 3D, or Pseudo-3D convolutions for each layer of an FCN architecture. Similarly, Kim et al. [26] proposed a 3D U-Net template architecture that finds the configuration of the encoder, reduction, decoder, and expansion cells by applying a differentiable NAS algorithm. Previous approaches either design 2D networks that segment 3D images in a slice-wise manner, which ignores crucial volumetric information, or find 3D configurations that rely on predefined architecture templates of fixed depth and do not optimize the size of the network.
In this paper, we present a self-adaptive 2D-3D FCN ensemble for medical image segmentation that incorporates volumetric information and optimizes both the model's performance and its size. The network is composed of a 2D FCN that extracts in-plane information and a 3D FCN that exploits volumetric information. Both FCN architectures are automatically fitted to a specific medical image dataset using a multiobjective evolutionary algorithm that maximizes segmentation accuracy and minimizes the number of parameters in the network. In contrast to other methods for medical image segmentation, our model is self-adaptive: it searches for the optimal hyperparameters and architecture, fully utilizes volumetric information, and minimizes the size of the network. The proposed 2D-3D FCN ensemble was tested on the task of prostate segmentation on the PROMISE12 Grand Challenge [27]. Our model ranks within the top 10 submissions of the leaderboard, surpassing the performance of automatically-designed architectures while being considerably smaller in size. Therefore, we demonstrate that multiobjective architecture search can produce compact yet accurate models for volumetric medical image segmentation.
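The core of the multiobjective formulation, trading segmentation error against network size, can be sketched with a standard Pareto-dominance test. This is a minimal illustration of the selection principle, not the paper's exact algorithm, and the three example models below are made-up numbers:

```python
def dominates(a, b):
    """a dominates b if a is no worse in every objective and strictly better
    in at least one (both objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the nondominated subset of (segmentation_error, num_params) pairs."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]

# Hypothetical candidate architectures: (validation error, parameter count).
models = [
    (0.10, 5_000_000),  # accurate but large
    (0.12, 1_000_000),  # small, slightly higher error
    (0.15, 4_000_000),  # dominated: worse than the second on both objectives
]
front = pareto_front(models)  # keeps the first two models only
```

A multiobjective evolutionary algorithm iterates this kind of nondominated selection over generations, so the search pressure favors architectures that are simultaneously accurate and compact rather than collapsing to a single weighted objective.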