Deep Radiomics for Brain Tumor Detection and Classification from Multi-Sequence MRI
Subhashis Banerjee1,2,*, Sushmita Mitra1, Francesco Masulli3, and Stefano Rovetta3
1 Indian Statistical Institute, Machine Intelligence Unit, Kolkata 700108, India
2 University of Calcutta, Department of Computer Science and Engineering, Kolkata 700106, India
3 University of Genova, Department of Informatics, Bioengineering, Robotics and Systems Engineering, Genoa 16146, Italy
* mail.sb88@gmail.com

ABSTRACT
Glioma constitutes 80% of malignant primary brain tumors in adults, and is usually classified as High Grade Glioma (HGG) and Low Grade Glioma (LGG). LGG tumors are less aggressive, with a slower growth rate as compared to HGG, and are responsive to therapy. Tumor biopsy being challenging for brain tumor patients, noninvasive imaging techniques like Magnetic Resonance Imaging (MRI) have been extensively employed in diagnosing brain tumors. Therefore, development of automated systems for the detection and prediction of the grade of tumors based on MRI data becomes necessary for assisting doctors in the framework of augmented intelligence. In this paper, we thoroughly investigate the power of deep Convolutional Neural Networks (ConvNets) for classification of brain tumors using multi-sequence MR images. We propose novel ConvNet models, which are trained from scratch on MRI patches, slices, and multi-planar volumetric slices. The suitability of transfer learning for the task is next studied by applying two existing ConvNet models (VGGNet and ResNet), trained on the ImageNet dataset, through fine-tuning of the last few layers. Leave-one-patient-out (LOPO) testing, and testing on a holdout dataset, are used to evaluate the performance of the ConvNets. Results demonstrate that the proposed ConvNets achieve better accuracy in all cases where the model is trained on the multi-planar volumetric dataset. Unlike conventional models, it obtains a testing accuracy of 95% for the low/high grade glioma classification problem. A score of 97% is generated for classification of LGG with/without 1p/19q codeletion, without any additional effort towards extraction and selection of features. We study the properties of self-learned kernels/filters in different layers, through visualization of the intermediate layer outputs. We also compare the results with those of state-of-the-art methods, demonstrating a maximum improvement of 7% on the grading performance of ConvNets and 9% on the prediction of 1p/19q codeletion status.

Introduction
Magnetic Resonance Imaging (MRI) has become the standard noninvasive technique for brain tumor diagnosis over the last few decades, due to its improved soft tissue contrast1,2. Gliomas constitute 80% of all malignant brain tumors originating from the glial cells in the central nervous system. Based on the aggressiveness and infiltrative nature of gliomas, the World Health Organization (WHO) broadly classifies them into two categories, viz. low-grade gliomas (LGG), consisting of low-grade and intermediate-grade gliomas (WHO grades II and III), and high-grade gliomas (HGG) or glioblastoma (WHO grade IV)3. Diffuse LGG are infiltrative brain neoplasms comprising the histological classes astrocytoma, oligodendroglioma, and oligoastrocytoma (WHO grade II and III neoplasms)3.
Although LGG patients have better survival than those with HGG, LGGs are found to typically progress to secondary GBMs and eventual death4. In both cases correct treatment planning (including surgery, radiotherapy, and chemotherapy, separately or in combination) becomes necessary, considering that an early and proper detection of the tumor grade can result in good prognosis5. Histological grading, based on a stereotactic/surgical biopsy test, is primarily used for the management of gliomas. Typically the highest grade component among the histopathology samples obtained is used to predict the overall tumor grade. Gliomas being heterogeneous, histopathology samples collected from different parts of the same tumor sometimes exhibit different grades. Since pathologists are not provided with the entire delineated tumor during examination, it is likely that the highest grade component may be missing in the biopsy sample. This is called the biopsy sampling error6-8, and can potentially result in wrong clinical management of the disease. Moreover there exist several risk factors in the biopsy test, including bleeding from the tumor and brain due to the biopsy needle, causing severe migraine, stroke, coma and even death; other associated risks involve infection or seizures9,10. MR imaging, on the other hand, has the advantage of being able to scan the entire tumor in vivo and can demonstrate a strong correlation with histological grade. It is also not susceptible to sampling error, or to inter- and intra-observer variability. In this context multi-sequence MRI plays a major role in the detection, diagnosis, and management of brain cancers in a noninvasive manner.

Recent literature reports that computerized detection and diagnosis of the disease, based on medical image analysis, could be a good alternative. Decoding of tumor phenotype using noninvasive imaging is a recent field of research, known as Radiomics11-13, and involves the extraction of a large number of quantitative imaging features that may not be apparent to the human eye. An integral part of the procedure involves manual or automated delineation of the 2D region of interest (ROI) or 3D volume of interest (VOI)14-17, to focus attention on the malignant growth. This is typically followed by the extraction of suitable sets of hand-crafted quantitative imaging features from the ROI or VOI, to be subsequently analyzed through machine learning towards decision-making. Feature selection enables the elimination of redundant and/or less important subset(s) of features, for improvement in speed and accuracy of performance. This is particularly relevant for high-dimensional radiomic features extracted from medical images.

Quantitative imaging features, extracted from MR images, have been investigated in the literature for the assessment of brain tumors13,18. Ref. 19 presents an adaptive neuro-fuzzy classifier based on linguistic hedges (ANFC-LH) for predicting the brain tumor grade using 56 3D quantitative MRI features extracted from the corresponding segmented tumor volume(s). Quantitative imaging features, extracted from pre-operative gadolinium-enhanced T1-weighted MRI, were investigated for the diagnosis of meningioma grades20. A study of MR imaging features was made21 to determine those which can differentiate among grades of soft-tissue sarcoma.
The features investigated include signal intensity, heterogeneity, margin, descriptive statistics, and perilesional characteristics on images obtained from each MR sequence. Brain tumor classification and grading based on 2D quantitative imaging features like texture and shape (involving gray-level co-occurrence, run-length, and morphology) were also reported22. Although these techniques demonstrate good disease classification, their dependence on hand-crafted features requires extensive domain knowledge, involves human bias, and is problem-specific. Manual designing of features typically requires greater insight into the exact characteristics of normal and abnormal tissues, and may fail to accurately capture some important representative features, thereby hampering classifier performance. The generalization capability of such classifiers may also suffer due to the discriminative nature of the methods, with the hand-crafted features being usually designed over fixed training sets. Subsequently, manual or semi-automatic localization and segmentation of the ROI or VOI is also needed to extract the quantitative imaging features14,15.

Convolutional Neural Networks (ConvNets) offer a state-of-the-art framework for image recognition or classification23-25. The ConvNet architecture is designed to loosely mimic the fundamental working of the mammalian visual cortex, which has been shown to have multiple layers of abstraction that look for specific patterns in the input vision. A ConvNet is built upon a similar idea of stacking multiple layers to allow it to learn multiple different abstractions of the input data. These networks automatically learn mid-level and high-level representations or abstractions from the input training data, in the form of convolution filters that are updated during the training process. They work directly on raw input (image) data, and learn the underlying representative features of the input which are hierarchically complex, thereby ruling out the need for specialized hand-crafted image features. Moreover ConvNets require no prior domain knowledge and can automatically learn to perform any task just by working through the training data. However, training a ConvNet from scratch is generally difficult because it essentially requires large training data, along with significant expertise to select an appropriate model architecture for proper convergence. In medical applications data is typically scarce, and expert annotation is expensive. Training a deep ConvNet requires huge computational and memory resources, thereby making it extremely time-consuming. Repetitive adjustments in architecture and/or learning parameters, while avoiding overfitting, make deep learning from scratch a tedious, time-consuming, and exhaustive procedure. Transfer learning offers a promising alternative, in case of inadequate data, to fine-tune a ConvNet pre-trained on a large set of available labeled images from some other category26. This helps in speeding up convergence, while lowering computational complexity during training27,28. The adoption rate of ConvNets in medical imaging has been on the rise29. However, given the insufficiency of medical image data, it often becomes difficult to use deeper and more complex networks. Applications of ConvNets to gliomas have mostly been reported for the segmentation of abnormal regions from 2D or 3D MRI30-35.
Automated detection and extraction of High Grade Gliomas (HGG) was performed using ConvNets36. The two-stage approach first identified the presence of HGG, followed by a bounding-box based tumor localization in each "abnormal" MR slice. Classification and detection ConvNet architectures were employed as part of the Computer-Aided Detection (CADe) system. Experimental results demonstrated that the CADe system, when used as a preliminary step before segmentation, can allow improved delineation of the tumor region while reducing false positives arising in normal areas of the brain. Recently Yang et al.37 explored the role of deep learning and transfer learning for accurate grading of gliomas, using conventional and functional MRI. They used a private Chinese hospital database containing 113 pathologically confirmed glioma patients, of which there were 52 LGG and 61 HGG samples. AlexNet and GoogLeNet were trained from scratch, and fine-tuned from models that had been pre-trained on the large natural image database ImageNet. Testing on the 20% heldout data, randomly selected at patient level, resulted in a maximum test accuracy (90%) by GoogLeNet. Radiomics has also been employed38 for grading of gliomas into LGG and HGG, with the MICCAI BraTS 2017 dataset39 being used for training and testing of models.

In addition to tumor grading, the prediction of 1p/19q codeletion status serves as a crucial molecular biomarker towards prognosis in LGG. It is found to be related to longer survival, particularly for oligodendrogliomas which are more sensitive to chemotherapy. Such noninvasive prediction through MRI can, therefore, lead to avoiding invasive biopsy or surgical procedures. Predicting 1p/19q status in LGG from MR images using a ConvNet was reported40. The network was first trained on a brain tumor patient database from the Mayo Clinic, containing a total of 159 LGG cases (57 non-deleted and 102 codeleted) and having preoperative postcontrast-T1 and T2 images41. The model was also trained and tested on 477 2D MRI slices extracted from the 159 patients, with 387 slices being used for training and 90 slices (45 non-deleted and 45 codeleted) during testing. A test accuracy of 87.7% was obtained. Studies by the TCGA have established that LGGs can be grouped into three robust molecular classes on the basis of IDH1/2 mutations and 1p/19q co-deletion. The variants have been reported to differ with respect to tumor margins and internal homogeneity. The T2-FLAIR mismatch sign was found to be associated with a survival profile similar to that of the IDH-mutant 1p/19q-non-codeleted glioma subtype42, and more favorable than that of the IDH-wild type gliomas (which present outcome similar to WHO grade IV glioblastomas).

In this paper we exhaustively investigate the behaviour and performance of ConvNets, with and without transfer learning, for noninvasive studies of gliomas, involving (A) detection and grade prediction into low- (Grades II and III) and high- (Grade IV) grade brain tumors (LGG and HGG), and (B) classification of LGG with/without 1p/19q codeletion, from multi-sequence MRI. Tumors are typically heterogeneous, depending on cancer subtypes, and contain a mixture of structural and patch-level variability.
Prediction of the grade of a tumor may thus be based on either the image patch containing the tumor, the 2D MRI slice containing the image of the whole brain including the tumor, or the 3D MRI volume encompassing the full image of the head enclosing the tumor. While in the first case only the tumor patch is necessary as input, the other two cases require the ConvNet to learn to localize the ROI (or VOI) followed by its classification. Therefore, the first case needs only classification while the other two cases additionally require detection or localization. Since the performance and complexity of ConvNets depend on the difficulty level of the problem and the type of input data representation, we introduce three kinds of data, viz. (i) patch-based, (ii) slice-based, and (iii) volume-based, from the original MRI dataset by introducing the sliding window concept, and employ these over the two experiments. Three ConvNet models are developed corresponding to each case, and trained from scratch. We also compare two state-of-the-art ConvNet architectures, viz. VGGNet43 and ResNet23, with parameters pre-trained on ImageNet, using transfer learning (via fine-tuning). The main contributions of this research are listed below.
• Adaptation of deep learning to Radiomics, for the noninvasive prediction of tumor grade followed by determination of the 1p/19q status in Low-Grade Gliomas, from multi-sequence MR images of the brain.
• Prediction of the grade of a brain tumor without manual segmentation of the tumor volume, or manual extraction and/or selection of features.
• Conceptualization of "augmented intelligence", with the application of deep learning for assisting doctors and radiologists towards decision-making while minimizing human bias and errors.
• Development of novel ConvNet architectures, viz. PatchNet, SliceNet, and VolumeNet, for tumor detection and grade prediction based on MRI patches, MRI slices, and multi-planar volumetric MR images, respectively.
• A new framework for applying existing pre-trained deep ConvNet models to multi-channel MRI data using transfer learning. The technique can be further extended to tasks of localization and/or segmentation on different MRI data.

Results
The ConvNet models were developed using TensorFlow, with Keras in Python. The experiments were performed on the Intel AI DevCloud platform, having a cluster of Intel Xeon Scalable processors. The quantitative and qualitative evaluation of the results are elaborated below.

Quantitative evaluation
We use (i) leave-one-patient-out (LOPO), and (ii) holdout (or independent) test datasets for model validation. In the LOPO scheme only one patient's sample is used for testing at each iteration, while the remaining are employed for training the ConvNets; the process is iterated over each patient. Although the LOPO test scheme is computationally expensive, it allows availability of more data as required for ConvNet training. LOPO testing is robust and well-suited to our application, with results being generated for each individual patient. Therefore, in cases of misclassification, a patient sample may be further investigated.
In the holdout (independent) testing scheme, either a portion of the training data that has never been used for training, or a separate test dataset, is used during model validation. Training and validation performance of the ConvNets were measured using the following two metrics:

Accuracy = (TP + TN) / (TP + FP + TN + FN),   F1 Score = 2 × (Precision × Recall) / (Precision + Recall).

Accuracy is the most intuitive performance measure and provides the ratio of correctly predicted observations to the total observations. F1 Score is the weighted average of Precision and Recall, which are defined as TP / (TP + FP) and TP / (TP + FN) respectively, with TP, TN, FP, and FN indicating the numbers of true positive, true negative, false positive and false negative detections. In the presence of imbalanced data one typically prefers the F1 Score over Accuracy, because the former considers both false positives and false negatives during computation.

Case study-A: Classification of low/high grade gliomas
The dataset preparation schemes, discussed in Section Dataset preparation, were used to create the three separate training and testing datasets. The ConvNet models PatchNet, SliceNet, and VolumeNet were trained on the corresponding datasets using the Stochastic Gradient Descent (SGD) optimization algorithm with learning rate = 0.001 and momentum = 0.9, using mini-batches of 32 samples generated from the corresponding training dataset. A small part of the training set (20%) was used for validating the ConvNet model after each training epoch, for parameter selection and detection of overfitting. Since deep ConvNets entail a large number of free trainable parameters, the effective number of training samples was artificially enhanced using real-time data augmentation, through linear transformations such as random rotation (0°-10°), horizontal and vertical shifts, and horizontal and vertical flips. We also used a Dropout layer with a dropout rate of 0.5 in the fully connected layers, and Batch-Normalization, to control overfitting. After each epoch, the model was validated on the corresponding validation dataset.

Training and validation accuracy and loss, and the F1 Score on the validation dataset, are presented in Fig. 1 for the three proposed ConvNets (PatchNet, SliceNet, and VolumeNet) trained from scratch, along with those for the two pre-trained ConvNets (VGGNet and ResNet) fine-tuned on the TCGA-GBM and TCGA-LGG datasets.

[Figure 1. Comparative performance of the networks: F1 Score on the validation set, and training and validation accuracy and loss, plotted against the number of epochs for PatchNet, SliceNet, VolumeNet, VGGNet, and ResNet.]

The plots demonstrate that VolumeNet gives the highest classification performance during training, reaching maximum accuracy on the training set (100%) and the validation set (98%) within just 20 epochs. Although the performance of PatchNet and SliceNet is quite similar on the validation set (PatchNet 90%, SliceNet 92%), SliceNet achieves better accuracy (94%) on the training set. The two pre-trained models (VGGNet and ResNet) exhibit similar results, with both achieving around 85% accuracy on the validation set. All the networks reached a plateau after the 50th epoch. This establishes the superiority of the 3D volumetric-level processing of VolumeNet.
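To make the training setup above concrete, the following is a minimal Keras sketch of how such a configuration could look (SGD with learning rate 0.001 and momentum 0.9, mini-batches of 32, and the stated real-time augmentation). The model constructor, layer sizes, and input shape are illustrative placeholders standing in for any of the proposed ConvNets; this is not the authors' released code.

```python
# Illustrative sketch (not the authors' code): training configuration described above.
import tensorflow as tf
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def build_convnet(input_shape=(200, 200, 4)):
    # Placeholder architecture standing in for PatchNet/SliceNet; layer sizes are illustrative.
    model = models.Sequential([
        layers.Conv2D(16, (3, 3), activation='relu', input_shape=input_shape),
        layers.BatchNormalization(),
        layers.MaxPooling2D((3, 3)),
        layers.Conv2D(32, (3, 3), activation='relu'),
        layers.MaxPooling2D((3, 3)),
        layers.Flatten(),
        layers.Dense(64, activation='relu'),
        layers.Dropout(0.5),                    # dropout rate 0.5, as stated in the paper
        layers.Dense(1, activation='sigmoid'),  # binary HGG vs. LGG output
    ])
    model.compile(optimizer=optimizers.SGD(learning_rate=0.001, momentum=0.9),
                  loss='binary_crossentropy', metrics=['accuracy'])
    return model

# Real-time augmentation: small rotations, shifts, and flips; 20% of training data kept for validation.
datagen = ImageDataGenerator(rotation_range=10, width_shift_range=0.1, height_shift_range=0.1,
                             horizontal_flip=True, vertical_flip=True, validation_split=0.2)

# Assuming x_train of shape (N, 200, 200, 4) and binary labels y_train are already prepared:
# model = build_convnet()
# model.fit(datagen.flow(x_train, y_train, batch_size=32, subset='training'),
#           validation_data=datagen.flow(x_train, y_train, batch_size=32, subset='validation'),
#           epochs=100)
```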
LOPO testing results: After training, the networks were evaluated employing majority voting. In the LOPO framework each patch or slice in the test set comes from a single test patient, and is categorized as HGG or LGG. The class receiving the maximum number of correctly classified slices or patches is indicative of the grade of the tumor. In case of equal votes the patient is marked as "ambiguous". The LOPO testing scores are displayed in Table 1.

Table 1. Comparative LOPO test performance.
ConvNet     Classified  Misclassified  Ambiguous  Accuracy
PatchNet    242         39             4          84.91%
SliceNet    257         26             2          90.18%
VolumeNet   277         8              0          97.19%
VGGNet      239         40             6          83.86%
ResNet      242         42             1          84.91%

VolumeNet is observed to achieve the best LOPO test accuracy (97.19%), with zero "ambiguous" cases as compared to the other four networks. SliceNet is also found to provide good LOPO test accuracy (90.18%). Both the pre-trained models show LOPO test accuracy similar to PatchNet. This is interesting because it demonstrates that with a little fine-tuning one can achieve a test accuracy similar to that of a patch-level ConvNet trained from scratch on a specific dataset. Therefore fine-tuning of a few more intermediate layers can lead to very high test scores with little training.
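As a concrete illustration of the per-patient majority vote used above, the sketch below aggregates patch- or slice-level predictions into a patient-level decision, marking ties as "ambiguous". The function name and probability threshold are illustrative assumptions, not part of the published pipeline.

```python
# Illustrative sketch: patient-level majority voting over patch/slice predictions.
from typing import List

def patient_level_vote(slice_probs: List[float], threshold: float = 0.5) -> str:
    """Aggregate per-slice (or per-patch) HGG probabilities for one patient.

    Each probability above `threshold` counts as one HGG vote, otherwise LGG.
    Returns 'HGG', 'LGG', or 'ambiguous' when the votes are tied.
    """
    hgg_votes = sum(p > threshold for p in slice_probs)
    lgg_votes = len(slice_probs) - hgg_votes
    if hgg_votes > lgg_votes:
        return "HGG"
    if lgg_votes > hgg_votes:
        return "LGG"
    return "ambiguous"

# Example: four slices of a test patient, as predicted by the ConvNet.
print(patient_level_vote([0.91, 0.75, 0.40, 0.88]))  # -> 'HGG'
```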
Table 2 compares the proposed ConvNets, in terms of classification accuracy, with existing shallow learning models in the literature that were used for the same application but require additional feature extraction and/or selection from manually segmented ROIs/VOIs. Ref. 19 reports the performance achieved by seven standard classifiers, viz. (i) Adaptive Neuro-Fuzzy Classifier (ANFC), (ii) Naive Bayes (NB), (iii) Logistic Regression (LR), (iv) Multilayer Perceptron (MLP), (v) Support Vector Machine (SVM), (vi) Classification and Regression Tree (CART), and (vii) k-nearest neighbors (k-NN), on the BraTS 2015 dataset (a subset of the TCGA-GBM and TCGA-LGG datasets) consisting of 200 HGG and 54 LGG patient cases, each having 56 three-dimensional quantitative MRI features manually extracted. The ConvNets, on the other hand, leverage the learning capability of deep networks for automatically extracting relevant features from the data.

Table 2. Comparative accuracy of deep and shallow classifiers.
PatchNet (84.91%): trained and tested on MRI patches of size 32 × 32, having 3 (3 × 3) convolution layers (8, 16, 32 filters) and a single FC layer (16 neurons), with 50 epochs.
SliceNet (90.18%): trained and tested on MRI slices of size 200 × 200, having 4 (3 × 3) convolution layers (16, 32, 64, 128 filters) and a single FC layer (64 neurons), with 50 epochs.
VolumeNet (97.19%): trained and tested on multi-planar MRI slices of size 200 × 200, having three parallel ConvNets each with 3 (3 × 3) convolution layers (8, 16, 32 filters) and a single FC layer (32 neurons), with 10 epochs.
VGGNet (83.86%): trained on the ImageNet dataset, fine-tuned and tested on MRI slices of size 200 × 200.
ResNet (84.91%): trained on the ImageNet dataset, fine-tuned and tested on MRI slices of size 200 × 200.
ANFC-LH (85.83%): trained on 23 manually extracted quantitative MRI features, based on 10 fuzzy rules.
NB (69.48%): trained on 23 manually extracted quantitative MRI features.
LR (72.07%): trained on 23 manually extracted quantitative MRI features, based on a multinomial logistic regression model with a ridge estimator.
MLP (78.57%): trained on 23 manually extracted quantitative MRI features, using a single hidden layer with 23 neurons, learning rate = 0.1, momentum = 0.8.
SVM (64.94%): trained on 23 manually extracted quantitative MRI features, LibSVM with RBF kernel, cost = 1, gamma = 0.
CART (70.78%): trained on 23 manually extracted quantitative MRI features, using minimal cost-complexity pruning.
k-NN (73.81%): trained on 23 manually extracted quantitative MRI features, accuracy averaged over scores for k = 3, 5, 7.

Testing results on holdout dataset: The trained networks were also tested on an independent test dataset (MICCAI BraTS 2017 database), as discussed in Section Brain tumor data. The confusion matrices of the networks are shown in Fig. 2. VolumeNet performs the best on the holdout set (accuracy = 95.00%). The other two models trained from scratch, PatchNet and SliceNet, also demonstrate good classification performance. On the other hand, the fine-tuned models VGGNet and ResNet perform poorly on the independent test dataset. Comparison is made with two recently reported methods37,38, used for the same problem on the same or different datasets; the comparison results are given in Table 3. VolumeNet performed the best on the holdout dataset, achieving a 7% improvement in accuracy over the state-of-the-art method38 using the same dataset. Note that the compared models37,38 were evaluated using cross-validation, with training done on a part of the same data on whose holdout portion testing was evaluated. We, on the other hand, trained our models on a different dataset (from TCIA) and tested them on the BraTS dataset. This validates the robustness of our models as compared to existing methods.

Table 3. Comparative test performance with state-of-the-art methods on the holdout dataset.
Proposed models trained from scratch (deep learning based, BraTS 2017): PatchNet 82%, SliceNet 86%, VolumeNet 95%.
Fine-tuned models (deep learning based, BraTS 2017): VGGNet 68%, ResNet 72%.
Yang et al.37 (deep learning based, private dataset): AlexNet 85%, GoogLeNet 90%.
Cho et al.38 (radiomics based, BraTS 2017): 88%.

[Figure 2. Confusion matrices for the classification performance of the five models (PatchNet, SliceNet, VolumeNet, VGGNet, ResNet) on the MICCAI BraTS 2017 dataset.]
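The holdout evaluation above is reported through confusion matrices (Fig. 2), and Case study B below additionally reports sensitivity and specificity (Table 4). The small helper below shows how these metrics derive from the four confusion-matrix counts; the numbers in the usage example are illustrative placeholders, not values from the paper.

```python
# Illustrative sketch: deriving the reported metrics from a 2x2 confusion matrix.
def confusion_metrics(tp: int, fn: int, fp: int, tn: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# Example with placeholder counts (positive class = HGG); not the paper's exact numbers.
sens, spec, acc = confusion_metrics(tp=90, fn=10, fp=5, tn=95)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, accuracy={acc:.3f}")
```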
Case study-B: Classification of LGG with/without 1p/19q codeletion
We trained the best performing model, i.e. VolumeNet, for the classification of LGG with/without 1p/19q codeletion. The Mayo Clinic database used for the task contains T1C and T2 MRI sequences for each patient. VolumeNet was trained on the multi-planar volumetric database, preprocessed from the raw 3D brain volume images as described in Section Dataset preparation. Training performance of the network, in terms of training and validation accuracy and loss, is presented in Fig. 3.

[Figure 3. Training performance of VolumeNet for the classification of LGG with/without 1p/19q codeletion: training and validation accuracy and loss versus epochs.]

Comparison was made with a state-of-the-art method40, which is also based on deep learning and uses the same dataset. The test performance on the holdout dataset is reported in Table 4. Here again our model, VolumeNet, achieves 9% higher accuracy than the compared method over the same dataset. The improvement is due to the incorporation of volumetric information through multi-planar MRI slices.

Table 4. Comparative test performance of VolumeNet on the holdout set.
Metric        VolumeNet   Akkus et al.40
Sensitivity   94%         93%
Specificity   100%        82%
Accuracy      97%         88%

Runtime analysis
The total time required for training each network for 100 epochs, averaged over several runs, is presented in Table 5. This further corroborates that multi-planar and slice-level processing can learn to generalize better than a patch-level network, although at the expense of higher computational time.

Table 5. Comparative training time.
PatchNet: 10.75 ± 0.05 min (trained from scratch)
SliceNet: 65.95 ± 0.02 min (trained from scratch)
VolumeNet: 132.48 ± 0.05 min (trained from scratch)
VGGNet: 8.56 ± 0.03 min (fine-tuning)
ResNet: 12.14 ± 0.03 min (fine-tuning)

Qualitative evaluation
The ConvNets were next investigated through visual analysis of their intermediate layers. The performance of a ConvNet depends on the convolution kernels, which are the feature extractors learned automatically during the training process. Visualizing the output of any convolution layer can help characterize the learned kernels. Fig. 4 illustrates the intermediate convolution layer outputs (after ReLU activation) of the proposed SliceNet architecture on sample MRI slices from an HGG patient.

[Figure 4. (a) Four sequences of an MRI slice from a sample HGG patient from TCIA48 (under a Creative Commons Attribution 3.0 Unported license; full terms at https://creativecommons.org/licenses/by/3.0/). Intermediate layer outputs/feature maps generated by SliceNet at different levels by (b) Conv1, (c) Conv2, (d) Conv3 and (e) Conv4.]

The visualization of the first convolution layer activations (or feature maps) [Fig. 4(b)] indicates that the ConvNet has learned a variety of filters to detect edges and to distinguish between different brain tissues like white matter (WM), gray matter (GM), cerebrospinal fluid (CSF), skull and background. Most importantly, some of the filters could isolate the ROI (the tumor), on the basis of which the whole MRI slice may be classified. Most of the feature maps generated by the second convolution layer [Fig. 4(c)] mainly highlight the tumor region and its subregions, like the enhancing tumor structures, the surrounding cystic/necrotic components, and the edema region of the tumor. Thus the filters in the second convolution layer learn to extract deeper features from the tumor by focusing on the ROI. The texture and shape of the tumor get enhanced in the feature maps generated by the third convolution layer [Fig. 4(d)]; for example, small, distributed, irregular tumor cells get enhanced (one of the most important tumor grading criteria, called "CE-Heterogeneity"44). Finally the last layer [Fig. 4(e)] extracts detailed information about more discriminating features, combining these to produce a clear distinction between images of different types of tumors.
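Feature-map visualizations like those in Fig. 4 can be reproduced in Keras by building a truncated model that outputs a chosen convolution layer's activations. The sketch below is illustrative and assumes a trained Keras model and a preprocessed input slice; the layer name is a placeholder.

```python
# Illustrative sketch: extracting intermediate feature maps from a trained Keras model.
import numpy as np
import tensorflow as tf

def get_feature_maps(model: tf.keras.Model, layer_name: str, x: np.ndarray) -> np.ndarray:
    """Return the activations of `layer_name` for a batch of inputs `x`."""
    probe = tf.keras.Model(inputs=model.input,
                           outputs=model.get_layer(layer_name).output)
    return probe.predict(x)

# Usage (assuming `slicenet` is a trained model and `mri_slice` has shape (1, 200, 200, 4)):
# maps = get_feature_maps(slicenet, layer_name="conv2", x=mri_slice)
# maps[0, :, :, k] is the k-th feature map, which can be displayed with matplotlib's imshow.
```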
Discussion
An exhaustive study was made to demonstrate the effectiveness of Convolutional Neural Networks for noninvasive, automated detection and grading of brain tumors from multi-sequence MR images. Three novel ConvNet architectures were developed for distinguishing between HGG and LGG, designed to handle images at the patch, slice and multi-planar levels. This was followed by exploring transfer learning for the same task, by fine-tuning two existing ConvNet models. The scheme incorporating volumetric tumor information through multi-planar MRI slices achieved the best test accuracy of 97.19% in LOPO mode and 95% on the holdout dataset for the classification of LGG/HGG (Case study A). In Case study B an accuracy of 97% was obtained for the classification of LGG with/without 1p/19q codeletion status on the holdout dataset. Visualization of the intermediate layer outputs/feature maps demonstrated the role of the kernels/filters in the convolution layers in automatically learning to detect tumor features that closely resemble different tumor grading criteria. It was also observed that existing ConvNets, trained on natural images, performed adequately just by fine-tuning their final convolution layer on the MRI dataset. This investigation allows us to conclude that deep ConvNets could be a feasible alternative to surgical biopsy for brain tumors.

Diagnosis from histopathological images is also considered to be the "gold standard" in this domain. HGGs are characterized by the presence of pseudopalisading necrosis (a necrotizing, cell-devoid region radially surrounded by lined-up tumor cells) and microvascular proliferation (enlarged new blood vessels in the tissue)45. The LGGs, on the other hand, exhibit a visual smoothness with cells spread evenly throughout the tissue. Automated glioma grading, through image analysis of the slides, serves to complement the efforts of clinicians in categorizing into low- (Grade II) and high- (Grade III, IV) gliomas (LGG and HGG).

Methods
In this section we provide a brief description of the data preparation at three levels of resolution, followed by an introduction to convolutional neural networks and transfer learning.

Brain tumor data
For the classification of low/high grade gliomas (Case study A), the models were trained on the TCGA-GBM46 and TCGA-LGG47 datasets downloaded from The Cancer Imaging Archive (TCIA)48. The testing was on an independent set, i.e. the brain tumor dataset from the MICCAI BraTS 201739,49 competition, containing images of low grade glioma (LGG) and high grade glioma (HGG). The TCGA-GBM and TCGA-LGG datasets consist of 262 and 199 samples respectively, whereas the BraTS 2017 database contains 210 HGG and 75 LGG samples. Each patient scan has four sequences, encompassing the native (T1), post-contrast enhanced T1-weighted (T1C), T2-weighted (T2), and T2 Fluid-Attenuated Inversion Recovery (FLAIR) volumes.

The ConvNet model used for classification of 1p/19q codeletion status in LGG (Case study B) was trained on the brain tumor patient database from the Mayo Clinic, containing a total of 159 LGG patients (57 non-deleted and 102 codeleted) with preoperative postcontrast-T1 and T2 images, downloaded from TCIA41. A total of 30 samples (15 non-deleted and 10 codeleted) were randomly selected from the data at the beginning as a test set, and were never shown to the ConvNet during its training. The remaining 129 samples were used for training the model. Sample images of the two glioma grades (LGG/HGG), and of LGG with and without 1p/19q codeletion, are shown in Fig. 5(a) and (b), respectively. It can be observed from the figure that it is very hard to discriminate in each case based only on the phenotypes visible to the human eye.
Hence abstract features learned by the deep layers of a ConvNet are expected to be helpful in noninvasively differentiating between them. Besides, the use of large public-domain datasets can allow more clinical impact as compared to controlled and dedicated prospective image acquisitions.

[Figure 5. Sample MR image sequences from TCIA48 (under a Creative Commons Attribution 3.0 Unported license; full terms at https://creativecommons.org/licenses/by/3.0/) of (a) low/high grade gliomas (T1, T1C, T2, FLAIR), and (b) low-grade glioma with and without 1p/19q codeletion (T1C, T2).]

[Figure 6. Ten T2-MR patches extracted from contiguous slices of an LGG patient from TCIA48 (under a Creative Commons Attribution 3.0 Unported license; full terms at https://creativecommons.org/licenses/by/3.0/).]

The datasets were aligned to the same anatomical template, skull-stripped, bias-field corrected, and interpolated to 1 mm³ voxel resolution.

Dataset preparation
Although the TCGA-GBM and TCGA-LGG datasets consist of MRI volumes, we cannot propose a 3D ConvNet model for the classification problem, mainly because the dataset has only 262 HGG and 199 LGG patient volumes, which is considered inadequate to train a 3D ConvNet with a huge number of trainable parameters. Another problem with the dataset is its imbalanced class distribution, i.e. about 35.72% of the data comes from the LGG class. Therefore we formulate 2D ConvNet models based on MRI patches (encompassing the tumor region) and slices, followed by a multi-planar slice-based ConvNet model that incorporates the volumetric information as well.

Applying a ConvNet directly on the MRI slice could require extensive downsampling, thereby resulting in loss of discriminative details. The tumor can lie anywhere in the image and can be of any size (scale) or shape. Therefore classifying the tumor grade from patches is easier, because the ConvNet learns to localize only within the extent of the tumor in the image; it needs to learn only the relevant details without getting distracted by irrelevant ones. However it may lack spatial and neighborhood details of the tumor, which may adversely influence grade prediction. Although classification based on the 2D slices and patches often achieves good accuracy, the incorporation of volumetric information from the dataset can enable the ConvNet to perform better. Along these lines, we propose schemes to prepare three different datasets, viz. (i) patch-based, (ii) slice-based, and (iii) multi-planar volumetric, from the TCIA datasets.

Patch-based dataset
The slice with the largest tumor region is first identified. Keeping this slice in the middle, a set of slices before and after it is considered for extracting 2D patches containing the tumor regions using a bounding box. This bounding box is marked, corresponding to each slice, based on the ground-truth image, and the enclosed image region is then extracted. We use a set of 20 slices for extracting the patches. In the case of MRI volumes from HGG (LGG) patients, four (ten) 2D patches [with a skip over 5 (2) slices] are extracted for each of the MR sequences. Therefore a total of 210 × 4 = 840 HGG and 75 × 10 = 750 LGG patches, with four channels each, constitute this dataset. Although the classes are still not perfectly balanced, this ratio is found to be good enough in the scenario of this enhanced dataset.
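As a rough illustration of this patch-extraction step, the sketch below crops the tumor bounding box from one slice using its ground-truth segmentation mask. Function and variable names are illustrative; the optional jitter argument anticipates the static augmentation described in the next paragraph.

```python
# Illustrative sketch: crop a tumor patch from one MRI slice using its ground-truth mask.
import numpy as np

def extract_tumor_patch(slice_img: np.ndarray, gt_mask: np.ndarray,
                        jitter: int = 0, rng=np.random) -> np.ndarray:
    """slice_img: (H, W, channels) multi-sequence slice; gt_mask: (H, W) binary tumor mask."""
    rows, cols = np.where(gt_mask > 0)
    r0, r1 = rows.min(), rows.max()
    c0, c1 = cols.min(), cols.max()
    if jitter:  # randomly perturb the box corners by up to +/- jitter pixels
        r0, r1, c0, c1 = (v + rng.randint(-jitter, jitter + 1) for v in (r0, r1, c0, c1))
    h, w = slice_img.shape[:2]
    r0, c0 = max(r0, 0), max(c0, 0)
    r1, c1 = min(r1, h - 1), min(c1, w - 1)
    return slice_img[r0:r1 + 1, c0:c1 + 1, :]

# Example with synthetic data: a 240x240 slice with 4 sequences and a square "tumor".
img = np.zeros((240, 240, 4))
mask = np.zeros((240, 240))
mask[100:140, 90:150] = 1
patch = extract_tumor_patch(img, mask, jitter=5)
print(patch.shape)  # roughly (40, 60, 4), depending on the random jitter
```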
In spite of significant dissimilarity visible between contiguous MRI slices at a global level, there may be little difference exhibited at the patch level. Therefore patches extracted from contiguous MRI slices look similar, particularly for LGG cases. Fig. 6 depicts a set of 10 patches extracted from contiguous MR slices of an LGG patient. This can lead to overfitting in the ConvNet. To overcome this problem we introduce a concept of static augmentation, by randomly changing the perfect bounding-box coordinates by a small amount (within {-5, 5} pixels) before extracting the patch. This results in improved learning and convergence of the network.

Slice-based dataset
Complete 2D slices, with visible tumor region, are extracted from the MRI volume. The slice with the largest tumor region, along with a set of 20 slices before and after it, is extracted from the MRI volume in a sequence similar to that of the patch-based approach. While for HGG patients 4 slices (with a skip over 5) are extracted, in the case of LGG patients 10 slices (with a skip of 2) are used.

Multi-planar volumetric dataset
Here 2D MRI slices are extracted along all three anatomical planes, viz. axial (X-Z axes), coronal (Y-X axes), and sagittal (Y-Z axes), in a manner similar to that described above.

Convolutional neural networks
Convolutional Neural Networks (ConvNets) can automatically learn low-level, mid-level and high-level abstractions from input training data, in the form of convolution filter weights that get updated during the training process by backpropagation. The inputs percolating through the network are the responses of convolving the images with various filters. These filters act as detectors of simple patterns like lines, edges, and corners, from spatially contiguous regions in an image. When arranged in many layers, the filters can automatically detect prevalent patterns while blocking irrelevant regions. Parameter sharing and sparsity of connections are the two main concepts that make ConvNets easier to train with a small number of weights, as compared to dense fully connected layers. This reduces the chance of overfitting, and enables learning of translation-invariant features. Some of the important concepts in the context of ConvNets are discussed next.

Layers
The fundamental constituents of a ConvNet are the input, convolution, activation, pooling and fully-connected layers. The input layer receives a multi-channel brain MRI patch/slice denoted by I ∈ R^(m × w × h), where m is the number of channels and w and h represent the resolution of the image. The convolutional layer takes the image or feature maps as input, and performs the convolution operation between the input and each of the filters to generate a set of activation maps. The output feature map dimensions from a convolution layer are calculated as w_out = (w_in − F + 2P) / S + 1 and h_out = (h_in − F + 2P) / S + 1, where w_in and h_in are the width and height of the input image, and w_out and h_out are the width and height of the output. Here P denotes the input padding, S the stride (S = 1 in our networks), and F the kernel size of the neurons in a particular layer.
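As a quick check of this formula, the helper below computes the output width of a single convolution or pooling stage. The example values (a 200 × 200 slice, 3 × 3 kernel, no padding, stride 1) match the "valid" convolutions used in the proposed networks; the pooling stride in the second example is illustrative.

```python
# Illustrative helper: output size of a convolution/pooling stage, w_out = (w_in - F + 2P)/S + 1.
def conv_output_size(w_in: int, kernel: int, padding: int = 0, stride: int = 1) -> int:
    return (w_in - kernel + 2 * padding) // stride + 1

print(conv_output_size(200, 3))        # 198: one "valid" 3x3 convolution on a 200x200 slice
print(conv_output_size(198, 3, 0, 3))  # 66: a subsequent 3x3 pooling with stride 3 (illustrative)
```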
Output responses of the convolution and fully-connected layers pass through a nonlinear activation function, such as the Rectified Linear Unit (ReLU)50, for transforming the data. ReLU, defined as f(a) = max(0, a), is a popular activation function for deep neural networks due to its computational efficiency and reduced likelihood of vanishing gradients. A pooling layer follows each convolution layer, typically to reduce computational complexity by downsampling the convolved response maps. Max pooling selects the maximum feature response in local neighborhoods, and thereby enhances translation invariance. The features learned through a series of convolutional and pooling layers are eventually fed to a fully-connected layer, typically a Multilayer Perceptron. Additional layers like Batch-Normalization51 reduce initial covariate shift, while Dropout52 is used as a regularizer to learn a better representation of the data.

Loss
The cost function for the ConvNets is chosen as the binary cross-entropy (for a two-class problem)

L_C = −(1/n) Σ_{i=1}^{n} [ y_i log(f_i) + (1 − y_i) log(1 − f_i) ],   (1)

where n is the number of samples, y_i is the true label of a sample and f_i is its predicted label.
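The following is a direct NumPy transcription of Eq. (1), useful for sanity-checking the loss reported by the framework. It is an illustrative snippet; the small epsilon added for numerical stability is an implementation detail not discussed in the paper.

```python
# Illustrative sketch: binary cross-entropy of Eq. (1) in NumPy.
import numpy as np

def binary_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-7) -> float:
    """y_true: 0/1 labels; y_pred: predicted probabilities in (0, 1)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0); stability detail, not from the paper
    return float(-np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)))

# Example: two HGG samples (label 1) and one LGG sample (label 0).
print(binary_cross_entropy(np.array([1, 1, 0]), np.array([0.9, 0.8, 0.2])))  # ~0.18
```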
Transfer learning
Typically the early layers of a ConvNet learn low-level image features, which are applicable to most vision tasks. The later layers, on the other hand, learn high-level features which are more application-specific. Therefore, shallow fine-tuning of the last few layers is usually sufficient for transfer learning. A common practice is to replace the last fully-connected layer of the pre-trained ConvNet with a new fully-connected layer having as many neurons as the number of classes in the new target application. The rest of the weights, in the remaining layers of the pre-trained network, are retained. This corresponds to training a linear classifier with the features generated in the preceding layer. However, when the distance between the source and target applications is significant, one may need to induce deeper fine-tuning. This is equivalent to training a shallow neural network with one or more hidden layers. An effective strategy27 is to initiate fine-tuning from the last layer, and then incrementally include deeper layers in the tuning process until the desired performance is achieved.

ConvNets for brain tumor grading
This section introduces the three ConvNet architectures, trained on the three-level brain tumor MR datasets, along with a brief description of the fine-tuning of existing models.

Three-level architectures
We propose three ConvNet architectures, named PatchNet, SliceNet, and VolumeNet, which are trained from scratch on the three datasets prepared as detailed in Section Dataset preparation. This is followed by transfer learning and fine-tuning of these networks. The ConvNet architectures are illustrated in Fig. 7. PatchNet is trained on the patch-based dataset, and provides the probability of a patch belonging to HGG or LGG. SliceNet is trained on the slice-based dataset, and predicts the probability of a slice being from HGG or LGG. Finally, VolumeNet is trained on the multi-planar volumetric dataset, and predicts the grade of a tumor from its 3D representation using the multi-planar 3D MRI data. As reported in the literature25, smaller convolutional filters produce better regularization due to the smaller number of trainable weights, thereby allowing construction of deeper networks without losing too much information in the layers. We use filters of size 3 × 3 for our ConvNet architectures. A greater number of filters, involving deeper convolution layers, allows more feature maps to be generated, and compensates for the decrease in size of each feature map caused by "valid" convolution and pooling layers. Due to the complexity of the problem and the bigger size of the input image, the SliceNet and VolumeNet architectures are deeper as compared to PatchNet.

[Figure 7. Three-level ConvNet architectures: (a) PatchNet, (b) SliceNet, and (c) VolumeNet. Sample MRIs are from the TCIA database48 (under a Creative Commons Attribution 3.0 Unported license; full terms at https://creativecommons.org/licenses/by/3.0/).]

Fine-tuning
Pre-trained VGGNet (16 layers) and ResNet (50 layers) architectures, trained on the ImageNet dataset, are employed for transfer learning. Even though ResNet is deeper than VGGNet, the model size of ResNet is substantially smaller due to the use of global average pooling rather than fully-connected layers. Transferring from the non-medical to the medical image domain is achieved through fine-tuning of the last convolutional block of each model, along with the fully-connected layer (top-level classifier). Fine-tuning of a trained network is achieved by retraining it on the new dataset, while involving very small weight updates. The adopted procedure is outlined below.
• Instantiate the convolutional base of the model and load its pre-trained weights.
• Replace the last fully-connected layer of the pre-trained ConvNet with a new fully-connected layer having a single neuron with sigmoid activation.
• Freeze the layers of the model up to the last convolutional block.
• Finally, retrain the last convolution block and the fully-connected layers using the Stochastic Gradient Descent (SGD) optimization algorithm with a very slow learning rate.
Since the base models were trained on RGB images, and accept a single input with three channels, we train and test them on the slice-based dataset involving three MR sequences (T1C, T2, FLAIR). The T1C sequence was found to perform better than T1, when used in conjunction with T2 and FLAIR. Although training either of these two models from scratch is very expensive, particularly on CPU, fine-tuning just the last few layers can be accomplished easily.
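In Keras, the fine-tuning recipe above could look roughly as follows, using the built-in VGG16 application as an example. The exact layers unfrozen, the input size, and the learning rate are illustrative assumptions rather than the authors' exact settings.

```python
# Illustrative sketch: fine-tuning a pre-trained VGG16 for binary glioma grading.
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.applications import VGG16

# 1. Instantiate the convolutional base with ImageNet weights (expects 3-channel input,
#    e.g. the T1C/T2/FLAIR slice stack described above).
base = VGG16(weights="imagenet", include_top=False, input_shape=(200, 200, 3))

# 2-3. Freeze everything up to the last convolutional block ("block5" in VGG16).
for layer in base.layers:
    layer.trainable = layer.name.startswith("block5")

# Replace the classifier head with a single sigmoid neuron.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])

# 4. Retrain the unfrozen block and the new head with SGD at a very slow learning rate.
model.compile(optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_slices, train_labels, batch_size=32, epochs=20, validation_split=0.2)
```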
References
1. DeAngelis, L. M. Brain tumors. New Engl. J. Medicine 344, 114–123 (2001).
2. Cha, S. Update on brain tumor imaging: From anatomy to physiology. Am. J. Neuroradiol. 27, 475–487 (2006).
3. Louis, D. N., Perry, A., Reifenberger, G. et al. The 2016 World Health Organization classification of tumors of the Central Nervous System: A summary. Acta Neuropathol. 131, 803–820 (2016).
4. Li, Y., Wang, D. et al. Distinct genomic aberrations between low-grade and high-grade gliomas of Chinese patients. PLOS ONE https://doi.org/10.1371/journal.pone.0057168 (2013).
5. Van den Bent, M. J., Brandes, A. A. et al. Adjuvant procarbazine, lomustine, and vincristine chemotherapy in newly diagnosed anaplastic oligodendroglioma: Long-term follow-up of EORTC brain tumor group study 26951. J. Clin. Oncol. 31, 344–350 (2012).
6. Chandrasoma, P. T., Smith, M. M. & Apuzzo, M. L. J. Stereotactic biopsy in the diagnosis of brain masses: Comparison of results of biopsy and resected surgical specimen. Neurosurgery 24, 160–165 (1989).
7. Glantz, M. J., Burger, P. C. et al. Influence of the type of surgery on the histologic diagnosis in patients with anaplastic gliomas. Neurology 41, 1741–1741 (1991).
8. Jackson, R. J., Fuller, G. N. et al. Limitations of stereotactic biopsy in the initial management of gliomas. Neuro-oncology 3, 193–200 (2001).
9. Field, M., Witham, T. F., Flickinger, J. C., Kondziolka, D. & Lunsford, L. D. Comprehensive assessment of hemorrhage risks and outcomes after stereotactic brain biopsy. J. Neurosurg. 94, 545–551 (2001).
10. McGirt, M. J., Woodworth, G. F. et al. Independent predictors of morbidity after image-guided stereotactic brain biopsy: A risk assessment of 270 cases. J. Neurosurg. 102, 897–901 (2005).
11. Mitra, S. & Uma Shankar, B. Medical image analysis for cancer management in natural computing framework. Inf. Sci. 306, 111–131 (2015).
12. Mitra, S. & Uma Shankar, B. Integrating radio imaging with gene expressions toward a personalized management of cancer. IEEE Transactions on Human-Machine Syst. 44, 664–677 (2014).
13. Gillies, R. J., Kinahan, P. E. & Hricak, H. Radiomics: Images are more than pictures, they are data. Radiology 278, 563–577 (2015).
14. Banerjee, S., Mitra, S. & Uma Shankar, B. Single seed delineation of brain tumor using multi-thresholding. Inf. Sci. 330, 88–103 (2016).
15. Banerjee, S., Mitra, S., Uma Shankar, B. & Hayashi, Y. A novel GBM saliency detection model using multi-channel MRI. PLOS ONE 11, e0146388 (2016).
16. Banerjee, S., Mitra, S. & Uma Shankar, B. Automated 3D segmentation of brain tumor using visual saliency. Inf. Sci. 424, 337–353 (2018).
17. Mitra, S., Banerjee, S. & Hayashi, Y. Volumetric brain tumour detection from MRI using visual saliency. PLOS ONE 12, 1–14 (2017).
18. Zhou, M., Scott, J. et al. Radiomics in brain tumor: Image assessment, quantitative feature descriptors, and machine-learning approaches. Am. J. Neuroradiol. 39, 208–216 (2017).
19. Banerjee, S., Mitra, S. & Uma Shankar, B. Synergetic neuro-fuzzy feature selection and classification of brain tumors. In Proceedings of the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), 1–6 (2017).
20. Coroller, T., Bi, W. et al. Early grade classification in meningioma patients combining radiomics and semantics data. Med. Phys. 43, 3348–3349 (2016).
21. Zhao, F., Ahlawat, S. et al. Can MR imaging be used to predict tumor grade in soft-tissue sarcoma? Radiology 272, 192–201 (2014).
22. Zacharaki, E. I., Wang, S. et al. Classification of brain tumor type and grade using MRI texture and shape in a machine learning scheme. Magn. Reson. Medicine 62, 1609–1618 (2009).
23. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778 (2016).
24. LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).
25. Szegedy, C., Liu, W. et al. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1–9 (2015).
26. Oquab, M., Bottou, L., Laptev, I. & Sivic, J. Learning and transferring mid-level image representations using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1717–1724 (2014).
27. Tajbakhsh, N., Shin, J. Y. et al. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Med. Imaging 35, 1299–1312 (2016).
28. Phan, H. T. H., Kumar, A., Kim, J. & Feng, D. Transfer learning of a convolutional neural network for HEp-2 cell image classification. In Proceedings of the IEEE 13th International Symposium on Biomedical Imaging (ISBI), 1208–1211 (2016).
29. Greenspan, H., van Ginneken, B. & Summers, R. M. Deep learning in medical imaging: Overview and future promise of an exciting new technique. IEEE Transactions on Med. Imaging 35, 1153–1159 (2016).
30. Pereira, S., Pinto, A., Alves, V. & Silva, C. A. Brain tumor segmentation using convolutional neural networks in MRI images. IEEE Transactions on Med. Imaging 35, 1240–1251 (2016).
31. Zikic, D., Ioannou, Y. et al. Segmentation of brain tumor tissues with convolutional neural networks. 36–39 (2014).
32. Urban, G., Bendszus, M., Hamprecht, F. A. & Kleesiek, J. Multi-modal brain tumor segmentation using deep convolutional neural networks. In Proc. of MICCAI-BRATS (Winning Contribution), 1–5 (2014).
33. Kamnitsas, K., Ledig, C. et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med. Image Analysis 36, 61–78 (2017).
34. Havaei, M., Davy, A. et al. Brain tumor segmentation with deep neural networks. Med. Image Analysis 35, 18–31 (2017).
35. Lyksborg, M., Puonti, O. et al. An ensemble of 2D convolutional neural networks for tumor segmentation. In Image Analysis, 201–211 (Springer, New York, 2015).
36. Banerjee, S., Mitra, S., Sharma, A. & Uma Shankar, B. A CADe system for gliomas in brain MRI using convolutional neural networks. arXiv preprint 1806.07589 (2018).
37. Yang, Y., Yan, L.-F. et al. Glioma grading on conventional MR images: A deep learning study with transfer learning. Front. Neurosci. 12, DOI: 10.3389/fnins.2018.00804 (2018).
38. Cho, H.-H., Lee, S.-H., Kim, J. & Park, H. Classification of the glioma grading using radiomics analysis. PeerJ 6, e5982 (2018).
39. Bakas, S., Akbari, H. et al. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Sci. Data 4, 170117 (2017).
40. Akkus, Z., Ali, I. et al. Predicting deletion of chromosomal arms 1p/19q in low-grade gliomas from MR images using machine intelligence. J. Digit. Imaging 30, 469–476 (2017).
41. Erickson, B. & Akkus, Z. Data from LGG-1p19q deletion, DOI: 10.7937/K9/TCIA.2017.dwehtz9v (2017). The Cancer Imaging Archive.
42. Patel, S. H., Poisson, L. M., Brat, D. J. et al. T2-FLAIR mismatch, an imaging biomarker for IDH and 1p/19q status in lower grade gliomas: A TCGA/TCIA project. Am. Assoc. for Cancer Res. DOI: 10.1158/1078-0432.CCR-17-0560 (2017).
43. Simonyan, K. & Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint 1409.1556 (2014).
44. Chekhun, V., Sherban, S. & Savtsova, Z. Tumor cell heterogeneity. Exp. Oncol. 154–162 (2013).
45. Mousavi, H. S., Monga, V., Rao, G. & Rao, A. U. K. Automated discrimination of lower and higher grade gliomas based on histopathological image analysis. J. Pathol. Inform. http://www.jpathinformatics.org/text.asp?2015/6/1/15/153914 (2015).
46. Scarpace, L., Mikkelsen, T. et al. Radiology data from The Cancer Genome Atlas Glioblastoma Multiforme [TCGA-GBM] collection. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2016.RNYFUYE9.
47. Pedano, N., Flanders, A., Scarpace, L. et al. Radiology data from The Cancer Genome Atlas Low Grade Glioma [TCGA-LGG] collection. The Cancer Imaging Archive (2016).
48. Clark, K., Vendt, B., Smith, K. et al. The Cancer Imaging Archive (TCIA): Maintaining and operating a public information repository. J. Digit. Imaging 26, 1045–1057 (2013).
49. Menze, B. H. et al. The multimodal Brain Tumor image Segmentation benchmark (BraTS). IEEE Transactions on Med. Imaging 34, 1993–2024 (2015).
50. Glorot, X., Bordes, A. & Bengio, Y. Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 315–323 (2011).
51. Ioffe, S. & Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proceedings of the International Conference on Machine Learning, 448–456 (2015).
52. Srivastava, N., Hinton, G. E., Krizhevsky, A., Sutskever, I. & Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014).

Acknowledgements
This research is supported by the IEEE Computational Intelligence Society Graduate Student Research Grant 2017. S. Banerjee acknowledges the support provided to him by the Intel Corporation, through the Intel AI Student Ambassador Program. S. Mitra acknowledges the support provided to her by the Indian National Academy of Engineering, through the INAE Chair Professorship. This publication is an outcome of the R&D work undertaken in a project under the Visvesvaraya PhD Scheme of the Ministry of Electronics & Information Technology, Government of India, being implemented by Digital India Corporation.

Author contributions statement
S.B. conceived the experiment(s), conducted the experiment(s), and analysed the results. S.B., S.M., F.M., and S.R. reviewed the manuscript.

Additional Information
Competing interests: The author(s) declare no competing interests.