A ResNet-based universal method for speckle reduction in optical coherence tomography images

In this work we propose a ResNet-based universal method for speckle reduction in optical coherence tomography (OCT) images. The proposed model contains 3 main modules: Convolution-BN-ReLU, Branch and Residual module. Unlike traditional algorithms, th…

Authors: Cai Ning, Shi Fei, Hu Dianlin

A RESNET-BASED UNIVERSAL METHOD FOR SPECKLE REDUCTION IN OPTICAL COHERENCE TOMOGRAPHY IMAGES

Ning Cai 1,3, Fei Shi 4, Dianlin Hu 1,3, Yang Chen 1,2,3

1 Laboratory of Image Science and Technology, School of Computer Science and Engineering, Southeast University, Nanjing, China
2 Centre de Recherche en Information Biomédicale Sino-Français (LIA CRIBs), Rennes, France
3 The Key Laboratory of Computer Network and Information Integration (Southeast University), Ministry of Education, China
4 School of Electronics and Information Engineering, Soochow University, China

ABSTRACT

In this work we propose a ResNet-based universal method for speckle reduction in optical coherence tomography (OCT) images. The proposed model contains 3 main modules: Convolution-BN-ReLU, Branch and Residual module. Unlike traditional algorithms, the model learns from training data instead of relying on manually selected parameters such as the noise level. Application of the proposed method to OCT images shows a more than 22 dB signal-to-noise ratio improvement in speckle noise reduction with minimal structure blurring. The proposed method provides strong generalization ability and can process other types of noisy OCT images without retraining. It outperforms other filtering methods in suppressing speckle noise and revealing subtle features.

Index Terms: Optical coherence tomography, deep learning, speckle, residual net

1. INTRODUCTION

Optical coherence tomography (OCT) generates cross-sectional images of biological tissue at micron resolution [1]. Speckle noise in OCT reduces image contrast and impairs clinical diagnosis. Finding an efficient and effective speckle denoising algorithm is therefore a major concern, for both its theoretical significance and its practical value.
A number of image processing methods have been used to despeckle OCT images, such as median filtering [2], wavelet-based filtering that employs nonlinear thresholds [3], and anisotropic diffusion filtering [4]-[6]. Generally, these methods suppress speckle in homogeneous areas, but the results are often blurred or over-smoothed, losing detail in nonhomogeneous areas such as structural edges or lines. Kafieh et al. proposed an OCT denoising method based on dictionary learning: the atoms in the dictionary represent the clean image, and the noise, which has less regular structure, is removed [7]. However, learning a dictionary takes a long time. Recently, Zhang et al. proposed and tested residual learning with a deep Convolutional Neural Network (CNN) for natural image denoising, but did not extend it to speckle noise in clinical OCT images [8]. In addition, [8] does not use the Residual module.

The proposed method has several advantages. First, the Residual module makes the proposed model easier to optimize and decreases the training duration. Second, the proposed ResNet-based denoising method can handle OCT images with unknown noise level or unknown noise type. Third, the network can process input of arbitrary size and generate correspondingly sized output with efficient learning. Last but not least, because training is based on patches instead of entire images, representative patches can be selected, which helps to decrease the training loss and provides generalization ability for different types of OCT images. Compared to existing methods, the proposed method is fast and leaves considerable room for further reduction of processing time.

Fig.1: (a) Training stage of the model (b) Test stage of the model

2. METHOD

We empirically design an end-to-end network using several design strategies.

In the training stage, the whole model learns from the training data to map the noisy image to the noise image by minimizing the MSE.
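The training objective just described can be illustrated with a minimal sketch. It is not the authors' code: images are plain nested lists so that the example needs only the standard library, and the names `noise_target`, `mse_loss` and `zero_prediction` are hypothetical. The sketch only shows what the network is asked to predict (the noise image) and how the MSE penalizes a wrong prediction.

```python
from typing import List

Image = List[List[float]]  # a grayscale image as rows of pixel values


def noise_target(noisy: Image, clean: Image) -> Image:
    """Training target: the noise image, i.e. noisy minus clean, pixel by pixel."""
    return [[n - c for n, c in zip(nrow, crow)]
            for nrow, crow in zip(noisy, clean)]


def mse_loss(predicted_noise: Image, target_noise: Image) -> float:
    """Mean squared error between a predicted and a target noise image."""
    total = 0.0
    count = 0
    for prow, trow in zip(predicted_noise, target_noise):
        for p, t in zip(prow, trow):
            total += (p - t) ** 2
            count += 1
    return total / count


# Toy 2x2 example (made-up values, not data from the paper).
noisy = [[0.9, 0.2], [0.4, 0.7]]
clean = [[0.8, 0.1], [0.5, 0.7]]
target = noise_target(noisy, clean)

# Stand-in for a network output that predicts zero noise everywhere.
zero_prediction = [[0.0, 0.0], [0.0, 0.0]]
loss_zero = mse_loss(zero_prediction, target)   # positive: the noise is missed
loss_perfect = mse_loss(target, target)         # zero: the noise is matched
```

In a real training loop the prediction would come from the network and the loss would be averaged over a batch of patches; the toy values here only make the two extremes visible.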
In the test stage, the network takes the noisy image as input and outputs the corresponding noise image. The denoised result is obtained by subtracting the noise image from the noisy one, as shown in Fig.1.

Many applications have been built on deep neural networks (DNNs). For example, CNNs have achieved unparalleled success in image classification, as the convolution kernels in the network can learn diverse image features. The non-linear relation between output and input is also learned in the optimization stage, as the training data flow through the model and the parameters of the convolution kernels are updated. In the following we introduce the Residual module and provide details on the network architecture, the training process and the training data preparation.

2.1 Residual module

He et al. first proposed the residual module in 2015 [9]. We use the residual module in the proposed method. It performs an element-wise addition between the input $x$ and $F(x)$, where $F$ is a transformation that consists of two convolution operations. The Residual module is easier to optimize and helps to decrease training duration, as it alleviates degradation problems such as vanishing gradients in model training. The proposed method can gain performance from the Residual module with increased depth.

2.2 The ResNet-based denoising framework

Let $CBN_{k,s}$ denote a Convolution-BatchNorm-ReLU module with $k$ filters of size $s \times s$, let $Res_{k,s}$ denote a Residual module, as described in 2.1, and let $Branch_{k,s}$ denote a Branch module, which is the element-wise addition of $CBN_{k,s}(x)$ and $C_{k,s}(C_{k,s}(x))$, where $C_{k,s}(x)$ denotes convolution. The proposed ResNet-based model can be written as:

$$CBN_{64,3} \to CBN_{64,3} \to Branch_{64,3} \to Res_{64,3} \to Res_{64,3} \to Res_{64,3} \to CBN_{64,3} \to CBN_{64,3}$$

As shown in Fig.2, we construct 3 different convolution layer structures, as follows, to combine their advantages.
(i) Convolution-BatchNorm-ReLU module: filters are used to generate relatively low-level features rather than the input image itself. This module also appears in the last two layers.
(ii) Branch module: filters are used to mix diverse features from branches of different depth. This also increases the width of the network, leading to improved performance.
(iii) Residual module: as described in 2.1.

Two operations are employed to improve performance: batch normalization [10] and the ReLU activation function [11]. The former improves training efficiency by reducing the statistical differences between training samples, and the latter preserves sparsity by restricting the result of convolution.

Fig.2: Model structure

2.3 Training

The size of the input and output is 128 × 128 and is kept fixed in the model. The depth of the proposed network is 12, which is enough for extracting advanced features while remaining easy to train. Conventionally, the number of convolution kernels should be larger in the middle layers and smaller in the first and last layers, but we set this number to 64 in every layer because of memory constraints. Let $I_C$ denote the clear input image and $I_N$ the noisy image, let $N = I_N - I_C$ denote the noise to be removed, and let $T(x)$ denote the mapping that the network performs. We adopt the loss function

$$l(\Theta) = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( T(I_N; \Theta)_{i,j} - \left( I_{N,i,j} - I_{C,i,j} \right) \right)^2 \tag{1}$$

to minimize the MSE between the model output $T(I_N)$ and the noise $N$. In Eq. (1) the sums run over the $N \times M$ pixels of the image, and $\Theta$ stands for the network parameters (the parameter symbol is lost in the source text).

2.4 Training Data Preparation

For OCT image denoising, there is no ground truth image readily available. We use Bscan averaging to obtain images almost free of speckle noise and use them as training data. M macula-centered 3-D OCT volumes are obtained from the same normal eye. One volume is randomly picked as the target.
For each Bscan in this volume, N nearby Bscans from each of the remaining M-1 volumes are registered to it using an affine transformation. From the $N \times (M-1)$ registered images, the L images with the highest structural similarity index (SSIM) [12] scores are selected and averaged together with the target Bscan. Hence, the target Bscans and the corresponding averaged results are used as the noisy and denoised images of the training set. For this paper, we set M=20, N=7, and L=10. The Bscans are flattened with respect to the retina bottom, to reduce the size of the ROI.

3. EXPERIMENTAL RESULTS

The multiple OCT volumes used to generate training data were acquired with the Topcon Atlantis DRI-1 SS-OCT scanner (Topcon, Tokyo, Japan), with 992 × 512 × 256 (height × width × Bscans) voxels covering a 2 × 6 × 6 mm macula-centered area. Among the 256 Bscans with ground truth, 246 Bscans are randomly selected as the training and validation set. 10-fold cross validation is carried out to determine the early stopping time. The remaining 10 Bscans form the test set, for which performance indices such as peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) can be calculated. To show that the method is universally applicable to other types of retinal OCT images, we also test it on images acquired from different subjects, from other scanners (Topcon OCT-1000 and Zeiss 4000) or with other scanning protocols (Topcon Atlantis DRI-1 under wide-view mode, covering 2 × 9 × 6 mm with both the macula and the optic nerve head areas). Some of the images show retinal pathologies. Some overlapping samples are also used. In total, we obtain 19959 image patches. We set the number of training epochs to 100.

Table I: Comparison of performance indices

Algorithms       Center Bscan           Peripheral Bscan
                 PSNR (dB)   SSIM       PSNR (dB)   SSIM
Noisy image      12.74       --         12.53       --
Median filter    31.70       0.32       31.41       0.34
NLM              32.42       0.34       32.11       0.31
BM3D             33.43       0.35       32.80       0.39
Proposed         34.83       0.52       33.84       0.54

Fig.3: Denoising results using the median filter, NLM, BM3D and the proposed method. Panels: (a) Noisy image (b) Median filter (c) NLM (d) BM3D (e) Proposed (f) Clear image (g) Noisy image (h) Median filter (i) NLM (j) BM3D (k) Proposed (l) Clear image. (a)-(f) are from the center Bscan, (g)-(l) from the peripheral Bscan.

The results for one center Bscan and one peripheral Bscan from the test set are shown in Fig.3, compared with three other denoising algorithms: median filter, non-local means (NLM) [13], and block-matching and 3D filtering (BM3D) [14]. The PSNR and SSIM indices are listed in Table I. The indices are calculated in the ROI that contains the whole retina. For the comparison methods, the parameters are empirically optimized to give the highest PSNR. Relative to the PSNR between the noisy image and the clear image, the proposed method improves PSNR by more than 20 dB on average (Table I).

In Fig.3a and Fig.3g the noisy images have a layered structure and are degraded by strong speckle noise. Fig.3b and Fig.3h show that the median filter has little effect on strong speckle noise and loses a lot of structural information. Fig.3c and Fig.3i show that the NLM algorithm removes almost all noise in the background but also removes many important structural details. Fig.3d and Fig.3j show that BM3D keeps the retinal structural details relatively well, but the edges are slightly distorted. Fig.3e and Fig.3k show that the proposed method keeps almost all retinal structural details while removing noise well. In Fig.3k in particular, the subtle structures are preserved and enhanced, including the structures inside the vitreous body (red arrows) and the external limiting membrane (yellow arrow).

The denoising results for other types of retinal OCT images are shown in Fig.4. The proposed method achieves similar speckle reduction performance, showing its generalization ability and adaptability.
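For readers who want to reproduce figures of the kind reported in Table I, the following is a minimal sketch of a PSNR computation. It is an illustration under stated assumptions, not the evaluation code used in the paper: the peak value of 255 assumes 8-bit intensities (the paper does not state the intensity range), images are nested lists to keep the example standard-library only, and the restriction to the retinal ROI is omitted. The function name `psnr` and the toy pixel values are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]  # a grayscale image as rows of pixel values


def psnr(reference: Image, test: Image, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE).

    `peak` is the maximum possible pixel value; 255 assumes 8-bit images.
    """
    squared_error = 0.0
    count = 0
    for rrow, trow in zip(reference, test):
        for r, t in zip(rrow, trow):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy 2x2 images (made-up values, not data from the paper).
reference = [[100.0, 120.0], [140.0, 160.0]]
noisy = [[110.0, 110.0], [150.0, 150.0]]      # error of 10 at every pixel
denoised = [[101.0, 119.0], [141.0, 159.0]]   # error of 1 at every pixel

gain_db = psnr(reference, denoised) - psnr(reference, noisy)
```

In this toy case the per-pixel error shrinks tenfold, so the MSE shrinks by a factor of 100 and the PSNR gain is 20 dB, which gives a feel for the size of the improvement quoted above.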
Table II: Comparison of average running time

Methods             Median filter   NLM      BM3D   Proposed
Average time (s)    0.23            418.90   4.21   0.23

Fig.4: Denoising results of other test images. Column 1 shows the noisy images and Column 2 the corresponding denoised images. Panels: (a) Normal subject, Topcon Atlantis DRI-1, 992 × 512 (b) Wide-view, Topcon Atlantis DRI-1, 992 × 512 (c) Pathological subject, Topcon Atlantis DRI-1, 992 × 512 (d) Pathological subject, Zeiss OCT-4000, 1024 × 512 (e) Normal subject, Topcon OCT-1000, 480 × 512

4. CONCLUSIONS

In this work, we proposed a ResNet-based universal method to denoise retinal OCT images affected by speckle noise. The following conclusions are reached. First, the proposed method yields a relatively high PSNR and SSIM, indicating that the denoising results are close to the ground truth image obtained by Bscan averaging. Second, the proposed method also has a good visual effect, reducing the noise while preserving the structures. Third, the processing time of the proposed method is short and can almost meet the requirements of clinical diagnosis. Moreover, with more processors or optimized matrix operations, the processing time could be further reduced to reach real-time performance.

5. REFERENCES

[1] J. M. Schmitt, S. H. Xiang, and K. M. Yung, "Speckle in optical coherence tomography," J. Biomed. Opt., vol. 4, no. 1, pp. 95-105, 1999.
[2] Ritenour, E., Nelson, T., & Raff, U. (1984, March). Applications of the median filter to digital radiographic images. In Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP'84 (Vol. 9, pp. 251-254). IEEE.
[3] Schmitt, J. M., Xiang, S. H., & Yung, K. M. (1999, March). Speckle in optical coherence tomography: an overview. In Saratov Fall Meeting'98: Light Scattering Technologies for Mechanics, Biomedicine, and Material Science (Vol. 3726, pp. 450-462). International Society for Optics and Photonics.
[4] Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629-639.
[5] D. C. Fernandez, H. M. Salinas, "Evaluation of a nonlinear diffusion process for segmentation and quantification of lesions in optical coherence tomography images," Proc. SPIE, vol. 1834, 2005.
[6] Fernandez, D. C. (2005). Delineating fluid-filled region boundaries in optical coherence tomography images of the retina. IEEE Transactions on Medical Imaging, 24(8), 929-945.
[7] Kafieh, R., Rabbani, H., & Selesnick, I. (2015). Three dimensional data-driven multi scale atomic representation of optical coherence tomography. IEEE Transactions on Medical Imaging, 34(5), 1042-1062.
[8] Zhang, K., Zuo, W., Chen, Y., Meng, D., & Zhang, L. (2017). Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising. IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society, 26(7), 3142.
[9] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems (pp. 1097-1105).
[10] Ioffe, S., & Szegedy, C. (2015, June). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (pp. 448-456).
[11] Glorot, X., Bordes, A., & Bengio, Y. (2011, June). Deep sparse rectifier neural networks. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 315-323).
[12] Wang, Z., Bovik, A. C., Sheikh, H. R., & Simoncelli, E. P. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600-612.
