Medical image segmentation based on multi-layer features and spatial information distillation
Abstract: U-Net is currently the most widely used model for medical image segmentation, and its encoder-decoder structure has become the most common template for building medical image segmentation models. Although U-Net achieves very high segmentation accuracy in many fields, it suffers from high computational complexity, slow inference, and high memory consumption, which make it difficult to deploy on mobile platforms. To solve this problem, this paper proposes TinyUnet, a medical image segmentation method that combines multi-layer feature distillation with spatial information distillation. The method uses a lightweight U-Net with far fewer parameters as the student network. Because a small model has limited learning capacity, the method distils multi-layer teacher feature maps at suitably chosen distillation positions. In addition, it strengthens the edges in the deep feature maps of the teacher network, builds a graph structure over edge key points, and uses a graph convolutional network to distil spatial information into the student network, thereby supplying important edge and spatial information. Experiments on three medical image datasets show that TinyUnet reaches 98.3% to 99.7% of the segmentation accuracy of U-Net while reducing the number of parameters by 99.6% on average and increasing the computing speed by a factor of about 110. Compared with other lightweight medical image segmentation models, TinyUnet not only achieves high segmentation accuracy but also occupies less memory and runs faster.
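This section does not state the distillation loss, so the following is only a minimal sketch of multi-layer feature distillation under stated assumptions. Features are plain nested lists shaped C×H×W; teacher and student are assumed to share the spatial size at each distillation position, which is plausible when both are U-Nets that differ only in width. The per-position loss used here is an attention-map match (channel-averaged squared activations, L2-normalised), a stand-in technique chosen because it sidesteps the channel mismatch between teacher and student; it is not claimed to be the paper's loss. The function names are hypothetical, and the edge-enhancement and graph-convolution steps are not sketched.

```python
import math


def attention_map(feature):
    """Collapse a C x H x W feature map (nested lists) into a flat,
    L2-normalised map of length H*W (channel mean of squared activations)."""
    channels = len(feature)
    height, width = len(feature[0]), len(feature[0][0])
    flat = [
        sum(feature[c][y][x] ** 2 for c in range(channels)) / channels
        for y in range(height)
        for x in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0  # avoid division by zero
    return [v / norm for v in flat]


def multi_layer_distillation_loss(teacher_feats, student_feats, positions):
    """Sum, over the chosen positions, of the squared distance between the
    teacher and student attention maps (ASSUMED loss form, see lead-in)."""
    total = 0.0
    for p in positions:
        t = attention_map(teacher_feats[p])
        s = attention_map(student_feats[p])
        total += sum((a - b) ** 2 for a, b in zip(t, s))
    return total


# Toy features: the teacher has 2 channels, the student 1, both on a 2 x 2 grid.
teacher = {1: [[[2.0, 4.0], [6.0, 8.0]], [[2.0, 4.0], [6.0, 8.0]]]}
student_aligned = {1: [[[1.0, 2.0], [3.0, 4.0]]]}  # same spatial pattern
student_flipped = {1: [[[4.0, 3.0], [2.0, 1.0]]]}  # different spatial pattern

loss_aligned = multi_layer_distillation_loss(teacher, student_aligned, [1])
loss_flipped = multi_layer_distillation_loss(teacher, student_flipped, [1])
```

In this toy case the aligned student yields a loss of essentially zero despite having fewer channels, while the flipped student is penalised. Distilling at several positions amounts to passing a longer `positions` list.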
Table 1. Structure of U-Net

| Convolution block structure |
| --- |
| conv_block(in_c, N) |
| MaxPooling→conv_block(N, N×2) |
| MaxPooling→conv_block(N×2, N×4) |
| MaxPooling→conv_block(N×4, N×8) |
| MaxPooling→conv_block(N×8, N×8) |
| UpSampling→conv_block(N×16, N×4) |
| UpSampling→conv_block(N×8, N×2) |
| UpSampling→conv_block(N×4, N) |
| UpSampling→conv_block(N, out_c) |
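The channel layout in Table 1 can be written as a function of the width parameter N. The sketch below lists the (input, output) channels of each block and estimates the weight count under one loudly stated assumption: each conv_block is taken to be two 3×3 convolutions with bias, which Table 1 does not confirm. The function names are hypothetical, and the resulting totals are not expected to match Table 2 exactly. The point is only that the weight count grows roughly with N², which agrees with the near-fourfold parameter increase from N = 4 to N = 8 in Table 2.

```python
def unet_channel_plan(in_c, n, out_c):
    """(in_channels, out_channels) of each conv_block, in the order of Table 1."""
    return [
        (in_c, n),        # first block on the input image
        (n, 2 * n),       # after MaxPooling
        (2 * n, 4 * n),   # after MaxPooling
        (4 * n, 8 * n),   # after MaxPooling
        (8 * n, 8 * n),   # after MaxPooling (deepest block)
        (16 * n, 4 * n),  # after UpSampling; 16N is consistent with an 8N + 8N skip concatenation
        (8 * n, 2 * n),   # after UpSampling
        (4 * n, n),       # after UpSampling
        (n, out_c),       # last row of Table 1
    ]


def conv_block_weights(c_in, c_out, kernel=3):
    # ASSUMPTION: conv_block = two kernel x kernel convolutions, each with a bias.
    first = kernel * kernel * c_in * c_out + c_out
    second = kernel * kernel * c_out * c_out + c_out
    return first + second


def total_weights(in_c, n, out_c):
    """Total weights over all blocks under the assumption above."""
    return sum(conv_block_weights(a, b) for a, b in unet_channel_plan(in_c, n, out_c))


# Doubling N roughly quadruples the weight count (single-channel input and output assumed).
growth_4_to_8 = total_weights(1, 8, 1) / total_weights(1, 4, 1)
```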
Table 2. Experimental results for different values of the parameter N

| N | Dice | #Parameters/10^3 | Size/MB | GFLOPs |
| --- | --- | --- | --- | --- |
| 4 | 0.677 | 58.557 | 0.271 | 0.455 |
| 6 | 0.716 | 131.119 | 0.547 | 1.003 |
| 8 | 0.718 | 232.537 | 0.938 | 1.766 |
| 16 | 0.723 | 926.769 | 3.59 | 6.960 |
| 32 | 0.741 | 3700 | 14.18 | 27.630 |
Table 3. Dice scores for different distillation positions

| Distillation position | Oral panoramic film dataset | NIH dataset | EM dataset |
| --- | --- | --- | --- |
| Φ{0} | 0.886 | 0.716 | 0.911 |
| Φ{1} | 0.897 | 0.717 | 0.924 |
| Φ{5} | 0.891 | 0.721 | 0.922 |
| Φ{9} | 0.900 | 0.724 | 0.924 |
| Φ{1, 5, 9} | 0.903 | 0.728 | 0.929 |
| Φ{1, 3, 5, 7, 9} | 0.911 | 0.726 | 0.923 |
| Φ{1, 2, 3, 4, 5, 6, 7, 8, 9} | 0.893 | 0.722 | 0.920 |
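All tables report the Dice score. As a reminder of what the metric measures, the sketch below computes the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks stored as nested lists. The function name and the convention of returning 1.0 for two empty masks are choices made for this example; the paper's exact evaluation protocol (thresholding, per-image versus per-dataset averaging) is not given in this section.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks (nested lists of 0/1)."""
    flat_pred = [v for row in pred for v in row]
    flat_target = [v for row in target for v in row]
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    total = sum(flat_pred) + sum(flat_target)
    # Two empty masks agree perfectly; treat that case as 1.0 (a common convention).
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [[1, 1], [0, 0]]    # two predicted foreground pixels
target = [[1, 0], [0, 0]]  # one true foreground pixel
score = dice_score(pred, target)  # 2 * 1 / (2 + 1)
```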
Table 4. Dice scores for different numbers of clusters K

| K | Oral panoramic film dataset | NIH dataset | EM dataset |
| --- | --- | --- | --- |
| 0 | 0.903 | 0.728 | 0.929 |
| 4 | 0.903 | 0.730 | 0.931 |
| 8 | 0.904 | 0.744 | 0.932 |
| 16 | 0.909 | 0.744 | 0.930 |
| 32 | 0.911 | 0.745 | 0.930 |
Table 5. Results of different methods on the oral panoramic film dataset

| Method | Dice | #Parameters/10^6 | Size/MB | GFLOPs |
| --- | --- | --- | --- | --- |
| U-Net | 0.914 | 34.5 | 56.6 | 110.5 |
| Unet-fixed | 0.67 | 4.84 | 9.23 | 117.5 |
| LightUnet | 0.906 | 0.0667 | 0.473 | 4.633 |
| EMKD | 0.912 | 0.353 | 1.59 | 2.031 |
| Unet-6 | 0.889 | 0.131 | 0.547 | 1.003 |
| Unet-6-dis | 0.903 | 0.131 | 0.547 | 1.003 |
| TinyUnet | 0.911 | 0.131 | 0.547 | 1.003 |
Table 6. Results of different methods on the NIH and EM datasets

| Method | Dice (NIH dataset) | Dice (EM dataset) | #Parameters/10^6 | Size/MB | GFLOPs |
| --- | --- | --- | --- | --- | --- |
| U-Net | 0.757 | 0.936 | 34.5 | 56.6 | 110.5 |
| Unet-fixed | 0.746 | 0.920 | 4.84 | 9.23 | 117.5 |
| LightUnet | 0.741 | 0.926 | 0.0667 | 0.473 | 4.633 |
| EMKD | 0.748 | 0.932 | 0.353 | 1.59 | 2.031 |
| Unet-6 | 0.716 | 0.928 | 0.131 | 0.547 | 1.003 |
| Unet-6-dis | 0.728 | 0.929 | 0.131 | 0.547 | 1.003 |
| TinyUnet | 0.744 | 0.932 | 0.131 | 0.547 | 1.003 |
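The headline figures in the abstract can be cross-checked against Tables 5 and 6. The short script below copies the U-Net and TinyUnet rows from those tables and recomputes the relative accuracy, the parameter reduction, and the ratio of GFLOPs. The variable names are illustrative. Reading the "about 110 times" speed claim as the GFLOPs ratio is an assumption; the section reports no wall-clock timings.

```python
# Dice scores copied from Table 5 (oral panoramic film) and Table 6 (NIH, EM).
dice = {
    "oral panoramic film": {"U-Net": 0.914, "TinyUnet": 0.911},
    "NIH": {"U-Net": 0.757, "TinyUnet": 0.744},
    "EM": {"U-Net": 0.936, "TinyUnet": 0.932},
}
params_millions = {"U-Net": 34.5, "TinyUnet": 0.131}  # #Parameters/10^6
gflops = {"U-Net": 110.5, "TinyUnet": 1.003}

# TinyUnet accuracy as a percentage of U-Net accuracy, per dataset.
relative_accuracy = {
    name: 100.0 * scores["TinyUnet"] / scores["U-Net"] for name, scores in dice.items()
}

# Percentage of U-Net parameters removed.
param_reduction = 100.0 * (1.0 - params_millions["TinyUnet"] / params_millions["U-Net"])

# ASSUMPTION: the speed-up is approximated by the ratio of operation counts.
flops_ratio = gflops["U-Net"] / gflops["TinyUnet"]

for name, value in relative_accuracy.items():
    print(f"{name}: {value:.1f}% of U-Net Dice")
print(f"parameter reduction: {param_reduction:.1f}%")
print(f"GFLOPs ratio: {flops_ratio:.0f}x")
```

The lowest relative accuracy comes from the NIH dataset and the highest from the oral panoramic film dataset, which brackets the 98.3% to 99.7% range quoted in the abstract.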