Journal of Chongqing Technology and Business University (Natural Science Edition)

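Cite this article: GAO Aiguo, ZHENG Xiaoliang. Colorectal Polyp Segmentation Model Based on R-DCAformer[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2024, 41(5): 49-57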

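Colorectal Polyp Segmentation Model Based on R-DCAformer
GAO Aiguo, ZHENG Xiaoliang
School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan, Anhui 232001, China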

Abstract:

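Objective: Existing Transformer models achieve high accuracy in segmenting colorectal polyps with complex morphology, but their attention is dispersed, and information is lost when the multi-level semantic information output by the encoder is fused, which limits further gains in accuracy. To address this problem, a new intestinal polyp image segmentation model, the Dual-Channel Aggregation Transformer (R-DCAformer), is proposed. Methods: R-DCAformer uses a pyramid Mix Transformer (MIT) and Resnet18 as the encoder, and a purpose-designed Dual-Channel Aggregation (DCA) module as the decoder. The DCA decoder consists of an Attention Aggregation (AA) module and a Dual-Channel Feature Fusion (DFF) module. The pyramid MIT encoder gives the model sufficient generalization ability, the AA module limits attention dispersion in MIT by fusing additional features from Resnet18, and the DFF module alleviates information loss in the fusion of multi-level semantic information. Results: In the generalization experiments, R-DCAformer improved mDice, mIoU and MAE by 2.10%, 1.65% and 22.5%, respectively, on CVC-ColonDB relative to the best values among the baseline models, and by 2.56%, 2.12% and 15%, respectively, on ETIS. On CVC-ClinicDB, mDice and mIoU improved by about 0.85% and 1.35% over the best baseline values; on Kvasir-SEG, mDice, mIoU and MAE improved by about 1.19%, 1.97% and 17.39%. Ablation experiments and attention maps further demonstrate the effectiveness of the proposed modules. Conclusion: R-DCAformer performs well in both the learning and generalization experiments and generally outperforms the baseline models compared, providing a new high-performance model for colorectal polyp segmentation.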

Keywords: polyp image segmentation; deep learning; dual-channel aggregation; attention aggregation; generalization ability


Colorectal Polyp Segmentation Model Based on R-DCAformer

GAO Aiguo, ZHENG Xiaoliang

School of Electrical and Information Engineering, Anhui University of Science & Technology, Huainan, Anhui 232001, China

Abstract:

Objective: Although existing Transformer models achieve high accuracy in segmenting colorectal polyps with complex morphology, the dispersion of their attention and the loss of information when the encoder's multi-level semantic outputs are fused limit further improvement of the model's accuracy. To address this, a novel image segmentation model for intestinal polyps, the Dual-Channel Aggregation Transformer (R-DCAformer), was proposed. Methods: The R-DCAformer model used a pyramid Mix Transformer (MIT) and Resnet18 as the encoder, and a dual-channel aggregation (DCA) module was designed as the decoder. The DCA decoder consisted of an attention aggregation (AA) module and a dual-channel feature fusion (DFF) module. In this model, the pyramid MIT encoder provided sufficient generalization ability, the AA module limited attention dispersion in MIT by fusing the additional features of Resnet18, and the DFF module alleviated information loss in the fusion of multi-level semantic information. Results: In the generalization experiments, R-DCAformer improved mDice, mIoU, and MAE by 2.10%, 1.65%, and 22.5%, respectively, on CVC-ColonDB compared with the best values among the baseline models. On ETIS, mDice, mIoU, and MAE were improved by 2.56%, 2.12%, and 15%, respectively, compared with the best baseline values. On the CVC-ClinicDB dataset, the model improved mDice and mIoU by about 0.85% and 1.35% over the best baseline values, and on the Kvasir-SEG dataset, mDice, mIoU, and MAE were improved by about 1.19%, 1.97%, and 17.39%, respectively. The effectiveness of the proposed modules was also demonstrated by ablation experiments and attention maps.
Conclusion: R-DCAformer performs well in both the learning and generalization experiments and generally outperforms the compared baseline models, providing a new high-performance model for colorectal polyp segmentation.

Key words: polyp image segmentation; deep learning; dual-channel aggregation; attention aggregation; generalization ability
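
The abstract reports results as mDice, mIoU and MAE without stating how they are computed. As a rough illustration only, the sketch below shows one common way such metrics are evaluated for a predicted mask against a ground-truth mask. It assumes binary (or [0, 1]-valued) masks given as nested lists and per-image averaging for the "m" prefix; the function names `dice`, `iou`, `mae` and `mean_metric` are hypothetical and are not taken from the paper, whose exact evaluation protocol may differ.

```python
from typing import Callable, List, Sequence

Mask = Sequence[Sequence[float]]  # 2-D mask with values in [0, 1]


def _flatten(mask: Mask) -> List[float]:
    """Flatten a 2-D mask into a flat list of floats."""
    return [float(v) for row in mask for v in row]


def dice(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Dice coefficient: 2|P∩T| / (|P| + |T|). eps avoids division by zero."""
    p, t = _flatten(pred), _flatten(target)
    inter = sum(a * b for a, b in zip(p, t))
    return (2.0 * inter + eps) / (sum(p) + sum(t) + eps)


def iou(pred: Mask, target: Mask, eps: float = 1e-8) -> float:
    """Intersection over union: |P∩T| / |P∪T|."""
    p, t = _flatten(pred), _flatten(target)
    inter = sum(a * b for a, b in zip(p, t))
    union = sum(p) + sum(t) - inter
    return (inter + eps) / (union + eps)


def mae(pred: Mask, target: Mask) -> float:
    """Mean absolute error per pixel (lower is better)."""
    p, t = _flatten(pred), _flatten(target)
    return sum(abs(a - b) for a, b in zip(p, t)) / len(p)


def mean_metric(metric: Callable[[Mask, Mask], float],
                preds: Sequence[Mask],
                targets: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (assumed meaning of the 'm' prefix)."""
    scores = [metric(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Toy 2x2 example: the prediction marks two pixels, the ground truth marks one.
pred = [[1, 1], [0, 0]]
target = [[1, 0], [0, 0]]
print(dice(pred, target))  # approximately 0.667
print(iou(pred, target))   # approximately 0.5
print(mae(pred, target))   # 0.25
```

Because a lower MAE is better while higher mDice and mIoU are better, an "improvement" in MAE corresponds to a decrease in its value.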
