Journal of Chongqing Technology and Business University (Natural Science Edition)

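Cite this article: ZHOU Mengran, LU Peng. Research on Colon Polyp Image Segmentation Algorithm Based on Cascaded Attention [J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2026, 43(1): 1-10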

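Research on Colon Polyp Image Segmentation Algorithm Based on Cascaded Attention
ZHOU Mengran^a, LU Peng^b
Anhui University of Science and Technology: a. School of Electrical and Information Engineering; b. School of Computer Science and Engineering, Huainan 232000, Anhui, China

Abstract:

Objective: Existing Transformer models for polyp image segmentation suffer from scattered attention, and the multi-level features they extract as encoders tend to lose information when fused, which lowers segmentation accuracy. To address these problems, a new segmentation model, PVT-CAMNet, is proposed. Methods: The model uses the Pyramid Vision Transformer (PVT) as its encoder, followed by a Multiscale Feature Attention Extraction (MFAE) module and an Inter-layer Attention Aggregation (IA) module. The PVT supports the generalization ability of the model through its self-attention mechanism; the MFAE uses filters of different sizes to extract features at multiple scales, aiming to alleviate scattered attention; the IA interactively fuses features from different levels to counter the information loss caused by multi-level feature fusion. Finally, a Global Context (GC) module is introduced so that the model better captures the pixel dependencies between feature maps. Results: The model was evaluated on the Kvasir, CVC-ClinicDB, CVC-ColonDB, and ETIS datasets. Compared with the best baseline model, mDice and mIoU improved by 1.76% and 0.81% on Kvasir, 1.51% and 1.74% on CVC-ClinicDB, 3.15% and 2.65% on CVC-ColonDB, and 1.73% and 3.84% on ETIS. Conclusion: PVT-CAMNet outperforms other advanced methods in both learning performance and generalization performance, and has some application value in polyp image segmentation.

Key words: polyp image segmentation; multi-scale attention extraction; inter-layer attention aggregation; global context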


Research on Colon Polyp Image Segmentation Algorithm Based on Cascaded Attention

ZHOU Mengran^a, LU Peng^b

a. School of Electrical and Information Engineering; b. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, Anhui, China

Abstract:

Objective: Existing Transformer models for polyp image segmentation suffer from scattered attention, and the multi-level features extracted by the encoder tend to lose information during fusion, which lowers segmentation accuracy. To address these problems, a new segmentation model named PVT-CAMNet is proposed. Methods: In this model, the Pyramid Vision Transformer (PVT) was used as the encoder. A Multi-scale Feature Attention Extraction (MFAE) module and an Inter-layer Attention Aggregation (IA) module were then designed. The PVT supported the generalization ability of the model through its self-attention mechanism. The MFAE used filters of different sizes to extract features at multiple scales, with the aim of alleviating scattered attention. The IA interactively fused features at different levels to counter the information loss caused by the fusion of multi-level features. Finally, a Global Context (GC) module was introduced to enable the model to better capture the pixel dependencies between feature maps. Results: Evaluations were carried out on the Kvasir, CVC-ClinicDB, CVC-ColonDB, and ETIS datasets. Compared with the best baseline model, PVT-CAMNet increased mDice by 1.76%, 1.51%, 3.15%, and 1.73%, and mIoU by 0.81%, 1.74%, 2.65%, and 3.84%, respectively, on these four datasets. Conclusion: PVT-CAMNet is superior to other advanced methods in both learning performance and generalization capability, demonstrating significant application value in polyp image segmentation.

Key words: polyp image segmentation; multi-scale attention extraction; inter-layer attention aggregation; global context
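
The abstract reports results as mDice and mIoU without defining them. As a minimal sketch, assuming the usual definitions (Dice = 2|P∩G| / (|P| + |G|) and IoU = |P∩G| / |P∪G| for a predicted binary mask P and a ground-truth mask G, each averaged over the images of a dataset), the two metrics could be computed as follows. The function names, the list-of-lists mask representation, and the convention of scoring two empty masks as 1.0 are assumptions made for illustration, not details taken from the paper.

```python
from typing import Callable, Sequence, Tuple

# Assumed representation: a binary mask as rows of 0/1 ints (1 = polyp pixel).
Mask = Sequence[Sequence[int]]


def _counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Return (intersection, predicted area, ground-truth area) in pixels."""
    inter = pred_area = truth_area = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += p & t
            pred_area += p
            truth_area += t
    return inter, pred_area, truth_area


def dice(pred: Mask, truth: Mask) -> float:
    """Dice coefficient: 2|P∩G| / (|P| + |G|)."""
    inter, pred_area, truth_area = _counts(pred, truth)
    total = pred_area + truth_area
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2 * inter / total


def iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union: |P∩G| / |P∪G|."""
    inter, pred_area, truth_area = _counts(pred, truth)
    union = pred_area + truth_area - inter
    # Assumption: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def mean_metric(metric: Callable[[Mask, Mask], float],
                preds: Sequence[Mask], truths: Sequence[Mask]) -> float:
    """Average a per-image metric over a dataset (mDice or mIoU)."""
    scores = [metric(p, t) for p, t in zip(preds, truths)]
    return sum(scores) / len(scores)


# Toy data: two 2x2 images, the second predicted perfectly.
preds = [[[1, 1], [0, 0]], [[1, 0], [0, 1]]]
truths = [[[1, 0], [0, 0]], [[1, 0], [0, 1]]]
m_dice = mean_metric(dice, preds, truths)
m_iou = mean_metric(iou, preds, truths)
```

On the toy data the first image has Dice 2/3 and IoU 1/2, and the second has 1 for both, so `m_dice` is 5/6 and `m_iou` is 0.75. Under these definitions Dice is never lower than IoU for the same mask pair, which is why mDice values are typically the larger of the two in segmentation result tables.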
