Journal of Chongqing Technology and Business University (Natural Science Edition)

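Cite this article:	ZHOU Mengran, WANG Ao. Lightweight Remote Sensing Image Object Detection Algorithm Based on DETR[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2026, 43(2): 54-60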

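Lightweight Remote Sensing Image Object Detection Algorithm Based on DETR
ZHOU Mengran^a, WANG Ao^b
Anhui University of Science and Technology: a. School of Electrical and Information Engineering; b. School of Computer Science and Engineering, Huainan, Anhui 232000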

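Abstract:

Objective: Traditional remote sensing image object detection models are difficult to deploy in low-compute scenarios such as drones and satellites. To reduce the parameter count while maintaining detection accuracy, a lightweight remote sensing image object detection algorithm based on DETR is proposed. Methods: The model first adopts the EfficientViT feature extraction module as a lightweight backbone for image feature extraction and selection. A lightweight, efficient hybrid encoder is then designed to reduce the parameter count and computational cost while maintaining detection accuracy. The encoder consists of an S-AIFI module and an MSFM module: the S-AIFI module focuses on deep features to strengthen the aggregation of feature information, while the MSFM module uses multi-scale feature fusion to improve the detection of objects of different sizes in remote sensing images. Finally, the shape-IoU loss function is introduced to further improve detection accuracy. Results: In experiments on the DOTA-v1 and SIMD datasets, the model reached mAP values of 75.5% and 81.9%, respectively, with its parameter count reduced to 10.3 M. Conclusion: The trained model has a small memory footprint and a low parameter count, making it suitable for remote sensing image processing applications with limited computing resources.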

Keywords: remote sensing image; lightweight network; EfficientViT; multi-feature fusion


Lightweight Remote Sensing Image Object Detection Algorithm Based on DETR

ZHOU Mengran^a, WANG Ao^b

a. School of Electrical and Information Engineering; b. School of Computer Science and Engineering, Anhui University of Science and Technology, Huainan 232000, Anhui, China

Abstract:

Objective: To address the difficulty of deploying traditional object detection models in low-computation scenarios (e.g., drones and satellites), this study proposes a lightweight remote sensing image object detection algorithm based on DETR (Detection Transformer), which reduces model complexity while preserving detection accuracy. Methods: First, the proposed model employed an EfficientViT feature extraction module as a lightweight backbone for image feature extraction and selection. Then, a lightweight and efficient hybrid encoder was designed to reduce the number of parameters and the computational cost of the model while maintaining detection accuracy. This encoder comprised two key components: the S-AIFI (Slim-Attention-based Intrascale Feature Interaction) module, which focused on processing deep features to enhance contextual aggregation of feature information, and the MSFM (Multi-Scale Feature Fusion Module), which improved detection capability for objects of varying sizes in remote sensing images through multi-scale fusion. Furthermore, a shape-IoU loss function was incorporated to refine the detection precision of the model. Results: Experiments on the DOTA-v1 and SIMD datasets showed that the model achieved mean average precision (mAP) scores of 75.5% and 81.9%, respectively, with its parameter count reduced to 10.3 M. Conclusion: The trained model exhibits a small memory footprint and a low parameter count, making it suitable for remote sensing image processing applications with limited computational resources.

Key words: remote sensing image; lightweight network; EfficientViT; multi-feature fusion
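
The shape-IoU loss named in the abstract belongs to the family of IoU-based bounding-box regression losses, which start from the overlap ratio between a predicted box and a ground-truth box. The abstract does not give the loss formula, so the minimal Python sketch below shows only the plain IoU term and the basic `1 - IoU` loss, not the shape-IoU variant used in the paper. The function names `box_iou` and `iou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle, clamped at zero
    # when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


def iou_loss(pred_box, target_box):
    """Plain IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - box_iou(pred_box, target_box)


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print(box_iou(pred, target))
    print(iou_loss(pred, target))
```

Variants such as shape-IoU add further terms on top of this overlap measure; those terms are not reproduced here because the abstract does not specify them.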