Hardware Acceleration and Implementation of YOLOv4-Tiny Network Based on FPGA
    Abstract:

    Objective: To improve the computational performance and energy efficiency ratio of object detection algorithms on resource-constrained edge hardware platforms, this paper proposes and verifies an accelerator design for the YOLOv4-tiny network on an edge hardware platform based on a field programmable gate array (FPGA). Methods: High-level synthesis (HLS) was used to design and optimize the operators and modules of the algorithm for a high degree of parallelism. To improve design throughput, a double-buffering strategy was adopted to increase system resource utilization. In addition, fusion of the convolutional and batch normalization (BN) layers and model quantization were applied to reduce the number of model parameters and increase computational density. Results: Experiments on the PYNQ-Z2 platform show that the accelerator achieves a computational performance of 15.33 GOPS at a total power consumption of 2.65 W. Compared with FPGA platforms in comparable studies, the design improves computational performance by a factor of 2.79; compared with a CPU platform, it improves the energy efficiency ratio by a factor of 29.5. Conclusion: The proposed design improves the acceleration of the YOLOv4-tiny network on edge FPGA platforms and provides a reference for research on accelerating object detection algorithms on hardware platforms.
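
The abstract names convolution and BN layer fusion but does not give the formula. The sketch below shows the standard inference-time folding, assuming a per-output-channel BN with learned `gamma` and `beta` and running statistics `mean` and `var`. All function and variable names are hypothetical, and this is not the authors' HLS implementation; it only illustrates why the BN layer can be removed after training without changing the layer output.

```python
import math

def fuse_conv_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel BN parameters into convolution weights and bias.

    weights[c] is the flat list of kernel coefficients for output channel c.
    """
    fused_w, fused_b = [], []
    for c, kernel in enumerate(weights):
        # BN(y) = gamma * (y - mean) / sqrt(var + eps) + beta, with y = w.x + b
        scale = gamma[c] / math.sqrt(var[c] + eps)
        fused_w.append([w * scale for w in kernel])
        fused_b.append((bias[c] - mean[c]) * scale + beta[c])
    return fused_w, fused_b

def conv_at_pixel(weights, bias, patch):
    """Convolution output at one pixel: dot product of each kernel with the patch."""
    return [sum(w * x for w, x in zip(kernel, patch)) + b
            for kernel, b in zip(weights, bias)]

def batch_norm(y, gamma, beta, mean, var, eps=1e-5):
    """Reference inference-time BN applied to one value per channel."""
    return [g * (v - m) / math.sqrt(s + eps) + bt
            for v, g, bt, m, s in zip(y, gamma, beta, mean, var)]

# Illustrative values only (two output channels, three-tap kernels).
weights = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
bias = [0.1, -0.2]
gamma, beta = [1.2, 0.8], [0.05, -0.1]
mean, var = [0.3, -0.4], [0.9, 1.6]
patch = [1.0, 2.0, -1.0]

# Unfused path: convolution followed by BN.
reference = batch_norm(conv_at_pixel(weights, bias, patch), gamma, beta, mean, var)

# Fused path: a single convolution with the folded parameters.
fused_w, fused_b = fuse_conv_bn(weights, bias, gamma, beta, mean, var)
fused = conv_at_pixel(fused_w, fused_b, patch)
```

Under these assumptions the two paths agree up to floating-point rounding, so an accelerator only needs to store and compute the fused convolution. Quantizing the fused parameters, as the abstract also mentions, is a separate step that this sketch does not cover.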

Cite this article

LI Lei, LI Yuansong, SHI Rui. Hardware Acceleration and Implementation of YOLOv4-Tiny Network Based on FPGA[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2026, 43(3): 45-52

History
  • Published online: 2026-05-19