Research on Resampling Ensemble Classification Method for Imbalanced Data Streams
Authors: ZHANG Tuyi, LIU Sanmin, CHEN Yanfei, YU Wentao, ZHU Jian
    Abstract:

    Objective: Class imbalance and concept drift are two major challenges in data stream classification, and when they occur together they significantly degrade the performance of data stream classification algorithms. Because traditional data stream classification algorithms struggle when both are present, a resampling ensemble model focused on imbalanced data streams is proposed. Methods: First, a boundary oversampling method suited to data streams is designed. Using the properties of the triangle centroid, it synthesizes new samples on the inner side of the boundary samples. This strengthens the minority class within each data block while preserving the original data distribution as far as possible and avoiding the introduction of new concepts, which mitigates the class imbalance within the block. On this basis, a time decay strategy and a weighted ensemble strategy are combined into a dynamic weighted ensemble model that uses the Matthews correlation coefficient as the weight. This model addresses concept drift and improves the adaptability and robustness of the classification model. Results: Simulation experiments on 3 real-world data streams and 6 synthetic data streams show that, in imbalanced data stream scenarios, the proposed method identifies both the majority and minority classes effectively, detects and adapts to both abrupt and incremental concept drift better, and outperforms the comparison algorithms in overall classification performance. Conclusion: The experiments confirm that the proposed method yields a robust classification model for imbalanced data streams, with advantages both in handling imbalanced data streams and in adapting to the two types of concept drift.
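    The two ideas in the Methods part can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how triangles are chosen or how time decay and the Matthews correlation coefficient (MCC) are combined, so the function names, the choice of three random minority samples per triangle, the exponential decay form, and the `decay` parameter are all assumptions made for illustration.

```python
import math
import random


def centroid_oversample(minority, n_new, rng=None):
    """Synthesize n_new points, each the centroid of three minority samples.

    Assumption: any three distinct minority samples form the triangle. The
    paper restricts this to boundary samples, which this sketch does not detect.
    """
    if rng is None:
        rng = random.Random(0)
    if len(minority) < 3:
        raise ValueError("need at least three minority samples")
    synthetic = []
    for _ in range(n_new):
        a, b, c = rng.sample(minority, 3)
        # The centroid lies inside the triangle, so it stays within the
        # region already occupied by the minority class.
        synthetic.append(tuple((x + y + z) / 3.0 for x, y, z in zip(a, b, c)))
    return synthetic


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def member_weight(tp, tn, fp, fn, age, decay=0.1):
    """Hypothetical weight: MCC clipped at zero, decayed by age in blocks."""
    return max(0.0, mcc(tp, tn, fp, fn)) * math.exp(-decay * age)


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight across members."""
    totals = {}
    for label, w in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + w
    return max(totals, key=totals.get)
```

    In this sketch an older ensemble member, or one whose MCC on the latest block is low, contributes less to the vote, which is one plausible way for a block-based ensemble to follow concept drift.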

Cite this article

ZHANG Tuyi, LIU Sanmin, CHEN Yanfei, YU Wentao, ZHU Jian. Research on Resampling Ensemble Classification Method for Imbalanced Data Streams[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2025, 42(3): 34-43

Published online: 2025-05-14