基于局部密度离群点检测k-means算法

Authors:

LIU Feng, DAI Jia-jia, HU Yang
School of Mathematics and Statistics, Guizhou University, Guiyang 550025, China

Abstract:

Clustering results are easily distorted by outliers. To address this problem, a k-means algorithm based on local density outlier detection is proposed: outliers are first detected with a local density outlier detection method and removed from the data set, and k-means clustering is then performed on the remaining data. The effectiveness of the algorithm is evaluated with the Davies-Bouldin (DB) index, the Dunn index and the Silhouette index, and verified on artificially generated data sets and UCI data sets, where k-means applied after outlier removal yields better clustering results than k-means applied to the original data. The method is also applied to COVID-19 epidemic data: a cluster analysis is carried out on the number of confirmed COVID-19 cases on February 18, 2020 in 24 provinces, municipalities and autonomous regions, including Anhui, Beijing, Fujian and Guangdong. Here too, k-means after outlier removal gives better clustering results than k-means on the original data set. These results can support practical decision-making and help reduce economic losses.

Key words: k-means; outliers; LOF; evaluation index
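
The two-stage procedure described in the abstract (detect outliers by local density, remove them, then run k-means) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `lof_scores`, `kmeans` and `lof_kmeans`, the neighbourhood size, and the LOF cut-off of 1.5 are assumptions made for the example. The LOF computation is a simplified brute-force version that takes exactly k neighbours (ties broken by index) and assumes the data contain no duplicate points.

```python
import math
import random

def lof_scores(points, k):
    """Simplified brute-force Local Outlier Factor (assumes no duplicate points)."""
    n = len(points)
    neighbours, k_dist = [], []
    for i in range(n):
        # Indices of all other points, sorted by distance to point i.
        order = sorted((j for j in range(n) if j != i),
                       key=lambda j: math.dist(points[i], points[j]))
        neighbours.append(order[:k])
        k_dist.append(math.dist(points[i], points[order[k - 1]]))
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], math.dist(points[i], points[j]))
                 for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm with random initial centres drawn from the data."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, labels

def lof_kmeans(points, n_clusters, k_neighbours=5, threshold=1.5):
    """Drop points whose LOF exceeds `threshold` (assumed cut-off), then cluster."""
    scores = lof_scores(points, k_neighbours)
    inliers = [p for p, s in zip(points, scores) if s <= threshold]
    centers, labels = kmeans(inliers, n_clusters)
    return inliers, centers, labels

# Toy data: two 3x3 grids plus one point far from both.
grid = [(x, y) for x in range(3) for y in range(3)]
data = grid + [(x + 10, y + 10) for x, y in grid] + [(6, 30)]
inliers, centers, labels = lof_kmeans(data, n_clusters=2, k_neighbours=3)
print(len(data) - len(inliers), "point(s) removed; centres:", sorted(centers))
```

In this toy example the isolated point receives a LOF far above the cut-off and is excluded before clustering, so it cannot pull a cluster centre away from the grids. Comparing the result with k-means on the unfiltered data, for instance with the Davies-Bouldin or Silhouette index, would follow the evaluation outlined in the abstract.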

Cite this article

LIU Feng, DAI Jia-jia, HU Yang. The k-means Algorithm Based on Local Density Outlier Detection[J]. Journal of Chongqing Technology and Business University (Natural Science Edition), 2021, 38(4): 30-35

Online publication date: 2021-07-13
