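Abstract: |
To control air pollution effectively, a city's air quality grade is predicted from the values of individual air pollution indicators and the dominant factors are identified; for this purpose, a random-forest-based method for classifying and predicting air quality grades is proposed. The random forest model directly provides importance scores for the air quality indicators, which makes it easy to find the most important influencing factors. A comparison of different data mining methods shows that random forest has the highest classification accuracy, so the model can be widely applied to air quality prediction. Test-set results show that the random forest method is not easily affected by noise and has a low generalization error. The importance scores identify fine particulate matter and inhalable particulate matter as the two most important factors, and targeted suggestions for improving air quality are given with Baoding City as an example. |
Keywords: random forest; air quality; classification prediction; PM2.5 |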
DOI: |
Classification number: |
Funding: |
|
Air Quality Classification Prediction Based on Random Forest Model |
MENG Qian
|
Abstract: |
To support effective air pollution control, a random-forest-based classification method is proposed that predicts a city's air quality grade from the values of individual pollution indicators and identifies the factors that play a leading role. The random forest model directly provides importance scores for the air quality indicators, so the most influential factors can be found. A comparison of different data mining methods shows that random forest achieves the highest classification accuracy, so the model can be widely applied to air quality prediction. Test-set results indicate that the random forest method is robust to noise and has a low generalization error. The importance scores show that fine particulate matter (PM2.5) and inhalable particulate matter (PM10) are the two most important factors. Taking Baoding City as an example, targeted suggestions for improving air quality are provided. |
Key words: random forest; air quality; classification prediction; PM2.5 |
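
The workflow summarized in the abstract (train a random forest on pollutant indicator values, predict the air quality grade, then rank the indicators by importance score) can be illustrated with a minimal sketch. It is not the author's implementation, and it makes several assumptions: the six indicator names are assumed, the data are synthetic, and the grade labels are generated from the first two columns by construction, so the resulting ranking is illustrative only and does not reproduce the paper's result. The forest is deliberately simplified: each tree is grown on a bootstrap sample, considers a random subset of features and a few random candidate thresholds at each node, and importance is measured as the total Gini impurity decrease.

```python
import random
from collections import Counter

FEATURES = ["PM2.5", "PM10", "SO2", "NO2", "CO", "O3"]  # assumed indicator names

def gini(labels):
    """Gini impurity of a sequence of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feat_ids, rng, n_thresholds=10):
    """Best (gain, feature, threshold) among a few random candidate thresholds."""
    parent, best = gini(labels), None
    for f in feat_ids:
        values = sorted({r[f] for r in rows})[:-1]  # drop the max so both sides are non-empty
        for t in rng.sample(values, min(n_thresholds, len(values))):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or parent - child > best[0]:
                best = (parent - child, f, t)
    return best

def build_tree(rows, labels, depth, n_try, rng, importance):
    """Grow one tree; leaves are class labels, inner nodes are tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == 0 or len(set(labels)) == 1:
        return majority
    feat_ids = rng.sample(range(len(rows[0])), n_try)  # random feature subset
    split = best_split(rows, labels, feat_ids, rng)
    if split is None or split[0] <= 0:
        return majority
    gain, f, t = split
    importance[f] += gain * len(labels)  # impurity decrease, weighted by node size
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    l_rows, l_labels = zip(*left)
    r_rows, r_labels = zip(*right)
    return (f, t,
            build_tree(l_rows, l_labels, depth - 1, n_try, rng, importance),
            build_tree(r_rows, r_labels, depth - 1, n_try, rng, importance))

def fit_forest(rows, labels, n_trees=30, depth=5, n_try=3, seed=0):
    """Train the forest; return the trees and normalized importance scores."""
    rng = random.Random(seed)
    importance = [0.0] * len(rows[0])
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        trees.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                depth, n_try, rng, importance))
    total = sum(importance)
    return trees, [v / total for v in importance]

def predict(trees, row):
    """Majority vote over all trees."""
    votes = []
    for node in trees:
        while isinstance(node, tuple):
            f, t, left, right = node
            node = left if row[f] <= t else right
        votes.append(node)
    return Counter(votes).most_common(1)[0][0]

def make_data(n, rng):
    """Synthetic data: the grade (0, 1, 2) depends only on the first two columns."""
    rows = [[rng.random() for _ in FEATURES] for _ in range(n)]
    labels = [min(2, int(3 * (0.6 * r[0] + 0.4 * r[1]))) for r in rows]
    return rows, labels

rng = random.Random(42)
train_X, train_y = make_data(300, rng)
test_X, test_y = make_data(100, rng)
trees, importance = fit_forest(train_X, train_y)
accuracy = sum(predict(trees, r) == y for r, y in zip(test_X, test_y)) / len(test_y)
ranking = sorted(zip(FEATURES, importance), key=lambda p: -p[1])
print(f"test accuracy: {accuracy:.2f}")
for name, score in ranking:
    print(f"{name:>6}: {score:.3f}")
```

With real monitoring data, the synthetic `make_data` step would be replaced by measured indicator values and officially assigned grades, and a tested library implementation of random forest would be preferable to this hand-written one.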