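Abstract: The traditional kernel fuzzy C-means (KFCM) clustering algorithm cannot overcome the influence of noisy and boundary data and is sensitive to the initial cluster centers. To address these shortcomings, a KFCM algorithm that combines sample density with the maximum between-class variance method is proposed. The algorithm introduces the sample distribution density into the traditional KFCM algorithm as a weight, which overcomes the influence of noise and boundary data on the cluster centers, improves the clustering of the samples, and also allows the contribution of each sample to the clustering to be analyzed. In addition, the maximum between-class variance method is used to segment the sample density, yielding the center point of each class; these points serve as the initial cluster centers of the KFCM algorithm, which overcomes the sensitivity of the traditional algorithm to initial values. Tests on a variety of real data sets all show the good performance of the new algorithm. Finally, the new algorithm is applied to bearing fault diagnosis, and the experimental results show that its diagnosis rate is better than that of traditional clustering algorithms.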
ABSTRACT: The traditional Kernel Fuzzy C-Means (KFCM) algorithm is sensitive to noise and boundary samples (outliers) in the data set and to the choice of initial cluster centers. To address both problems, this paper proposes a KFCM algorithm that combines the distribution density around the samples with the maximum between-class variance method (Otsu's method). The distribution density around each sample is used as a weight in the KFCM algorithm, which reduces the influence of noise and outliers on the cluster centers and also makes it possible to analyze how much each sample contributes to the clustering. The maximum between-class variance method is applied to segment the sample density vector, and the center points obtained from this segmentation are used as the initial cluster centers of the proposed algorithm, which removes the sensitivity of KFCM to initial values. Experimental results on several real data sets illustrate the effectiveness of the proposed algorithm. Finally, the proposed method is applied to bearing fault diagnosis, where it achieves a higher diagnosis rate than traditional clustering methods.
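The two ideas summarized in the abstract (density weights and density-based initialization) can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so the choices below are assumptions made only for illustration: a Gaussian kernel with width `sigma`, a density defined as the mean kernel similarity of a sample to all samples, and a greedy farthest-point pick among the high-density samples that survive the Otsu threshold. The function names (`sample_density`, `otsu_threshold`, `init_centers`, `weighted_kfcm`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def gaussian_kernel(X, V, sigma):
    # K[i, j] = exp(-||x_j - v_i||^2 / (2 sigma^2)); shape (len(V), len(X))
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def sample_density(X, sigma):
    # ASSUMPTION: density = mean Gaussian similarity to all samples, scaled to (0, 1].
    rho = gaussian_kernel(X, X, sigma).mean(axis=0)
    return rho / rho.max()

def otsu_threshold(values, bins=64):
    # Threshold that maximizes the between-class variance of a 1-D histogram.
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    mids = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(p)                 # probability mass of the lower class
    w1 = 1.0 - w0                     # probability mass of the upper class
    mu = np.cumsum(p * mids)          # cumulative first moment
    valid = (w0 > 1e-12) & (w1 > 1e-12)
    var_b = np.zeros(bins)
    var_b[valid] = (mu[-1] * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[int(np.argmax(var_b)) + 1]

def init_centers(X, rho, c):
    # ASSUMPTION: keep samples above the Otsu threshold on density, then pick
    # c of them greedily (densest first, then farthest from those already chosen).
    mask = rho >= otsu_threshold(rho)
    if mask.sum() < c:                # fallback: use every sample as a candidate
        mask = np.ones(len(X), dtype=bool)
    cand, cand_rho = X[mask], rho[mask]
    centers = [cand[np.argmax(cand_rho)]]
    while len(centers) < c:
        diff = cand[:, None, :] - np.asarray(centers)[None, :, :]
        d2 = (diff ** 2).sum(axis=2).min(axis=1)
        centers.append(cand[np.argmax(d2)])
    return np.asarray(centers, dtype=float)

def weighted_kfcm(X, c, sigma=1.0, m=2.0, iters=100, tol=1e-6):
    # Density-weighted KFCM with a Gaussian kernel (illustrative update rules).
    w = sample_density(X, sigma)
    V = init_centers(X, w, c)
    for _ in range(iters):
        K = gaussian_kernel(X, V, sigma)
        d = np.maximum(1.0 - K, 1e-12)          # kernel-induced distance
        U = d ** (-1.0 / (m - 1.0))
        U /= U.sum(axis=0, keepdims=True)       # memberships of each sample sum to 1
        G = w[None, :] * U ** m * K             # density weight enters the center update
        V_new = (G @ X) / G.sum(axis=1, keepdims=True)
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    return U, V, w
```

In this sketch an isolated sample receives a small weight `w[j]`, so it contributes little to any cluster center, and the returned weights can be inspected to compare the contribution of individual samples. Whether these update rules match the paper's exact formulation is not established by the abstract.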