Diagnostic Noise Label Correction Based on Improved Stacked Auto-Encoder


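Abstract: In data-driven fault diagnosis, correctly labeled samples underpin diagnostic accuracy, but training samples are often corrupted by wrong labels, for example through manual labeling errors. To address this, a label correction method based on an improved stacked auto-encoder is proposed. The method uses a stacked auto-encoder and an isolation forest to assign pseudo-labels to samples and adjusts how much attention the encoder pays to each sample, so that the encoder focuses on correctly labeled samples. To account for bias in the data distribution, random-forest-based cross-validation is used to obtain the information entropy of each sample, which is then used to correct the labels. Experiments on gear and bearing data show that, across several wrong-label ratios, the method lowers the proportion of wrong labels in the samples, corrects wrong labels accurately, and improves fault diagnosis accuracy.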
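The pseudo-labeling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the anomaly scores are taken as given (in the paper's setting they would come from an isolation forest applied to encoder features), and the function names, the `suspect_fraction` cutoff, and the `suspect_weight` value are all hypothetical choices made for illustration. The sketch only shows how down-weighting suspect samples reduces their influence on a reconstruction loss.

```python
from typing import List, Sequence


def pseudo_labels(anomaly_scores: Sequence[float], suspect_fraction: float) -> List[int]:
    """Mark the most anomalous fraction of samples as suspect (0), the rest as trusted (1).

    Assumption: a higher score means a more anomalous sample. The scores are
    inputs here; producing them with an isolation forest is outside this sketch.
    """
    n = len(anomaly_scores)
    n_suspect = int(round(suspect_fraction * n))
    # Sample indices ordered from most to least anomalous.
    order = sorted(range(n), key=lambda i: anomaly_scores[i], reverse=True)
    labels = [1] * n
    for i in order[:n_suspect]:
        labels[i] = 0
    return labels


def weighted_reconstruction_loss(
    inputs: Sequence[Sequence[float]],
    reconstructions: Sequence[Sequence[float]],
    labels: Sequence[int],
    suspect_weight: float = 0.1,
) -> float:
    """Weighted mean squared error: trusted samples weigh 1.0, suspect ones suspect_weight."""
    total, weight_sum = 0.0, 0.0
    for x, x_hat, lab in zip(inputs, reconstructions, labels):
        w = 1.0 if lab == 1 else suspect_weight
        mse = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)
        total += w * mse
        weight_sum += w
    return total / weight_sum


# Hypothetical toy data: the second sample is the outlier and is poorly reconstructed.
scores = [0.2, 0.9, 0.1, 0.3]
labels = pseudo_labels(scores, suspect_fraction=0.25)
inputs = [[1.0, 2.0], [0.0, 0.0], [3.0, 1.0], [2.0, 2.0]]
recons = [[1.0, 2.0], [4.0, 4.0], [3.0, 1.0], [2.0, 2.0]]
loss_weighted = weighted_reconstruction_loss(inputs, recons, labels)
loss_uniform = weighted_reconstruction_loss(inputs, recons, labels, suspect_weight=1.0)
```

With the suspect sample down-weighted, its large reconstruction error contributes far less to the loss than under uniform weighting, which is one plausible reading of "making the encoder focus on correct samples".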

Authors: ZHANG Xu1, HUANG Yixiang1, ZHANG Xuan1, XIAO Dengyu1, LIU Chengliang1, LI Huaiyang2, ZHU Tao2

Keywords: wrong labels, stacked auto-encoder, isolation forest, information entropy, gear fault

Abstract: In data-driven fault diagnosis, correctly labeled samples are the guarantee of diagnostic accuracy, but for unavoidable reasons training samples are often disturbed by noise labels. In response to this problem, a noise label correction method based on an improved Stacked Auto-Encoder is proposed. The method assigns pseudo-labels to samples through a Stacked Auto-Encoder and an Isolation Forest and adjusts the degree of attention the Stacked Auto-Encoder pays to each sample, thereby making it focus on the correct samples. Considering the deviation caused by the data distribution, cross-validation based on random forest is used to obtain the information entropy of each sample and correct its label. Gear and bearing experiments show that, under multiple noise ratios, the method reduces the noise label rate of the samples, corrects noise labels accurately, and improves the accuracy of fault classification.
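The entropy-based correction step can be sketched as follows. This is an illustration under stated assumptions, not the rule used in the paper: the per-sample class probabilities are taken as given (they could, for example, be out-of-fold predictions from a random forest, such as those returned by scikit-learn's `cross_val_predict` with `method="predict_proba"`), and the relabeling rule, the `entropy_threshold` value, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def shannon_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats of a discrete distribution; 0 * log(0) is treated as 0."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def correct_labels(
    labels: Sequence[int],
    class_probs: Sequence[Sequence[float]],
    entropy_threshold: float,
) -> List[int]:
    """Relabel a sample only when the classifier disagrees with its label AND is confident.

    Assumed rule: low entropy of the predicted class distribution is read as
    high confidence; uncertain disagreements keep their original label.
    """
    corrected = []
    for y, probs in zip(labels, class_probs):
        predicted = max(range(len(probs)), key=lambda k: probs[k])
        if predicted != y and shannon_entropy(probs) < entropy_threshold:
            corrected.append(predicted)
        else:
            corrected.append(y)
    return corrected


# Hypothetical cross-validated class probabilities for four samples, three classes.
given = [0, 1, 2, 0]
probs = [
    [0.90, 0.05, 0.05],  # agrees with the given label 0
    [0.95, 0.03, 0.02],  # confident disagreement with label 1
    [0.40, 0.35, 0.25],  # disagrees with label 2, but the distribution is flat
    [0.34, 0.33, 0.33],  # agrees with the given label 0
]
fixed = correct_labels(given, probs, entropy_threshold=0.5)
```

Only the second sample is relabeled here: its predicted distribution is sharply peaked on a class other than the given one, while the third sample's disagreement is too uncertain to act on under the assumed threshold.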

Key words: noise label, Stacked Auto-Encoder, Isolation Forest, entropy, gear fault

Received: 2020-07-27; Published: 2022-01-15

Cite this article:

ZHANG Xu, HUANG Yixiang, ZHANG Xuan, XIAO Dengyu, LIU Chengliang, LI Huaiyang, ZHU Tao. Diagnostic noise label correction based on improved stacked auto-encoder[J]. Journal of Vibration and Shock, 2022, 41(1): 78-87.

Link to this article:

http://jvs.sjtu.edu.cn/CN/ or http://jvs.sjtu.edu.cn/CN/Y2022/V41/I1/78
