Sound source feature defuzzification method based on end-to-end deep learning

Journal of Vibration and Shock (振动与冲击), 2023, Vol. 42, Issue 21: 133-141.

冯罗一，昝鸣，徐中明，张志飞，李贞贞

作者信息 +

Sound source feature defuzzification method based on end-to-end deep learning

FENG Luoyi, ZAN Ming, XU Zhongming, ZHANG Zhifei, LI Zhenzhen

Author information +

文章历史 +

Abstract

Deep-learning-based grid-free sound source identification removes the constraint of grid discretization and offers high accuracy and fast prediction. When the conventional beamforming map (CB Map) is used to extract sound source location features, its imaging performance degrades as the number of microphones decreases, which in turn reduces the accuracy with which the deep learning model predicts source locations. To improve the generality of the Deep Learning Grid-free Method (DL-GFM) so that it performs well with arrays of fewer microphones, this paper proposes an Array Converted Method (ACM) that uses the end-to-end deep learning model U-Net to sharpen the CB Map. First, a U-Net model is trained with 18-channel array CB Maps as input and 64-channel array CB Maps as target. A trained Residual Network (ResNet) is then used as the prediction model of DL-GFM for grid-free identification of source coordinates. Simulation results show that ACM effectively suppresses sidelobes and narrows the main lobe, and that it remains effective for source counts not seen in training within the range of 1 to 8 sources. For the three-source case, ACM improves the accuracy of DL-GFM across the whole frequency band. Finally, experiments with 1, 2 and 3 sound sources verify the effectiveness and feasibility of the proposed method.
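The dependence of CB Map quality on microphone count can be illustrated with a minimal sketch. It is not the authors' simulation setup: it assumes a one-dimensional line array with a fixed 0.05 m spacing (so fewer microphones also means a smaller aperture), a single unit monopole at 1 m distance, a frequency of 2 kHz, and a phase-only delay-and-sum steering vector. The function names `cb_map_1d` and `main_lobe_width` are hypothetical and chosen only for this illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def cb_map_1d(num_mics, spacing, freq, source_x, distance, grid):
    """Normalized delay-and-sum beamforming power along a 1-D scan line.

    Assumption: line array centered at x = 0, one unit monopole at
    (source_x, distance), phase-only steering vectors.
    """
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND
    half = (num_mics - 1) / 2.0
    mic_x = [(m - half) * spacing for m in range(num_mics)]

    # Simulated complex pressures at the microphones (free-field monopole).
    pressures = []
    for mx in mic_x:
        r = math.hypot(mx - source_x, distance)
        pressures.append(cmath.exp(-1j * k * r) / r)

    power = []
    for gx in grid:
        acc = 0j
        for mx, p in zip(mic_x, pressures):
            r = math.hypot(mx - gx, distance)
            # Compensate the propagation phase for the assumed scan point.
            acc += cmath.exp(1j * k * r) * p
        power.append(abs(acc / num_mics) ** 2)

    peak = max(power)
    return [v / peak for v in power]


def main_lobe_width(power, grid, level=0.5):
    """Width of the contiguous region around the peak at or above `level`."""
    peak_idx = max(range(len(power)), key=power.__getitem__)
    lo = hi = peak_idx
    while lo > 0 and power[lo - 1] >= level:
        lo -= 1
    while hi < len(power) - 1 and power[hi + 1] >= level:
        hi += 1
    return grid[hi] - grid[lo]


# Scan line from -0.5 m to 0.5 m in 5 mm steps.
grid = [-0.5 + 0.005 * i for i in range(201)]
map_18 = cb_map_1d(18, 0.05, 2000.0, 0.0, 1.0, grid)
map_64 = cb_map_1d(64, 0.05, 2000.0, 0.0, 1.0, grid)
width_18 = main_lobe_width(map_18, grid)
width_64 = main_lobe_width(map_64, grid)
print(f"-3 dB main lobe width, 18 mics: {width_18:.3f} m")
print(f"-3 dB main lobe width, 64 mics: {width_64:.3f} m")
```

Under these assumptions the 18-microphone map shows a wider main lobe than the 64-microphone map. Pairs of such maps for the same source are the kind of input and target on which an image-to-image model like U-Net could be trained; the planar array geometries and the training procedure used in the paper are not reproduced here.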

Cite this article

FENG Luoyi, ZAN Ming, XU Zhongming, ZHANG Zhifei, LI Zhenzhen. Sound source feature defuzzification method based on end-to-end deep learning[J]. Journal of Vibration and Shock, 2023, 42(21): 133-141.
