Abstract: A hierarchical-communication parallel algorithm for structural transient analysis is proposed, tailored to the architecture of domestic heterogeneous multi-core processors. Its purpose is to improve the parallel efficiency of transient analysis of entire large structures on domestic heterogeneous multi-core, distributed-memory parallel computers. Building on hierarchical communication and the Newmark-HHT algorithm, a parallel computing system for large-scale transient analysis was established. The system raises the memory access rate by distributing the storage of large volumes of data, and raises the communication rate through two-layer parallelization of the computational procedure. By exploiting the architecture of these machines in this way, it improves the parallel efficiency of large-scale transient analysis. Typical numerical experiments were used to validate the correctness and efficiency of the proposed method. Finally, a parallel transient analysis of a high-rise building with more than ten million DOFs was performed on tens of thousands of processor cores.
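For readers unfamiliar with the time integrator named in the abstract, the following is a minimal serial sketch of the standard HHT-alpha scheme (Newmark updates with a weighted equilibrium equation) for a single-degree-of-freedom oscillator. It is an illustration under stated assumptions, not the formulation used in this work: the function name `hht_alpha_sdof`, its parameters, and the default `alpha = -0.05` are hypothetical, and no parallelism or hierarchical communication is shown.

```python
import math


def hht_alpha_sdof(m, c, k, force, d0, v0, dt, n_steps, alpha=-0.05):
    """Integrate m*a + c*v + k*d = force(t) with the HHT-alpha scheme.

    Illustrative single-DOF sketch; returns the displacement history
    (n_steps + 1 values, including the initial displacement).
    """
    if not -1.0 / 3.0 <= alpha <= 0.0:
        raise ValueError("alpha must lie in [-1/3, 0]")

    # Standard HHT-alpha choice of Newmark parameters.
    beta = (1.0 - alpha) ** 2 / 4.0
    gamma = 0.5 - alpha

    d, v = d0, v0
    # Initial acceleration from equilibrium at t = 0.
    a = (force(0.0) - c * v - k * d) / m

    # Effective mass is constant for a linear system with fixed dt.
    m_eff = m + (1.0 + alpha) * (gamma * dt * c + beta * dt * dt * k)

    history = [d]
    for n in range(n_steps):
        t_n = n * dt
        # Load is evaluated at the weighted time (1+alpha)*t_{n+1} - alpha*t_n.
        t_eval = t_n + (1.0 + alpha) * dt

        # Newmark predictors (the parts that do not depend on a_{n+1}).
        d_pred = d + dt * v + dt * dt * (0.5 - beta) * a
        v_pred = v + dt * (1.0 - gamma) * a

        # Weighted equilibrium solved for the new acceleration.
        rhs = (force(t_eval)
               - (1.0 + alpha) * (c * v_pred + k * d_pred)
               + alpha * (c * v + k * d))
        a_new = rhs / m_eff

        # Newmark correctors.
        d = d_pred + beta * dt * dt * a_new
        v = v_pred + gamma * dt * a_new
        a = a_new
        history.append(d)
    return history


# Usage: free vibration of an undamped oscillator with a 1 s period.
omega = 2.0 * math.pi
displacements = hht_alpha_sdof(1.0, 0.0, omega ** 2, lambda t: 0.0,
                               d0=1.0, v0=0.0, dt=0.001, n_steps=1000)
print(displacements[-1])
```

Setting `alpha = 0` recovers the undamped average-acceleration Newmark method; negative values of `alpha` add numerical damping that acts mainly on high-frequency modes. In a multi-DOF finite element setting the scalar division by `m_eff` becomes a linear solve with the effective stiffness matrix, which is the step a parallel solver would distribute.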