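Parallel finite element algorithm for structural transient analysis on the Sunway heterogeneous many-core processor architecture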

YU Gaoyuan1,2, LOU Yunfeng1,2,3, LI Junjie1,2, JIN Xianlong1,2

Journal of Vibration and Shock, 2023, Vol. 42, Issue 6: 152-158.


Abstract

Based on the architectural characteristics of China's domestically developed Sunway heterogeneous many-core, distributed-memory computers, this paper proposes a hierarchical parallel finite element method for structural transient analysis. The method is aimed at improving the efficiency of parallel transient solutions for large and very large complex structural systems on such machines. Building on hierarchical communication and the Newmark-HHT algorithm, it establishes a parallel transient solution framework for large-scale complex structural systems. The framework stores the large volume of data produced during computation in a distributed manner, which significantly improves memory access efficiency, and it parallelizes the computation at two levels, which effectively improves communication efficiency. It can therefore exploit the architecture of Sunway heterogeneous many-core, distributed-memory parallel computers to raise the efficiency of large-scale parallel structural transient computation. Typical numerical examples verify the correctness and effectiveness of the method. The method is then applied to a high-rise building, achieving a parallel structural transient analysis with more than ten million degrees of freedom on tens of thousands of cores.
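The abstract names the Newmark-HHT algorithm as the time integrator. As a minimal illustration of that family of schemes, the sketch below applies the standard HHT-alpha method to a single-degree-of-freedom linear oscillator in pure Python. It is not the paper's implementation: the function name `hht_alpha_sdof`, its parameters, and the serial single-DOF setting are assumptions made for illustration, and the hierarchical communication and two-level parallelism described above are not shown.

```python
def hht_alpha_sdof(m, c, k, force, u0, v0, dt, n_steps, alpha=-0.05):
    """Integrate m*u'' + c*u' + k*u = force(t) with the HHT-alpha scheme.

    Hypothetical helper for illustration only. `force` is a callable of time.
    Returns three lists: displacements, velocities, accelerations.
    """
    if not -1.0 / 3.0 <= alpha <= 0.0:
        raise ValueError("alpha must lie in [-1/3, 0]")

    # Standard HHT-alpha choice of the Newmark parameters.
    beta = (1.0 - alpha) ** 2 / 4.0
    gamma = 0.5 - alpha

    u, v = u0, v0
    # Initial acceleration from the equation of motion at t = 0.
    a = (force(0.0) - c * v - k * u) / m

    # Effective mass; constant because the system is linear and dt is fixed.
    m_eff = m + (1.0 + alpha) * (gamma * dt * c + beta * dt * dt * k)

    us, vs, accs = [u], [v], [a]
    for n in range(n_steps):
        t_old, t_new = n * dt, (n + 1) * dt

        # Newmark predictors (the parts that do not depend on a_{n+1}).
        u_pred = u + dt * v + dt * dt * (0.5 - beta) * a
        v_pred = v + dt * (1.0 - gamma) * a

        # HHT-alpha balance: internal and external forces are weighted
        # between step n and step n+1 by alpha.
        rhs = ((1.0 + alpha) * force(t_new) - alpha * force(t_old)
               - (1.0 + alpha) * (c * v_pred + k * u_pred)
               + alpha * (c * v + k * u))
        a_new = rhs / m_eff

        # Newmark correctors.
        u = u_pred + beta * dt * dt * a_new
        v = v_pred + gamma * dt * a_new
        a = a_new

        us.append(u)
        vs.append(v)
        accs.append(a)
    return us, vs, accs


if __name__ == "__main__":
    # Free vibration of an undamped unit oscillator, started at u = 1.
    us, vs, accs = hht_alpha_sdof(1.0, 0.0, 1.0, lambda t: 0.0,
                                  1.0, 0.0, 0.01, 628)
    print(us[-1])
```

With `alpha=0.0` the scheme reduces to the average-acceleration Newmark method. Negative values of `alpha` add numerical damping that mainly affects high-frequency modes. In a finite element setting the scalar division by `m_eff` would become a linear solve with the effective matrix at every step, which is presumably the step a parallel method has to distribute.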

Key words

Heterogeneous many-core / Distributed memory / Hierarchical communication / Large-scale transient analysis / Parallel computing

Cite this article

YU Gaoyuan, LOU Yunfeng, LI Junjie, JIN Xianlong. Parallel algorithms for structure transient analysis based on heterogeneous multi-core processor architecture[J]. Journal of Vibration and Shock, 2023, 42(6): 152-158

