| Views: 2839 | Replies: 3 |
[Help]
cl-NEB calculation fails on "Tianhe-2"
I am running a cl-NEB calculation on "Tianhe-2". My first job, with IMAGES set to 1, finished without problems. After I changed IMAGES to 5, the job ran two steps and then exited with an error in under a minute. I suspect this is related to the number of cores I am running on: I have read posts on 小木虫 saying that the number of CPU cores must be an integer multiple of IMAGES (one "Tianhe-2" node has 24 cores). So I tried one node with 20 cores, and also two nodes with 40 cores, but the job still exits with the same error. Below is the content of slurm-174172.out. Every submission gives the same error message. Any advice would be much appreciated.

M_divide: can not subdivide 24 nodes by 5
(the line above is repeated many times in the output)
Fatal error in MPI_Topo_test: Invalid communicator, error stack:
MPI_Topo_test(125): MPI_Topo_test(MPI_COMM_NULL, topo_type=0x7fff3b557f04) failed
MPI_Topo_test(76).: Null communicator
Fatal error in MPI_Topo_test: Invalid communicator, error stack:
MPI_Topo_test(125): MPI_Topo_test(MPI_COMM_NULL, topo_type=0x7fffc1f20c84) failed
MPI_Topo_test(76).: Null communicator
Fatal error in MPI_Topo_test: Invalid communicator, error stack:
MPI_Topo_test(125): MPI_Topo_test(MPI_COMM_NULL, topo_type=0x7fff7ae64b04) failed
MPI_Topo_test(76).: Null communicator
Fatal error in MPI_Topo_test: Invalid communicator, error stack:
MPI_Topo_test(125): MPI_Topo_test(MPI_COMM_NULL, topo_type=0x7fffe7806a04) failed
MPI_Topo_test(76).: Null communicator
yhrun: error: cn7699: tasks 20-23: Exited with exit code 1
yhrun: First task exited 60s ago
yhrun: tasks 0-19: running
yhrun: tasks 20-23: exited abnormally
yhrun: Terminating job step 174172.0
slurmd[cn7699]: *** STEP 174172.0 KILLED AT 2016-04-01T20:45:58 WITH SIGNAL 9 ***
yhrun: Job step aborted: Waiting up to 2 seconds for job step to finish.
slurmd[cn7699]: *** STEP 174172.0 KILLED AT 2016-04-01T20:45:58 WITH SIGNAL 9 ***
yhrun: error: cn7699: tasks 0-19: Killed
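The divisibility rule mentioned in the post can be checked with a few lines of Python. This is only a sketch of the arithmetic implied by the `M_divide` message, not part of the NEB code itself; the function names `can_split` and `tasks_per_image` are hypothetical. One thing worth noting: the log reports 24 "nodes" and lists tasks 0-23, which suggests the job step was still launched with 24 MPI tasks rather than the intended 20 or 40. That is an inference from the log above, not something confirmed in this thread.

```python
def can_split(total_tasks: int, images: int) -> bool:
    """Return True if the MPI tasks divide evenly among the images."""
    return images > 0 and total_tasks % images == 0


def tasks_per_image(total_tasks: int, images: int) -> int:
    """Number of MPI tasks each image would get; raises if not divisible."""
    if not can_split(total_tasks, images):
        raise ValueError(
            f"can not subdivide {total_tasks} tasks by {images} images"
        )
    return total_tasks // images


if __name__ == "__main__":
    images = 5  # IMAGES value from the post
    # 24 is the task count reported in the log; 20 and 40 are the
    # core counts the poster intended to use.
    for total in (24, 20, 40):
        if can_split(total, images):
            print(f"{total} tasks / {images} images -> "
                  f"{tasks_per_image(total, images)} tasks per image")
        else:
            print(f"{total} tasks cannot be split evenly over {images} images")
```

With IMAGES set to 5, 20 and 40 tasks divide evenly while 24 does not, so it may be worth checking the task count the submission script actually passes to the launcher.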

andavid2007
Floor 2, 2016-04-01 22:45:05
jackjiejl
Floor 3, 2016-04-02 12:27:58
冰凌sunshine
Floor 4, 2018-01-17 11:49:01











