
Views: 1032  |  Replies: 1

04nylxb

Muchong (Regular Writer)

[Help] Could someone post a .castep output file from a successful run on a multi-node parallel machine?

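As the title says: when a CASTEP calculation finishes successfully on a multi-node cluster, it writes a .castep file. I would like to see what the output of a successful parallel CASTEP run contains.
My own parallel run is still going. I found that with fewer than 24 processors the job runs, but with more than 24 processors it fails immediately (I have 60 processors in total, 4 per node). I also find multi-node runs very slow: since this morning I have been optimizing the simplest possible C unit cell, and not a single point has come out yet. Checking with top, a few nodes do show CPU usage, and castep.exe is running.
So I would like to see what the output of a successful parallel CASTEP job looks like, and whether it reports which nodes the job is running on.
Many thanks.
Or send it to my email: 04nylxb@zju.edu.cn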

(I have 1 master node and 15 compute nodes. The machines are fairly old: each node has only 2 GB of memory, and the CPU is Intel(R) Xeon(TM) CPU 2.80GHz.)

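I tested the nodes one at a time (8 processors per run: the master node plus one compute node). In these tests MPI ran normally on every node and the calculations completed. But once all nodes are included, dmol can use 24 processors while CASTEP cannot: as soon as there are more than 24 processors, the job fails immediately. Any pointers would be appreciated.
CASTEP produces the following error output: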
This version was compiled for linux on Nov 13 2008

License checkout of MS_castep successful


Pseudo atomic calculation performed for C 2s2 2p2

Converged in 17 iterations to a total energy of -145.8146 eV

Plane wave load balancing: max      0 min      0 average      0
Error basis_count_plane_waves: need to have at least 1 plane wave on each node
Current trace stack:
basis_count_plane_waves
basis_initialise
castep
Plane wave load balancing: max      0 min      0 average      0
Error basis_count_plane_waves: need to have at least 1 plane wave on each node
Current trace stack:
basis_count_plane_waves
basis_initialise
castep
Plane wave load balancing: max      0 min      0 average      0
Error basis_count_plane_waves: need to have at least 1 plane wave on each node
Current trace stack:
basis_count_plane_waves
basis_initialise
castep
Plane wave load balancing: max      0 min      0 average      0
Error basis_count_plane_waves: need to have at least 1 plane wave on each node
Current trace stack:
basis_count_plane_waves
basis_initialise
castep
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
Plane wave load balancing: max      0 min      0 average      0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 3 pid 23954 on host master to cpu 3
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 1 pid 23952 on host master to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 11 pid 13852 on host node2 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 17 pid 22591 on host node4 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 16 pid 22590 on host node4 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 15 pid 12642 on host node3 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 13 pid 12640 on host node3 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 22 pid 10715 on host node5 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 9 pid 13850 on host node2 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 10 pid 13851 on host node2 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 8 pid 13849 on host node2 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 0 pid 23951 on host master to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 2 pid 23953 on host master to cpu 2
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 21 pid 10714 on host node5 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 23 pid 10716 on host node5 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 18 pid 22592 on host node4 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 19 pid 22593 on host node4 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 14 pid 12641 on host node3 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 12 pid 12639 on host node3 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 20 pid 10713 on host node5 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 4 pid 29981 on host node1 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 6 pid 29983 on host node1 to cpu 0
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 7 pid 29984 on host node1 to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 5 pid 29982 on host node1 to cpu 1
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
warning:regcache incompatible with malloc
MPI Application rank 3 exited before MPI_Finalize() with status 1
forrtl: severe (40): recursive I/O operation, unit 10, file unknown
Image              PC        Routine            Line        Source            
castepexe_mpi.exe  0957C173  Unknown               Unknown  Unknown
castepexe_mpi.exe  0957B793  Unknown               Unknown  Unknown
castepexe_mpi.exe  0953041A  Unknown               Unknown  Unknown
castepexe_mpi.exe  094F0CD4  Unknown               Unknown  Unknown
castepexe_mpi.exe  09525D9E  Unknown               Unknown  Unknown
castepexe_mpi.exe  080DB666  Unknown               Unknown  Unknown
castepexe_mpi.exe  094EBD37  Unknown               Unknown  Unknown
castepexe_mpi.exe  0950280A  Unknown               Unknown  Unknown
castepexe_mpi.exe  095251B5  Unknown               Unknown  Unknown
castepexe_mpi.exe  084AC8E3  Unknown               Unknown  Unknown
castepexe_mpi.exe  084B4575  Unknown               Unknown  Unknown
castepexe_mpi.exe  08F5DF3A  Unknown               Unknown  Unknown
castepexe_mpi.exe  080503A5  Unknown               Unknown  Unknown
libc.so.6          003D6E9C  Unknown               Unknown  Unknown
castepexe_mpi.exe  080502E1  Unknown               Unknown  Unknown
forrtl: severe (40): recursive I/O operation, unit 10, file unknown
Image              PC        Routine            Line        Source            
castepexe_mpi.exe  0957C173  Unknown               Unknown  Unknown
castepexe_mpi.exe  0957B793  Unknown               Unknown  Unknown
castepexe_mpi.exe  0953041A  Unknown               Unknown  Unknown
castepexe_mpi.exe  094F0CD4  Unknown               Unknown  Unknown
castepexe_mpi.exe  09525D9E  Unknown               Unknown  Unknown
castepexe_mpi.exe  080DB666  Unknown               Unknown  Unknown
castepexe_mpi.exe  094EBD37  Unknown               Unknown  Unknown
castepexe_mpi.exe  0950280A  Unknown               Unknown  Unknown
castepexe_mpi.exe  095251B5  Unknown               Unknown  Unknown
castepexe_mpi.exe  084AC8E3  Unknown               Unknown  Unknown
castepexe_mpi.exe  084B4575  Unknown               Unknown  Unknown
castepexe_mpi.exe  08F5DF3A  Unknown               Unknown  Unknown
castepexe_mpi.exe  080503A5  Unknown               Unknown  Unknown
libc.so.6          003D6E9C  Unknown               Unknown  Unknown
castepexe_mpi.exe  080502E1  Unknown               Unknown  Unknown
forrtl: severe (40): recursive I/O operation, unit 10, file unknown
Image              PC        Routine            Line        Source            
castepexe_mpi.exe  0957C173  Unknown               Unknown  Unknown
castepexe_mpi.exe  0957B793  Unknown               Unknown  Unknown
castepexe_mpi.exe  0953041A  Unknown               Unknown  Unknown
castepexe_mpi.exe  094F0CD4  Unknown               Unknown  Unknown
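
On the question of which nodes a job is running on: the MPI_CPU_AFFINITY lines in the captured output above already list rank, pid, host, and cpu for each process. A minimal Python sketch for grouping ranks by host and counting the error lines is given here. It is not part of CASTEP or Materials Studio; the function name summarize_log and the short sample string are made up for illustration, and the regular expression only assumes the line format seen in this log.

```python
import re
from collections import defaultdict

# Matches the affinity lines exactly as they appear in the log in this post.
AFFINITY_RE = re.compile(
    r"MPI_CPU_AFFINITY set to RANK, setting affinity of rank (\d+) "
    r"pid (\d+) on host (\S+) to cpu (\d+)"
)


def summarize_log(text):
    """Return (ranks_by_host, error_counts) for a captured CASTEP/MPI log.

    Hypothetical helper: groups MPI ranks by host name and counts
    lines starting with "Error " or "forrtl: severe".
    """
    ranks_by_host = defaultdict(list)
    error_counts = defaultdict(int)
    for line in text.splitlines():
        match = AFFINITY_RE.search(line)
        if match:
            ranks_by_host[match.group(3)].append(int(match.group(1)))
        elif line.startswith("Error "):
            # Key on the text before the first colon,
            # e.g. "Error basis_count_plane_waves".
            error_counts[line.split(":", 1)[0]] += 1
        elif line.startswith("forrtl: severe"):
            error_counts["forrtl severe"] += 1
    hosts = {host: sorted(ranks) for host, ranks in ranks_by_host.items()}
    return hosts, dict(error_counts)


# A few lines copied from the log above, used only as sample input.
sample = """\
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 3 pid 23954 on host master to cpu 3
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 1 pid 23952 on host master to cpu 1
MPI_CPU_AFFINITY set to RANK, setting affinity of rank 11 pid 13852 on host node2 to cpu 1
Error basis_count_plane_waves: need to have at least 1 plane wave on each node
forrtl: severe (40): recursive I/O operation, unit 10, file unknown
"""

if __name__ == "__main__":
    hosts, errors = summarize_log(sample)
    for host in sorted(hosts):
        print(host, hosts[host])
    print(errors)
```

To use it on a real run, read the saved output file into a string and pass it to summarize_log in place of sample.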

[ Last edited by 04nylxb on 2011-6-23 at 15:46 ]

04nylxb

Muchong (Regular Writer)

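As the title says, I am looking for the output file of a successful parallel CASTEP calculation, mainly to see what the .castep file from a successful parallel run should look like.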
Post #2, 2011-06-24 10:34:45