
nehcevol

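Iron Bug (newcomer)

[Help] Why does my VASP band-structure calculation abort?

I am running in parallel on two nodes. The band-structure calculation always fails at the very last step. Is it because writing the wavefunctions at the end needs a lot of memory, so the job gets killed?

The log is shown below:
CODE: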
running on    8 nodes
distr:  one band on    1 nodes,    8 groups
vasp.5.2.11 18Jan11 complex
POSCAR found type information on POSCAR  N  O  Ti  Ag
POSCAR found :  4 types and      88 ions
LDA part: xc-table for Pade appr. of Perdew
found WAVECAR, reading the header
  number of k-points has changed, file:     8 present:    77
  trying to continue reading WAVECAR, but it might fail
WARNING: stress and forces are not correct
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: small aliasing (wrap around) errors must be expected
FFT: planning ...(           1 )
reading WAVECAR
the WAVECAR file was read sucessfully
charge-density read from file: LTO                                    
entering main loop
       N       E                     dE             d eps       ncg     rms          rms(c)
DAV:   1    -0.593023659711E+03   -0.59302E+03   -0.34000E+03 84456   0.150E+02
DAV:   2    -0.696176042859E+03   -0.10315E+03   -0.10271E+03121096   0.732E+01
DAV:   3    -0.707549339083E+03   -0.11373E+02   -0.11351E+02130512   0.290E+01
DAV:   4    -0.709405593808E+03   -0.18563E+01   -0.18554E+01129992   0.110E+01
DAV:   5    -0.709801950045E+03   -0.39636E+00   -0.39623E+00130392   0.527E+00
DAV:   6    -0.709899073522E+03   -0.97123E-01   -0.97112E-01129904   0.229E+00
DAV:   7    -0.709923702852E+03   -0.24629E-01   -0.24627E-01129744   0.130E+00
DAV:   8    -0.709930226273E+03   -0.65234E-02   -0.65231E-02128248   0.553E-01
DAV:   9    -0.709931959331E+03   -0.17331E-02   -0.17330E-02121408   0.331E-01
DAV:  10    -0.709932424620E+03   -0.46529E-03   -0.46528E-03108832   0.144E-01
DAV:  11    -0.709932538124E+03   -0.11350E-03   -0.11350E-03 80032   0.920E-02
DAV:  12    -0.709932563609E+03   -0.25485E-04   -0.25484E-04 58808   0.535E-02
rank 4 in job 3  dock14_43359   caused collective abort of all ranks
  exit status of rank 4: killed by signal 9
rank 0 in job 3  dock14_43359   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9


At the end of OUTCAR, the following can be seen:
CODE:
    441      11.5540      0.00000
    442      11.5669      0.00000
    443      11.5885      0.00000
    444      11.6014      0.00000
    445      11.6189      0.00000
    446      11.6371      0.00000
    447      11.6554      0.00000
    448      11.7255      0.00000


--------------------------------------------------------------------------------------------------------


soft charge-density along one line, spin component           1
         0         1         2         3         4         5         6         7         8         9
total charge-density along one line

pseudopotential strength for first ion, spin component:           1
12.429   0.016   0.000   0.000   0.000   0.000   0.000   0.000
  0.016   0.002   0.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000  -5.870   0.004   0.000   0.251   0.001   0.000
  0.000   0.000   0.004  -5.850   0.001   0.001   0.255   0.000
  0.000   0.000   0.000   0.001  -5.870   0.000   0.000   0.250
  0.000   0.000   0.251   0.001   0.000   0.953   0.000   0.000
  0.000   0.000   0.001   0.255   0.000   0.000   0.953   0.000
  0.000   0.000   0.000   0.000   0.250   0.000   0.000   0.953
total augmentation occupancy for first ion, spin component:           1
  2.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000   1.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000   0.000   1.000   0.000   0.000   0.000   0.000
  0.000   0.000   0.000   0.000   1.000   0.000   0.000   0.000
  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000
  0.000   0.000   0.000   0.000   0.000   0.000   0.000   0.000


------------------------ aborting loop because EDIFF is reached ----------------------------------------



[ Last edited by nehcevol on 2013-1-26 at 13:21 ]

llh2010

Supreme Wood Bug (well-known writer)

[Answer] Helpful reply

Very likely. I have also had jobs killed at the final step; after moving to more nodes, the problem no longer occurred.
Post #2, 2013-01-28 09:02:14
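The memory explanation can be sanity-checked with a back-of-the-envelope estimate. The sketch below assumes that every plane-wave coefficient for all k-points and bands is held in memory as a 16-byte complex number and that the total is spread evenly over the MPI ranks; both are simplifying assumptions. The k-point, band and rank counts (77, 448, 8) come from the log and OUTCAR excerpt in the first post, while `n_planewaves` is a hypothetical placeholder (the real value would be the plane-wave count printed per k-point in OUTCAR).

```python
def wavefunction_bytes(n_kpoints, n_bands, n_planewaves, bytes_per_coeff=16):
    """Rough size of all plane-wave coefficients (assumes complex128 storage)."""
    return n_kpoints * n_bands * n_planewaves * bytes_per_coeff


def per_rank_gib(total_bytes, n_ranks):
    """Share of one MPI rank in GiB, assuming a perfectly even distribution."""
    return total_bytes / n_ranks / 2**30


if __name__ == "__main__":
    # 77 k-points, 448 bands and 8 ranks are taken from the thread;
    # 30000 plane waves per k-point is a made-up placeholder.
    total = wavefunction_bytes(n_kpoints=77, n_bands=448, n_planewaves=30000)
    print(f"total:    {total / 2**30:.1f} GiB")
    print(f"per rank: {per_rank_gib(total, n_ranks=8):.1f} GiB")
```

The estimate ignores any extra workspace needed at the final step. If the per-rank figure is close to the memory available per core, spreading the job over more nodes lowers it proportionally, which would be consistent with the experience reported in this reply.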

fzx2008

Honorary Moderator (well-known writer)

[Answer] Helpful reply

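It is most likely insufficient memory, so the EIGENVAL file was never written.

That does not really matter, though: by this point the energy eigenvalues have all been written to OUTCAR, and the band information is easy to extract from there.
Post #3, 2013-01-28 09:54:11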
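Extracting the bands from OUTCAR, as suggested in this reply, could look roughly like the sketch below. It assumes the eigenvalue section uses the usual layout of a `k-point N :` header line, a `band No.  band energies  occupation` line, and then one row per band, which matches the rows at the end of the OUTCAR excerpt in the first post. The function name is illustrative, and the sketch assumes a non-spin-polarized run (with two spin components the second block would overwrite the first).

```python
import re

KPOINT_RE = re.compile(r"^\s*k-point\s*(\d+)\s*:")
BAND_RE = re.compile(r"^\s*(\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$")


def parse_outcar_eigenvalues(lines):
    """Return {kpoint_index: [(band, energy_eV, occupation), ...]}."""
    bands = {}
    current = None
    for line in lines:
        # Skip the k-point listing that reports plane-wave counts.
        if "plane waves" in line:
            continue
        m = KPOINT_RE.match(line)
        if m:
            current = int(m.group(1))
            bands[current] = []  # a later block for the same k-point wins
            continue
        if current is None:
            continue
        m = BAND_RE.match(line)
        if m:
            bands[current].append(
                (int(m.group(1)), float(m.group(2)), float(m.group(3)))
            )
        elif line.strip() and "band No." not in line:
            current = None  # any other non-blank line ends the block
    return bands


if __name__ == "__main__":
    with open("OUTCAR") as fh:
        result = parse_outcar_eigenvalues(fh)
    for k, rows in sorted(result.items()):
        print(k, len(rows), "bands")
```

Because the eigenvalue rows are written before the job is killed, this works even when EIGENVAL is missing; the k-point coordinates for the band path still have to come from KPOINTS or from the header lines themselves.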

99098585

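Silver Bug (newcomer)

I ran into something similar, but the server's memory was barely being used. Could the parallel environment be the problem? Thanks!

LDA part: xc-table for Pade appr. of Perdew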
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: small aliasing (wrap around) errors must be expected
FFT: planning ...(           1 )
WAVECAR not read
entering main loop
       N       E                     dE             d eps       ncg     rms          rms(c)
rank 14 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 14: killed by signal 9
rank 12 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 12: killed by signal 9
rank 35 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 35: killed by signal 9
rank 29 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 29: killed by signal 9
rank 28 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 28: killed by signal 9
rank 27 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 27: killed by signal 9
rank 26 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 26: killed by signal 9
rank 25 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 25: killed by signal 11
rank 24 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 24: killed by signal 11
rank 47 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 47: killed by signal 11
rank 46 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 46: killed by signal 9
rank 45 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 45: killed by signal 9
rank 38 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 38: killed by signal 9
rank 37 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 37: killed by signal 9
rank 36 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 36: killed by signal 11
rank 11 in job 1  node1_56151   caused collective abort of all ranks
  exit status of rank 11: killed by signal 9
Post #4, 2013-01-28 23:45:25
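The log in post #4 mixes two different exit signals, which is worth separating before blaming memory. On Linux, signal 9 is SIGKILL (sent from outside the process, for example by the kernel's out-of-memory killer or a batch scheduler) and signal 11 is SIGSEGV (a segmentation fault inside the process). The sketch below tallies them from an MPI launcher log; the regular expression is written against the exact `exit status of rank N: killed by signal M` wording seen in this thread, and the function names and the `vasp.log` file name are illustrative.

```python
import re
import signal
from collections import Counter

EXIT_RE = re.compile(r"exit status of rank (\d+): killed by signal (\d+)")


def tally_signals(log_text):
    """Count how many ranks were killed by each signal number."""
    counts = Counter()
    for m in EXIT_RE.finditer(log_text):
        counts[int(m.group(2))] += 1
    return counts


def signal_name(num):
    """Map a signal number to its name on this platform, if it has one."""
    try:
        return signal.Signals(num).name
    except ValueError:
        return f"signal {num}"


if __name__ == "__main__":
    with open("vasp.log") as fh:  # hypothetical file holding the job's stdout
        counts = tally_signals(fh.read())
    for num, n in sorted(counts.items()):
        print(f"{signal_name(num)}: {n} rank(s)")
```

Ranks killed by signal 9 after another rank has already crashed may simply have been terminated by the launcher during cleanup, so the ranks reporting signal 11 are probably the more informative ones to investigate.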

西山一

New Bug (somewhat known)

Did you ever solve this problem?
Post #5, 2013-01-29 18:45:56

liutaifeng

New Bug (somewhat known)

What actually causes this problem? Is it insufficient memory, a problem with the structure, or a problem with the settings?
Post #6, 2013-01-31 14:26:58

jiaer87

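New Bug (somewhat known)

Quoted reply:
Post #4: Originally posted by 99098585 at 2013-01-28 23:45:25
I ran into something similar, but the server's memory was barely being used. Could the parallel environment be the problem? Thanks!

LDA part: xc-table for Pade appr. of Perdew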
POSCAR, INCAR and KPOINTS ok, starting setup
WARNING: small aliasing (wrap ...

Did you ever get this solved?
Post #7, 2022-07-26 10:05:17