| Views: 1415 | Replies: 5 |
| This thread has been archived. |
imation, Die-hard Woodworm (Regular Writer)
[Discussion]
Problems running CASTEP on multiple cores
Hello everyone. I have a server with dual-socket quad-core Xeon Clovertown CPUs, each core running at 2 GHz, and four 1 GB sticks of 667 memory. For the operating system I have tried Windows Server 2003 Chinese edition (SP1 and SP2), the English edition with SP1, and Windows Server 2008 English edition SP1.

I tried running the tutorial example of CO adsorption on a Pb surface. On one core it runs fine and finishes in about an hour, but on 8 cores it always fails, with all sorts of odd symptoms:

1) Eight castep processes start, but after a while the count drops to 5 by itself, and once even to 1. Memory is not released. Sometimes the job fails; sometimes it keeps running with no result in sight.
2) CPU utilization is at full load when the calculation starts, then drops to 1~3% after 1 to 5 minutes. Memory is not released, and after about 10 minutes the job fails.
3) Trying different core counts, I found that 1 to 3 cores run normally and give results; anything above 3 fails.
4) There are many other errors. The most frequent one (it always appears at the first point) is as follows:

    *Warning* max. SCF cycles performed but system has not reached the groundstate.
    Current total energy, E = -5951.893478460 eV
    Current free energy (E-TS) = -5951.990514506 eV
    (energies not corrected for finite basis set)
    NB est. 0K energy (E-0.5TS) = -5951.941996483 eV
    ****************************************************************************
    Warning: electronic minimisation did not converge when finding ground state.
    ****************************************************************************
    Writing model to 1.check
    Error in geom_get_forces - electronic_minimisation of current_cell failed
    Error in geom_get_forces - electronic_minimisation of current_cell failed
    Error in geom_get_forces - electronic_minimisation of current_cell failed
    Error in geom_get_forces - electronic_minimisation of current_cell failed
    [1] MPI Abort by user Aborting program !
    [1] Aborting program!
    [2] MPI Abort by user Aborting program !
    [2] Aborting program!
    [0] MPI Abort by user Aborting program !
    [0] Aborting program!
    forrtl: severe (47): write to READONLY file, unit 60, file D:\PROGRA~1\Accelrys\MATERI~1.1\Gateway\root_default\dsd\jobs\4GLA1\killfile
    Image              PC        Routine  Line     Source
    castepexe_mpi.exe  00AB6CC2  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00AB3F50  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00A42B9E  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00A427BB  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00A246D1  Unknown  Unknown  Unknown
    castepexe_mpi.exe  009F8E7B  Unknown  Unknown  Unknown
    castepexe_mpi.exe  009F8EAF  Unknown  Unknown  Unknown
    castepexe_mpi.exe  004FB352  Unknown  Unknown  Unknown
    castepexe_mpi.exe  004E5281  Unknown  Unknown  Unknown
    castepexe_mpi.exe  004DC8D0  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00402353  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00ABE578  Unknown  Unknown  Unknown
    castepexe_mpi.exe  00A904BB  Unknown  Unknown  Unknown
    kernel32.dll       7C82F23B  Unknown  Unknown  Unknown

I then raised maximum iterations to 1000 and Max SCF cycles to 5000 and ran on eight cores in parallel. It ran normally for two and a half days, at which point I stopped it. The output still contained some errors:

    WARNING - user ionic constraints and symmetry specified
    - symmetry has precedence over constraints
    - may lead to a conflict?
    HINT - if convergence fails try switching symmetry OFF
    BFGS: Warning - trial step suggests complex energy landscape in which simple line minimization will fail.
    - This is usually an indication that the forces/streses are not accurate enough. Consider increasing the cutoff energy and/or the electronic convergence tolerance.
    - Proceeding with a bisection search to find root instead.
    Warning: There are no empty bands for at least one kpoint and spin; this may slow the convergence and/or lead to an inaccurate groundstate. If this warning persists, you should consider increasing nextra_bands and/or reducing smearing_width in the param file. Recommend using nextra_bands of 14 to 29.
    BFGS: Warning - Repeated consecutive reset of inverse Hessian
    BFGS: without satisfying convergence criteria which
    BFGS: looks like BFGS has run out of search directions.
    BFGS: Warning - Lets try allowing some uphill steps and see if
    BFGS: we can get around this barrier.
    BFGS: Warning - It is possible that the system may now converge to
    BFGS: a stationary point OTHER than the desired minimum.
    BFGS: Hint - this may be an indication that either:
    BFGS: a) you are using a poor guess at geom_frequency_est
    BFGS: and/or geom_modulus_est, or
    BFGS: b) you are using unrealistic convergence criteria.
    BFGS: Suggest therefore that you consider changing them!

and so on; I won't list them all.

My question: why does a project that finishes in one hour on a single core either fail to run on multiple cores or run far slower than on one core? Could there be a problem with data exchange during multi-core parallel processing?

Thanks. I hope I have described this clearly.
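Because the output mixes several unrelated messages, it can help to count how often each one appears before deciding what to change. The following is only a minimal Python sketch and is not part of CASTEP or Materials Studio: the function name `tally_warnings` and the labels in `PATTERNS` are made up for illustration, and the search strings are simply substrings copied from the messages quoted in the post.

```python
from collections import Counter

# Substrings copied from the messages quoted in the post.
# The dictionary keys are illustrative labels, not CASTEP terminology.
PATTERNS = {
    "scf_not_converged": "electronic minimisation did not converge",
    "geom_forces_failed": "Error in geom_get_forces",
    "no_empty_bands": "There are no empty bands",
    "bfgs_complex_landscape": "trial step suggests complex energy landscape",
    "bfgs_hessian_reset": "Repeated consecutive reset of inverse Hessian",
    "mpi_abort": "MPI Abort by user",
}


def tally_warnings(text: str) -> Counter:
    """Count the lines of `text` that contain each known message substring."""
    counts = Counter()
    for line in text.splitlines():
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[label] += 1
    return counts


if __name__ == "__main__":
    sample = """\
Warning: electronic minimisation did not converge when finding ground state.
Error in geom_get_forces - electronic_minimisation of current_cell failed
Error in geom_get_forces - electronic_minimisation of current_cell failed
[1] MPI Abort by user Aborting program !
"""
    # Print the most frequent message types first.
    for label, n in tally_warnings(sample).most_common():
        print(f"{label}: {n}")
```

In practice the text would be read from the job's output file; the sample string here just reuses a few of the lines quoted above.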
★ ★
csfn (coins +2, VIP +0): Thanks for the active discussion :-) Come back often
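Running on multiple CPUs is more error-prone. If any one process fails, the calculation will not produce a result, yet the other processes keep running. That may be why one CPU finishes in an hour while a multi-CPU run goes on for a long time with nothing to show for it.

Also, the output from the multi-CPU run above contains many warnings:

1. HINT - if convergence fails try switching symmetry OFF
   The symmetry restriction may be hindering convergence, or constraints may have been set on the atoms, which sometimes makes convergence difficult. Try running the calculation without symmetry.
2. BFGS: Warning - trial step suggests complex energy landscape in which simple line minimization will fail.
   The plane-wave cutoff may be set too low.
3. Empty bands: there are not enough of them. If the system is metallic, that makes convergence difficult.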
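Points 2 and 3 both come down to settings in the `.param` file. The sketch below assumes the usual `keyword : value` layout of a CASTEP `.param` file and simply renders a few overrides as text to paste in. The helper name `render_param_overrides` is hypothetical, the `nextra_bands` value is the lower end of the "14 to 29" range suggested in the warning quoted in the original post, and the `cut_off_energy` value is a placeholder rather than a recommendation.

```python
def render_param_overrides(overrides: dict) -> str:
    """Render keyword/value pairs as 'keyword : value' lines, one per line."""
    return "\n".join(f"{key} : {value}" for key, value in overrides.items())


if __name__ == "__main__":
    overrides = {
        # Lower end of the "14 to 29" range recommended by the CASTEP warning.
        "nextra_bands": 14,
        # Placeholder only: choose the cutoff by convergence testing.
        "cut_off_energy": "400 eV",
    }
    print(render_param_overrides(overrides))
```

When the job is launched from the Materials Studio interface, the same settings would normally be changed in the calculation dialog rather than by editing the file by hand.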
Post #2  2007-12-07 21:25:38
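imation
Die-hard Woodworm (Regular Writer)
- Help given: 0 (Kindergarten)
- Coins: 7082.3
- Coins given away: 2
- Red flowers: 1
- Posts: 829
- Online: 345.4 hours
- Member ID: 20276
- Registered: 2003-08-02
- Field: Biomedical polymer materials
Post #3  2007-12-07 22:26:34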
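cometring
Woodworm (Renowned Writer)
Loving you is the same as loving myself
- Help given: 0 (Kindergarten)
- VIP: 1.03
- Coins: 4315.7
- Posts: 2911
- Online: 36.1 hours
- Member ID: 195508
- Registered: 2006-02-23
- Gender: Male
- Field: Catalytic chemistry

Post #4  2007-12-08 14:48:52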
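fah
Die-hard Woodworm (Renowned Writer)
- Help given: 28 (Primary school)
- Coins: 6301.8
- Red flowers: 1
- Posts: 1380
- Online: 238.6 hours
- Member ID: 332256
- Registered: 2007-03-26
- Gender: Male
- Field: Inorganic non-metallic high-temperature superconductivity and magnetism
Post #5  2007-12-09 10:02:33
Post #6  2008-10-19 00:23:30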












