
sic029 (original poster):

[Help] qsub submission of parallel SIESTA fails

Hi everyone. I'd like to discuss a problem I've run into running a program on our cluster. Thanks in advance.
[node21:10714] *** An error occurred in MPI_Comm_rank
[node21:10714] *** on communicator MPI_COMM_WORLD
[node21:10714] *** MPI_ERR_COMM: invalid communicator
[node21:10714] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 3 with PID 10711 on
node node21 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[node21:10707] 7 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[node21:10707] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages


The above is the output from a freshly compiled SIESTA: the job submitted with qsub exits almost immediately with these messages. Could someone help diagnose what is going on? On another cluster the same code, compiled there, runs fine when launched directly with mpirun -np 4 siesta, so I don't understand why qsub on the new cluster fails like this. The new cluster does not allow logging into the compute nodes, so I have to get the qsub route working. Many thanks.

I can't tell where the problem is. LAMMPS and VASP compiled in parallel in this same environment run without any trouble; only SIESTA fails whenever it is submitted through qsub. Yet the same parallel SIESTA build runs fine with mpirun -np 4 siesta on a compute node of the other cluster. I'm stuck.

Oh, and I did try mpirun directly on the login node; please take a look at the output below. It seems the administrator has restricted it there, so it cannot be used either:
mpirun -np 4 siesta
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory.  This typically can indicate that the
memlock limits are set too low.  For most HPC installations, the
memlock limits should be set to "unlimited".  The failure occured
here:

  Local host:    manage1
  OMPI source:   btl_openib_component.c:1115
  Function:      ompi_free_list_init_ex_new()
  Device:        mlx4_0
  Memlock limit: 32768

You may need to consult with your system administrator to get this
problem fixed.  This FAQ entry on the Open MPI web site may also be
helpful:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   manage1
  Local device: mlx4_0
--------------------------------------------------------------------------
[manage1:16214] *** An error occurred in MPI_Comm_rank
[manage1:16214] *** on communicator MPI_COMM_WORLD
[manage1:16214] *** MPI_ERR_COMM: invalid communicator
[manage1:16214] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 16212 on
node manage1 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[manage1:16211] 3 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[manage1:16211] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[manage1:16211] 3 more processes have sent help message help-mpi-btl-openib.txt / error in device init
[manage1:16211] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal

The compute nodes cannot be logged into; access to them is completely locked down.

The cluster uses PBS job management through GridView; my submission script is:
====================================
#!/bin/bash
#PBS -N test
#PBS -l nodes=1:ppn=8
#PBS -j oe
#PBS -l walltime=24:00:00

cd $PBS_O_WORKDIR
# number of MPI ranks = number of lines in the PBS node file
NP=`cat $PBS_NODEFILE | wc -l`
# load the Open MPI 1.5.4 (Intel) environment
source /public/software/mpi/openmpi1.5.4-intel.sh
mpirun -machinefile $PBS_NODEFILE -np $NP \
  /home/sw/siesta/siesta-3.1/Obj/siesta < fe.fdf | tee output
====================================
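For reference: since the compute nodes cannot be reached interactively, the environment a job actually sees can only be inspected from inside a job. A minimal diagnostic job along the following lines (the job name is only illustrative; the MPI environment script is the one from the submission script above) would at least show the locked-memory limit and the mpirun that the job picks up on the node:
====================================
#!/bin/bash
#PBS -N checkenv
#PBS -l nodes=1:ppn=1
#PBS -j oe
#PBS -l walltime=00:05:00

cd $PBS_O_WORKDIR
# locked-memory (memlock) limit seen by this job; Open MPI's openib BTL
# generally wants this to be "unlimited"
ulimit -l
# load the same MPI environment as the real job, then report which mpirun
# is found and on which node the job ran
source /public/software/mpi/openmpi1.5.4-intel.sh
which mpirun
hostname
====================================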

Thanks to any forum friends who can help; much appreciated.

redsnowolf (accepted answer):


I ran into a similar problem with VASP a couple of days ago and only just solved it.

The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory.  This typically can indicate that the
memlock limits are set too low.  For most HPC installations, the
memlock limits should be set to "unlimited".  The failure occured
here:

  Local host:    node21
  OMPI source:   btl_openib_component.c:1055
  Function:      ompi_free_list_init_ex_new()
  Device:        mlx4_0
  Memlock limit: 65536

You may need to consult with your system administrator to get this
problem fixed.  This FAQ entry on the Open MPI web site may also be
helpful:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Items 15, 16, and 17 at the URL above explain this fairly clearly. In my case ulimit -a showed a normal locked-memory limit on every node, yet the job still failed with the memory-allocation error. According to that FAQ, this can happen when the locked-memory limit configured on the system is not actually applied to non-interactive logins, or when the job scheduler does not hand the application a large enough limit. In the end I restarted the PBS daemon on every node and the problem was solved.
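For completeness, a rough sketch of what such an admin-side fix typically involves, run as root on each compute node. This assumes a Torque-style PBS where the per-node daemon is called pbs_mom and that limits are configured in /etc/security/limits.conf; the exact daemon name and paths may differ under GridView:
====================================
# /etc/security/limits.conf -- allow unlimited locked memory for all users
*   soft   memlock   unlimited
*   hard   memlock   unlimited

# then restart the per-node batch daemon so that newly started jobs
# inherit the raised limit (daemon name assumed; adjust as needed)
service pbs_mom restart
====================================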
Alternatively, you could add ulimit -l unlimited in the job script before the mpirun line and resubmit with qsub to see whether that helps.
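Applied to the submission script above, that would look roughly like the fragment below. Note that whether a job is allowed to raise its own locked-memory limit depends on the hard limit configured on the nodes, so this may or may not take effect:
====================================
cd $PBS_O_WORKDIR
NP=`cat $PBS_NODEFILE | wc -l`
source /public/software/mpi/openmpi1.5.4-intel.sh
# try to lift the locked-memory limit before launching MPI
ulimit -l unlimited
mpirun -machinefile $PBS_NODEFILE -np $NP \
  /home/sw/siesta/siesta-3.1/Obj/siesta < fe.fdf | tee output
====================================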
Hope this information helps.