qh203


[Help] Parallel computing problem: root vs. regular user

As root, I ran the cpi example in parallel with Open MPI on 6 nodes, each with 8 CPUs. The output is normal, as follows:

[root@node1 examples]# mpirun -np 40 -machinefile test ./cpi
Process 3 on node2
Process 38 on node6
Process 18 on node4
Process 32 on node6
Process 20 on node4
Process 2 on node2
Process 35 on node6
Process 34 on node6
Process 22 on node4
Process 7 on node2
Process 23 on node4
Process 5 on node2
Process 4 on node2
Process 37 on node6
Process 33 on node6
Process 30 on node5
Process 8 on node3
Process 26 on node5
Process 10 on node3
Process 15 on node3
Process 27 on node5
Process 31 on node5
Process 28 on node5
Process 24 on node5
Process 19 on node4
Process 21 on node4
Process 17 on node4
Process 6 on node2
Process 16 on node4
Process 25 on node5
Process 9 on node3
Process 11 on node3
Process 13 on node3
Process 14 on node3
Process 0 on node2
Process 1 on node2
Process 36 on node6
Process 39 on node6
Process 12 on node3
Process 29 on node5
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 0.128546

As a regular user, running the same cpi example in parallel with Open MPI produces this output instead:

[aojjj@node1 examples]$ mpirun -np 40 -machinefile test ./cpi
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to register memory in the driver.
Please check /var/log/messages or dmesg for driver specific failure
reason.
The failure occured here:

  Local host:    mthca0
  Device:        openib_reg_mr
  Function:      Cannot allocate memory()
  Errno says:   

You may need to consult with your system administrator to get this
problem fixed.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory.  This typically can indicate that the
memlock limits are set too low.  For most HPC installations, the
memlock limits should be set to "unlimited".  The failure occured
here:

  Local host:    node4
  OMPI source:   btl_openib_component.c:1161
  Function:      ompi_free_list_init_ex_new()
  Device:        mthca0
  Memlock limit: 32768

You may need to consult with your system administrator to get this
problem fixed.  This FAQ entry on the Open MPI web site may also be
helpful:

    http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.

  Local host:   node4
  Local device: mthca0
--------------------------------------------------------------------------
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
libibverbs: Warning: RLIMIT_MEMLOCK is 32768 bytes.
    This will severely limit memory registrations.
Process 26 on node5
Process 8 on node3
Process 28 on node5
Process 1 on node2
Process 29 on node5
Process 4 on node2
Process 22 on node4
Process 2 on node2
Process 15 on node3
Process 25 on node5
Process 31 on node5
Process 38 on node6
Process 14 on node3
Process 30 on node5
Process 32 on node6
Process 39 on node6
Process 37 on node6
Process 33 on node6
Process 36 on node6
Process 35 on node6
Process 16 on node4
Process 18 on node4
Process 10 on node3
Process 21 on node4
Process 19 on node4
Process 20 on node4
Process 11 on node3
Process 17 on node4
Process 9 on node3
Process 0 on node2
Process 7 on node2
Process 6 on node2
Process 5 on node2
Process 23 on node4
Process 24 on node5
Process 3 on node2
Process 27 on node5
Process 34 on node6
Process 12 on node3
Process 13 on node3
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 3.002147
[node1:02112] 39 more processes have sent help message help-mpi-btl-openib.txt / mem-reg-fail
[node1:02112] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[node1:02112] 36 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[node1:02112] 39 more processes have sent help message help-mpi-btl-openib.txt / error in device init

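The computation still finishes, but with many additional warning and error messages.

I modified /etc/security/limits.conf and /etc/init.d/sshd on every node, but it still does not work.

What exactly is the problem?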
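
For anyone comparing the two accounts: the warnings in the log above refer to the locked-memory limit (RLIMIT_MEMLOCK), which libibverbs reports as 32768 bytes for the regular user. Below is a minimal Python sketch that prints the limit the current process actually sees. The function names `format_limit` and `memlock_report` are made up for illustration, and the sketch assumes a Linux node with Python 3 available.

```python
import resource


def format_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value} bytes"


def memlock_report():
    """Return the soft and hard RLIMIT_MEMLOCK of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return f"memlock soft={format_limit(soft)} hard={format_limit(hard)}"


if __name__ == "__main__":
    print(memlock_report())
```

Running it once as root and once as the regular user on each node should show whether the two accounts differ. The limit seen in an interactive login may not match what processes started remotely over ssh inherit, so it is probably worth running the script through the same launch path as the MPI job.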

qh203


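I have solved this problem myself. The regular user's memlock limit was too low. As root, add two lines to /etc/security/limits.conf on every node (replace regular_username with the name of the regular user):
regular_username soft memlock unlimited
regular_username hard memlock unlimited

Then reboot every server node. <---- This is important; otherwise, after switching to the regular user, you get
memlock cannot modify limit: Operation not permitted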
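
With several nodes to edit, it is easy to miss one. The following is a rough Python sketch for checking whether a limits.conf text contains memlock entries for a given user. The function name `memlock_entries` is hypothetical, the user name `aojjj` is simply taken from the shell prompt in the question, and the parser only handles exact user-name lines (no group or wildcard entries), so treat it as an illustration rather than a full limits.conf parser.

```python
def memlock_entries(conf_text, user):
    """Collect memlock settings for `user` from limits.conf-style text.

    Each rule line has four fields: domain, type, item, value.
    Only lines whose domain equals `user` exactly are considered.
    """
    entries = {}
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        fields = line.split()
        if len(fields) != 4:
            continue
        domain, kind, item, value = fields
        if domain != user or item != "memlock":
            continue
        if kind == "-":
            # "-" sets both the soft and the hard limit
            entries["soft"] = value
            entries["hard"] = value
        else:
            entries[kind] = value
    return entries


sample = """
# /etc/security/limits.conf (excerpt, illustrative)
aojjj soft memlock unlimited
aojjj hard memlock unlimited
"""

if __name__ == "__main__":
    print(memlock_entries(sample, "aojjj"))
```

On a real node one would read the text from /etc/security/limits.conf and confirm that both the soft and the hard entry are present before rebooting.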
Reply #2, 2013-10-13 21:22:41