wuchenwf, Honorary Moderator (Professional Writer)

[Discussion]
mpdtrace fails after a successful mpich2 install. Thanks, everyone

Hello everyone. My machine is a 4-core Intel box, and I built mpich2 with ifort. The install directory and the expected files, such as mpd, were all created. Running mpd succeeds, but running mpdtrace afterwards fails. My commands and the error output are below (my hostname is wuchenwf-desktop and mpich2 is installed under /opt/mpich2).

---------------------------------------------------------------------
root@wuchenwf-desktop:~# mpd &
[1] 8542
root@wuchenwf-desktop:~# mpdtrace
mpdtrace (send_dict_msg 426):send_dict_msg: sock= errmsg= 32, 'Broken pipe'):
mpdtb:
/opt/mpich2/bin/mpdlib.py, 426, send_dict_msg
/opt/mpich2/bin/mpdtrace, 51, mpdtrace
/opt/mpich2/bin/mpdtrace, 83, mpdtrace: unexpected msg from mpd=:{'error_msg': 'invalid secretword to root mpd'}:
--------------------------------------------------------------------------

This looks like two separate errors to me, and running mpdallexit produces a similar problem. How can I fix this? Many thanks in advance.
alwens
Veteran Member (Regular Writer)
Bookmark this: the article below is very useful for MPICH2 users.

Attached: debugging mpd errors when running mpich2

1. Install mpich2, and thus mpd.

2. Make sure the mpich2 bin directory is in your path. Below, we will refer to it as MPDDIR.

3. Kill old mpd processes. If you are coming to this guide from elsewhere, e.g. a Quick Start guide for mpich2, because you encountered mpd problems, you should make sure that all mpd processes are terminated on the hosts where you have been testing. mpdallexit may assist in this, but probably not if you were having problems. You may need to use the Unix kill command to terminate the processes.

4. Run a first mpd (alone on a first node). As mentioned above, mpd uses client-server communications to perform its work. So, before running an mpd, let's run a simpler program (mpdcheck) to verify that these communications are likely to be successful. Even on hosts where communications are well supported, sometimes there are problems associated with hostname resolution, etc. So, it is worth the effort to proceed a bit slowly. Below, we assume that you have installed mpd and have it in your path.

Select a test node, let's call it n1. Login to n1. First, we will run mpdcheck as a server and a client. To run it as a server, get into a window with a command line and run this:

    n1 $ mpdcheck -s

It will print something like this:

    server listening at INADDR_ANY on: n1 1234

Now, run the client side (in another window if convenient) and see if it can find the server and communicate.
Be sure to use the same hostname and port number printed by the server (above: n1 1234):

    n1 $ mpdcheck -c n1 1234

If all goes well, the server will print something like:

    server has conn on from ('192.168.1.1', 1234)
    server successfully recvd msg from client: hello_from_client_to_server

and the client will print:

    client successfully recvd ack from server: ack_from_server_to_client

If the experiment failed, you have some network or machine configuration problem which will also be a problem later when you try to use mpd. If the experiment succeeded but the hostname printed by the server was localhost, you will probably have problems later if you try to use mpd on n1 in conjunction with other hosts. In either case, skip to Section A.2 "Debugging host/network configuration problems."

If the experiment succeeded, then you should be ready to try mpd on this one host. To start an mpd, you will use the mpd command. To run parallel programs, you will use the mpiexec program. All mpd commands accept the -h or --help arguments, e.g.:

    n1 $ mpd --help
    n1 $ mpiexec --help

Try a few tests:

    n1 $ mpd &
    n1 $ mpiexec -n 1 /bin/hostname
    n1 $ mpiexec -l -n 4 /bin/hostname
    n1 $ mpiexec -n 2 PATH_TO_MPICH2_EXAMPLES/cpi

where PATH_TO_MPICH2_EXAMPLES is the path to the mpich2-1.0.3/examples directory.

To terminate the mpd:

    n1 $ mpdallexit

5. Run a second mpd (alone on a second node). To verify that things are fine on a second host (say n2), login to n2 and perform the same set of tests that you did on n1. Make sure that you use mpdallexit to terminate the mpd so you will be ready for further tests.

6. Run a ring of two mpds on two hosts. Before running a ring of mpds on n1 and n2, we will again use mpdcheck, but this time between the two machines. We do this because the two nodes may have trouble locating each other or communicating with each other, and it is easier to check this with the smaller program.
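For readers wondering what kind of exchange such a check exercises, here is a minimal Python sketch of a hello/ack round trip over TCP. It is not mpdcheck's actual implementation: the function names are invented for illustration, only the two message strings are taken from the output quoted above, and both ends run in one process over 127.0.0.1 (mpdcheck -s listens on INADDR_ANY and is meant to be run on separate hosts).

```python
import socket
import threading

# Message strings copied from the mpdcheck output quoted in the guide.
HELLO = b"hello_from_client_to_server"
ACK = b"ack_from_server_to_client"


def serve_once(listener, received):
    """Accept one connection, record the client's message, then acknowledge it."""
    conn, _peer = listener.accept()
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(ACK)


def run_client(host, port):
    """Connect to the server, send the greeting, and return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(HELLO)
        return sock.recv(1024)


def loopback_check():
    """Run both ends over 127.0.0.1 and return (server_got, client_got)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    received = []
    server = threading.Thread(target=serve_once, args=(listener, received))
    server.start()
    reply = run_client("127.0.0.1", port)
    server.join()
    listener.close()
    return received[0], reply


if __name__ == "__main__":
    server_got, client_got = loopback_check()
    print("server received:", server_got.decode())
    print("client received:", client_got.decode())
```

If even a loopback exchange like this fails, the problem is local to the machine rather than between the two nodes.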
First, we will make sure that a server on n1 can service a client from n2. On n1:

    n1 $ mpdcheck -s

which will print a hostname (hopefully n1) and a port number (say 3333 here). On n2:

    n2 $ mpdcheck -c n1 3333

If this experiment fails, skip to Section A.2 "Debugging host/network configuration problems".

Second, we will make sure that a server on n2 can service a client from n1. On n2:

    n2 $ mpdcheck -s

which will print a hostname (hopefully n2) and a port number (say 7777 here). On n1:

    n1 $ mpdcheck -c n2 7777

If this experiment fails, skip to Section A.2 "Debugging host/network configuration problems".

If all went well, we are ready to try a pair of mpds on n1 and n2. First, make sure that all mpds have terminated on both n1 and n2. Use mpdallexit or simply kill them with:

    kill -9 PID_OF_MPD

where you have obtained the PID_OF_MPD by some means such as the ps command. On n1:

    n1 $ mpd &
    n1 $ mpdtrace -l

This will print a list of machines in the ring, in this case just n1. The output will be something like:

    n1_6789 (192.168.1.1)

The 6789 is the port that the mpd is listening on for connections from other mpds wishing to enter the ring. We will use that port in a moment to get an mpd from n2 into the ring. The value in parentheses should be the IP address of n1. On n2:

    n2 $ mpd -h n1 -p 6789 &

where 6789 is the listening port on n1 (from mpdtrace above). Now try:

    n2 $ mpdtrace -l

You should see both mpds in the ring. To run some programs in parallel:

    n1 $ mpiexec -n 2 /bin/hostname
    n1 $ mpiexec -n 4 /bin/hostname
    n1 $ mpiexec -l -n 4 /bin/hostname
    n1 $ mpiexec -l -n 4 PATH_TO_MPICH2_EXAMPLES/cpi

where PATH_TO_MPICH2_EXAMPLES is the path to the mpich2-1.0.5/examples directory.

To bring down the ring of mpds:

    n1 $ mpdallexit

7. Boot a ring of two mpds via mpdboot. Please be aware that mpdboot uses ssh by default to start remote mpds. It will expect that you can run ssh from n1 to n2 (and from n2 to n1) without entering a password.
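One way to test the password-less ssh requirement before trying mpdboot is sketched below in Python. This is an illustration and not part of the mpd tools: the helper names are made up, and it relies on the OpenSSH client options BatchMode=yes (fail instead of prompting) and ConnectTimeout, so it assumes the OpenSSH ssh client is the one installed.

```python
import subprocess


def ssh_check_command(host):
    """Build an ssh command that fails instead of prompting for a password."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5", host, "true"]


def can_ssh_without_password(host, runner=subprocess.run):
    """Return True if the non-interactive ssh command exits with status 0.

    `runner` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without a real ssh connection.
    """
    try:
        result = runner(
            ssh_check_command(host),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:  # e.g. the ssh client is not installed
        return False
    return result.returncode == 0
```

Calling can_ssh_without_password("n2") on n1, and can_ssh_without_password("n1") on n2, should both return True before mpdboot can be expected to work with its default ssh launcher.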
First, make sure that you terminate the mpd processes from any prior tests. On n1, create a file named mpd.hosts containing the name of n2:

    n2

Then, on n1 run:

    n1 $ mpdboot -n 2
    n1 $ mpdtrace -l
    n1 $ mpiexec -l -n 2 /bin/hostname

The mpdboot command should read the mpd.hosts file created above and run an mpd on each of the two machines. The mpdtrace and mpiexec commands show that the ring is up and functional. Options that may be useful are:

· --help: use this one for extra details on all options
· -v: verbose
· --chkup: tries to verify that the hosts are up before starting mpds
· --chkuponly: only performs the verify step, then ends

To bring the ring down:

    n1 $ mpdallexit

If mpdboot works on the two machines n1 and n2, it will probably work on your others as well. But there could be configuration problems on a new machine on which you have not yet tested mpd. An easy way to check is to gradually add machines to mpd.hosts and try an mpdboot with a -n argument that uses them all each time. Use mpdallexit after each test.

[ Last edited by alwens on 2008-1-18 at 14:32 ]
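The incremental check described in the last paragraph of the guide can be planned with a few lines of Python. This is only a sketch: the function name and the hosts n3 and n4 are hypothetical, and the rule that -n equals the number of entries in mpd.hosts plus one (for the local mpd) is inferred from the two-host example above, where mpd.hosts lists only n2 and mpdboot is run with -n 2.

```python
def incremental_plans(remote_hosts):
    """Yield (mpd_hosts_text, n) for each growing prefix of remote_hosts.

    mpd_hosts_text is what mpd.hosts would contain for that round, and n is
    the value to pass to `mpdboot -n` (remote hosts plus the local mpd).
    """
    for count in range(1, len(remote_hosts) + 1):
        prefix = remote_hosts[:count]
        yield "\n".join(prefix) + "\n", count + 1


if __name__ == "__main__":
    # n3 and n4 are placeholder host names for illustration.
    for text, n in incremental_plans(["n2", "n3", "n4"]):
        hosts = text.split()
        print(f"mpd.hosts = {hosts}: mpdboot -n {n}, mpdtrace -l, mpdallexit")
```

Each round adds one machine, so the first round that fails points at the most recently added host.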

Post #2, 2008-01-18 14:31:21












