I have compiled VASP 4.6 on an HP XC (Opteron) cluster. When I try to run one of the supplied benchmarks to verify the installation, I get the following:
[root@n20 bench]# mpirun /usr/local/apps/vasp/src/vasp.4.6/vasp
running on 1 nodes
distr: one band on 1 nodes, 1 groups
vasp.4.6.34 5Dec07 complex
POSCAR found : 1 types and 50 ions
MPI Application rank 0 killed before MPI_Finalize() with signal 11
Jim
admin replied:
Please first of all have a look at your INCAR: IALGO must not be set to 8; this causes an immediate stop of VASP for copyright reasons. Replace 8 by 38, or set ALGO = Normal instead.
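In INCAR terms, that amounts to one of the following lines (only the relevant tag is shown; the rest of the INCAR stays as it is):
ALGO = Normal      ! equivalent replacement for IALGO = 8
IALGO = 38         ! alternative; IALGO = 8 stops VASP immediately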
Then check whether the job runs with the serial binary (the version compiled for serial execution only, without any access to MPI): if the job finishes successfully when run interactively (it takes just a few minutes on one CPU), there is probably an error in reading POTCAR in the job started with mpirun.
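As an illustration, such an interactive serial check could look roughly like this (both paths are placeholders, not taken from this thread):
cd /path/to/bench                       # directory containing INCAR, POSCAR, POTCAR, KPOINTS
/path/to/vasp.4.6_serial/vasp           # serial build, started directly without mpirun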
Concerning the signal 11 MPI_Finalize error itself:
The parallel job crashed while or after reading the POTCAR file (possibly from the master node, if you submitted it to a special parallel queue).
Please make sure that POTCAR can be accessed properly during the run. If it cannot, you should find a message like the following in the job.e (job error) file:
Input/Output Error 152: File does not exist
In Procedure: pseudo..rd_pseudo
At Line: 81
Statement: OPEN
Unit: 10
File: POTCAR
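A quick way to check that a compute node actually sees POTCAR before starting the run (the node name and run directory are placeholders taken loosely from the prompts in this thread; ssh access to the nodes is assumed, as mpirun already uses /usr/bin/ssh):
ssh n62 'cd /path/to/bench && ls -l POTCAR && head -1 POTCAR'   # should list the file and print its first line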
snoopjd replied:
The code works for both benchmarks when I do a serial run. When I compile the code using makefile.linux_pgi_opt, I receive the following error during the run:
[root@n62 src]# mpirun -d /usr/local/apps/vasp/src/vasp.4.6/vasp
debug 1, pretend 0, verbose 1
job 0, check 0, tv=0, mpirun_instr
remsh = /usr/bin/ssh
SPMD cmd: /usr/local/apps/vasp/src/vasp.4.6/vasp
Main socket port 44240
Temporary appfile: /tmp/mpiafSOwOet
Parsing application description...
Identifying hosts...
Spawning processes...
Process layout for world 0 is as follows:
mpirun: proc 11345
daemon proc 11348 on host 172.20.0.62
rank 0: proc 11355
running on 1 nodes
vasp: Rank 0:0: MPI_Cart_create: Invalid topology MPI Application rank 0 exited before MPI_Finalize() with status 10
It was suggested in one of the user forums to change the following Fortran flags in the Makefile.
http://cms.mpi.univie.ac.at/vasp-forum/ ... c.php?2.40
FFLAGS = -Mfree -tp k8-64 -i8 --> FFLAGS = -Mfree -tp k8-64 -i4
When I compile the source with this option I get the previously reported error:
[root@n62 src]# mpirun -d /usr/local/apps/vasp/src/vasp.4.6/vasp
debug 1, pretend 0, verbose 1
job 0, check 0, tv=0, mpirun_instr
remsh = /usr/bin/ssh
SPMD cmd: /usr/local/apps/vasp/src/vasp.4.6/vasp
Main socket port 44699
Temporary appfile: /tmp/mpiafRSb2PT
Parsing application description...
Identifying hosts...
Spawning processes...
Process layout for world 0 is as follows:
mpirun: proc 13753
daemon proc 13756 on host 172.20.0.62
rank 0: proc 13763
running on 1 nodes
distr: one band on 1 nodes, 1 groups
vasp.4.6.34 5Dec07 complex
POSCAR found : 1 types and 8 ions
MPI Application rank 0 killed before MPI_Finalize() with signal 11
snoopjd replied:
VASP support,
I have found that adding -i8 to the LINK= line (possible bug) resolved the parallel issue. Also, aedens.o is missing from SOURCE= in makefile.linux_pgi_opt.
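For anyone hitting the same problem, the two changes amount to roughly the following in makefile.linux_pgi_opt (the existing link flags and the full object list are abbreviated with "...", so this is a sketch of the edits rather than the complete makefile):
LINK   = ... -i8           # -i8 appended to the existing LINK line; this resolved the parallel crash
SOURCE = ... aedens.o      # aedens.o added to the existing SOURCE object list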