Problem executing in Parallel
Posted: Fri May 05, 2006 11:15 am
Hello everybody!
I am running vasp on a 64bit-Linux cluster (2xOpteron CPUs per Node). Compiler is pgf90 5.2-4, MPI Version is MPICH2-1.0.3, vasp version is 4.6.28.
The serial version of vasp runs fine.
The parallel version compiles and links against MPICH2, which I compiled myself using the configure options suggested in the vasp-Makefile. The header mpif.h is copied to the vasp build directory (do I need to convert it to F90? Where is the tool "convert"? It seems to work this way, though). For testing purposes I turned optimization off (vasp and MPICH2 built with -O0); compiling with -O3 makes no difference, though.
Makefile setting:
FC=pgf90
FCL=mpif90
SCA=
After booting the MPI environment (mpdboot -f hosts) with just the local server (= 2 CPUs), I try to start vasp in parallel (mpiexec -np 2 ./vasp) with the following INCAR:
------------------------------------
System = Bulk-Au (fcc)
LPLANE = .TRUE.
NPAR = 2
LSCALU = .FALSE.
NSIM = 2
IALGO = 48
NBANDS = 8
ISMEAR = 1
SIGMA = 0.40
NELM = 5
-----------------------------------
That produces the following output on stdout:
[cli_0]: aborting job:
Fatal error in MPI_Cart_sub: Invalid communicator, error stack:
MPI_Cart_sub(194): MPI_Cart_sub(MPI_COMM_NULL, remain_dims=0xadda80, comm_new=0xd0f300) failed
MPI_Cart_sub(76).: Null communicator
[cli_1]: aborting job:
Fatal error in MPI_Cart_sub: Invalid communicator, error stack:
MPI_Cart_sub(194): MPI_Cart_sub(MPI_COMM_NULL, remain_dims=0xadda80, comm_new=0xd0f300) failed
MPI_Cart_sub(76).: Null communicator
rank 1 in job 7 rzcluster.rz.uni-kiel.de_43986 caused collective abort of all ranks
exit status of rank 1: return code 13
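For context: vasp's parallel setup first builds a Cartesian process grid with MPI_Cart_create and then splits it with MPI_Cart_sub, and the error stack shows MPI_Cart_sub receiving MPI_COMM_NULL, i.e. the Cartesian communicator was apparently never created. The following is a minimal standalone sketch of that call sequence (my own illustrative test program, not vasp's actual code; the 2x1 grid layout is an assumption) that can be used to check whether the MPICH2 installation itself handles these calls correctly:

```fortran
! cart_test.f90 -- minimal test of the MPI_Cart_create / MPI_Cart_sub
! sequence (illustrative sketch, not taken from the vasp sources).
! Build with:  mpif90 cart_test.f90 -o cart_test
! Run with:    mpiexec -np 2 ./cart_test
program cart_test
  implicit none
  include 'mpif.h'
  integer :: ierr, nprocs, comm_cart, comm_row
  integer :: dims(2)
  logical :: periods(2), remain(2)

  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)

  dims    = (/ nprocs, 1 /)     ! assumed layout: e.g. 2x1 grid for -np 2
  periods = .false.

  call MPI_CART_CREATE(MPI_COMM_WORLD, 2, dims, periods, .true., &
                       comm_cart, ierr)

  ! If comm_cart comes back as MPI_COMM_NULL (one possible cause being a
  ! mismatch between the mpif.h used at compile time and the MPI library
  ! linked at run time), the next call fails exactly as in the error
  ! stack above.
  if (comm_cart == MPI_COMM_NULL) then
     print *, 'MPI_Cart_create returned MPI_COMM_NULL'
  end if

  remain = (/ .true., .false. /)
  call MPI_CART_SUB(comm_cart, remain, comm_row, ierr)

  print *, 'MPI_Cart_sub completed'
  call MPI_FINALIZE(ierr)
end program cart_test
```

If this sketch runs cleanly under the same mpiexec, the MPI installation is probably fine and the problem is more likely in how vasp was compiled against it (e.g. which mpif.h was picked up).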
Does anybody know what went wrong?
I appreciate your help.