Error occurs with OpenMPI
Posted: Fri Apr 08, 2016 9:17 am
Hi,
When I build VASP GPU with OpenMPI 1.8.8, I get the error messages below.
This occurs regardless of whether BLACS and ScaLAPACK are used when building VASP GPU.
(The GPU itself works fine, because GROMACS runs on it successfully.)
Thanks,
[vasp-gpu:20649] *** An error occurred in MPI_Comm_split
[vasp-gpu:20649] *** reported by process [18446744072720351233,0]
[vasp-gpu:20649] *** on communicator MPI_COMM_WORLD
[vasp-gpu:20649] *** MPI_ERR_ARG: invalid argument of some other kind
[vasp-gpu:20649] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[vasp-gpu:20649] *** and potentially your MPI job)
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[50442,1],0]
Exit code: 13
--------------------------------------------------------------------------