Problems with MPI vasp at runtime
-
- Newbie
- Posts: 5
- Joined: Thu Jun 08, 2006 7:33 pm
Problems with MPI vasp at runtime
I am a sysadmin helping a user install VASP on our Linux (RHEL 4.0) Opteron cluster. The compiler is PGI F90 6.1 and the MPI library is Open MPI 1.0.2. The serial version builds and runs just fine, but the parallel version gives the following error:
running on 2 nodes
[nyx.engin.umich.edu:28430] *** An error occurred in MPI_Cart_create
[nyx.engin.umich.edu:28430] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:28430] *** MPI_ERR_OTHER: known error not in list
[nyx.engin.umich.edu:28430] *** MPI_ERRORS_ARE_FATAL (goodbye)
[nyx.engin.umich.edu:28431] *** An error occurred in MPI_Cart_create
[nyx.engin.umich.edu:28431] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:28431] *** MPI_ERR_OTHER: known error not in list
[nyx.engin.umich.edu:28431] *** MPI_ERRORS_ARE_FATAL (goodbye)
1 additional process aborted (not shown)
This is a generic Open MPI error. I have contacted the Open MPI developers; I am posting here to see whether anyone else has seen this problem and, if so, whether and how they were able to fix it.
Brock
Last edited by brockp on Thu Jun 08, 2006 7:45 pm, edited 1 time in total.
-
- Newbie
- Posts: 5
- Joined: Thu Jun 08, 2006 7:33 pm
Problems with MPI vasp at runtime
Looks like the problem isn't with Open MPI. Here is the result of rebuilding everything with LAM 7.1.2:
bash-3.00$ mpirun -np 2 ./vasp
running on 2 nodes
MPI_Cart_create: invalid dimension argument: Invalid argument (rank 0, MPI_COMM_WORLD)
Rank (0, MPI_COMM_WORLD): Call stack within LAM:
Rank (0, MPI_COMM_WORLD): - MPI_Cart_create()
Rank (0, MPI_COMM_WORLD): - main()
MPI_Cart_create: invalid dimension argument: Invalid argument (rank 1, MPI_COMM_WORLD)
Rank (1, MPI_COMM_WORLD): Call stack within LAM:
Rank (1, MPI_COMM_WORLD): - MPI_Cart_create()
Rank (1, MPI_COMM_WORLD): - main()
Could it be a problem with the user's input, such that the problem can't be decomposed correctly? The same input works fine with the serial version of VASP.
Last edited by brockp on Fri Jun 09, 2006 2:03 pm, edited 1 time in total.
-
- Jr. Member
- Posts: 55
- Joined: Tue Aug 16, 2005 7:44 am
Problems with MPI vasp at runtime
Have you compiled VASP with -i8 but the MPI library with default settings? That won't work.
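For what it's worth, here is a minimal sketch (my own test, not VASP source) of the kind of call that breaks. PGI's -i8 makes the default INTEGER and LOGICAL 8 bytes wide, while an MPI library built with default settings expects 4 bytes, so MPI_Cart_create misreads its arguments:
[code]
! cart_test.f90 -- minimal sketch of the failing call (not VASP source).
program cart_test
  implicit none
  include 'mpif.h'
  integer :: ierr, comm_cart
  integer :: dims(2)     ! with -i8 these become 8-byte integers
  logical :: periods(2)  ! PGI's -i8 widens default LOGICAL as well
  dims    = (/ 2, 1 /)
  periods = (/ .true., .true. /)
  call MPI_Init(ierr)
  ! Compiled with -i8 against a libmpi built with 4-byte integers, the
  ! library misreads dims and the handles, producing exactly the
  ! "invalid dimension argument" / MPI_ERR_OTHER failures logged above.
  call MPI_Cart_create(MPI_COMM_WORLD, 2, dims, periods, .true., &
                       comm_cart, ierr)
  call MPI_Finalize(ierr)
end program cart_test
[/code]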
Last edited by job on Wed Jun 14, 2006 11:03 am, edited 1 time in total.
-
- Newbie
- Posts: 7
- Joined: Tue Nov 15, 2005 9:01 am
Problems with MPI vasp at runtime
Hi,
We have a similar problem. Have you solved it yet?
Last edited by c00jsh00 on Thu Jun 29, 2006 5:22 am, edited 1 time in total.
-
- Administrator
- Posts: 2921
- Joined: Tue Aug 03, 2004 8:18 am
- License Nr.: 458
Problems with MPI vasp at runtime
Please check whether LAM was compiled in the same bit-mode as you used for the compilation of VASP.
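One quick way to check (a generic sanity test, not an official procedure): compile the tiny program below twice, once with the exact compiler and flags used for VASP and once with the flags used to build LAM, and compare the output.
[code]
! kindcheck.f90 -- prints the default INTEGER/LOGICAL widths.
! Build e.g. with: mpif90 [your VASP FFLAGS] kindcheck.f90
program kindcheck
  implicit none
  integer :: i
  logical :: l
  print *, 'default INTEGER bits:', bit_size(i)
  print *, 'default INTEGER kind:', kind(i)
  print *, 'default LOGICAL kind:', kind(l)
end program kindcheck
[/code]
If the two builds disagree, the bit-modes do not match.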
Last edited by admin on Mon Jul 24, 2006 11:20 am, edited 1 time in total.
-
- Newbie
- Posts: 5
- Joined: Thu Jun 08, 2006 7:33 pm
Problems with MPI vasp at runtime
[quote="job"]Have you compiled vasp with -i8 and the mpi library with default settings? That won't work.[/quote]$ mpirun -np 2 -v ./vasp
running on 2 nodes
[nyx.engin.umich.edu:31483] *** An error occurred in MPI_Cartdim_get
[nyx.engin.umich.edu:31483] *** on communicator MPI_COMM_WORLD
[nyx.engin.umich.edu:31483] *** MPI_ERR_COMM: invalid communicator
[nyx.engin.umich.edu:31483] *** MPI_ERRORS_ARE_FATAL (goodbye)
distr: one band on 1 nodes, 1 groups
1 process killed (possibly by Open MPI)
So I still have not made any progress. I also added -Ddebug to the flags, but VASP did not print anything extra.
Also, what does -Dkind8 mean?
Last edited by brockp on Tue Sep 26, 2006 7:02 pm, edited 1 time in total.
-
- Newbie
- Posts: 5
- Joined: Thu Jun 08, 2006 7:33 pm
Problems with MPI vasp at runtime
The problem was solved using the following:
LAM 7.1.2
Open MPI would not work with VASP, which is unfortunate: both MPICH and LAM are no longer under development, so moving to more up-to-date MPI libraries like Open MPI would be a plus in the future. I'm not sure whether Open MPI or VASP is causing the problem, so I will pass it on to the Open MPI developers and see if we can fix it.
PGI 6.1 with -i4
Matching the size of LOGICALs and such was a real pain. It's not documented anywhere, but the default PGI makefile for Linux has -i8 in it, and this caused quite a headache. The integer size MUST match what your MPI library was built with; a sketch of the change is below.
VASP is now running under MPI. GotoBLAS was very slow on the example case I had (I don't know why); ACML 3.5 was slightly faster than ATLAS.
This was on Opteron 244 nodes with non-blocking gigabit Ethernet and jumbo frames. Hope this helps anyone else; if you want, I can provide Makefiles for anyone having trouble.
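For illustration, the relevant part of the change looked roughly like the following. The paths and surrounding flags are placeholders, not my exact file; the point is that the stock -i8 must become -i4 (or whatever matches your MPI library's build):
[code]
# Sketch of the PGI makefile change (paths/flags are placeholders):
FC     = /opt/lam-7.1.2/bin/mpif90   # LAM wrapper around pgf90 6.1
# FFLAGS = ... -i8                   # stock setting: 8-byte INTEGER/LOGICAL
FFLAGS = -i4                         # match an MPI library built with 4-byte integers
[/code]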
Brock
1(734)936-1985
Center for Advanced Computing
University of Michigan (Ann Arbor)
Last edited by brockp on Tue Oct 03, 2006 1:07 pm, edited 1 time in total.
-
- Newbie
- Posts: 10
- Joined: Tue Mar 22, 2005 7:45 am
- License Nr.: 221
Problems with MPI vasp at runtime
I met a similar problem. In my case, I could solve it as follows.
Set the compiler as a full path, like this:
FC=/usr/foo/bar/bin/mpif90
'FC=mpif90' relying on $PATH doesn't work; a shared library goes missing at runtime. I don't know why.
Another choice is to link statically (libmpi.a, liborte.a, and libopal.a in the Open MPI case). Remember to copy the header files from Open MPI's 'include' directory; a sketch follows below.
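For illustration, the static-link alternative in Makefile terms (paths are placeholders, reusing the /usr/foo/bar example above):
[code]
# Link Open MPI's static archives instead of relying on shared libraries.
MPI_HOME = /usr/foo/bar
LIB      = $(MPI_HOME)/lib/libmpi.a \
           $(MPI_HOME)/lib/liborte.a \
           $(MPI_HOME)/lib/libopal.a
# Also copy mpif.h from $(MPI_HOME)/include next to the VASP sources.
[/code]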
Last edited by atogo on Tue Oct 17, 2006 6:06 am, edited 1 time in total.