parallel vasp.4.6 on Opteron
Posted: Tue Dec 28, 2004 4:59 am
by guest
I am trying to install VASP 4.6 on an Opteron cluster using PGI 5.2 and the libgoto library. The serial version compiles and runs fine.
However, the parallel version (using MPICH 1.2.1 or 1.2.6) fails to run, giving the following error:
running on 1 nodes
0 - MPI_CART_CREATE : Invalid topology
[0] Aborting program !
[0] Aborting program!
p0_4921: p4_error: : 10
Has anyone been successful compiling parallel VASP on an Opteron cluster?
Thanks, Mark
Posted: Mon Jun 27, 2005 5:52 pm
by matlgen
I had exactly the same issue: compiling and running a serial version of VASP on our Opteron cluster worked fine, but the parallel version gave the same error as yours. I modified the Makefile, and the change that seemed to work for me was to the FFLAGS. Our Opterons are running in 32-bit mode right now (not our choice); are yours?
Anyway, the solution seems to be changing the FFLAGS to be:
FFLAGS=-Mfree -tp k8-32 -i4
I did this in both makefiles (i.e. the ones in vasp.4.lib and vasp.4.6). The -tp k8-32 flag means that we are running in 32-bit mode, and -i4 means that INTEGERs are 4 bytes. I suspect the issue is that MPICH was compiled with 4-byte integers.
Let us know if that solves your problem, or if you solved it another way.
David
Posted: Thu Jul 07, 2005 5:20 am
by saurabh
Hello,
I am facing the same problem on Opteron machines. I have already installed a 64-bit OS on all the machines. I also changed the Makefile as Dave suggested, but that causes problems at link time. Can you suggest a way to run VASP in parallel on our Opteron cluster while keeping the 64-bit OS? I am using the PGI Cluster Development Kit 5.2, the lbgoto-opt64 library and fftw-3.0.1. What changes do I have to make in the Makefile to compile parallel VASP on the 64-bit OS? Please help.
Regards,
Saurabh
Posted: Thu Jul 07, 2005 6:37 am
by saurabh
Hello all,
One more thing: I have now used -i4 alone, keeping the other command-line switches in FFLAGS intact, and made the same change in vasp.4.lib. The previous error no longer appears. The error it shows now is:
*****************************************
POSCAR, INCAR and KPOINTS ok, starting setup
FFT: planning ... 1
p0_27854: p4_error: interrupt SIGSEGV: 11
Killed by signal 2.
/usr/pgi/linux86-64/5.2/bin/mpirun: line 1: 27854 Broken pipe /home/iacs/vasp/vasp.4.6/vasp -p4pg /home/iacs/TEST/test2/mailtoby/tial2/PI27770 -p4wd /home/iacs/TEST/test2/mailtoby/tial2
******************************************
Please tell me what is going wrong here. Thanks in advance.
Regards
Saurabh
Posted: Tue Jan 24, 2006 2:29 pm
by cheathturner
Hello,
Has anyone been able to resolve this issue? I am having the same difficulties. The serial version compiles fine (PGF90 compiler v5.2, AMD Opteron, 64-bit SuSE Linux). However, after compiling with MPI (version 1.2.5.2) and running with 2 procs, I get the same message:
0 - MPI_CART_CREATE : Invalid topology
[0] Aborting program !
[0] Aborting program!
p0_2981: p4_error: : 10
Killed by signal 2.
If I use the '-i4' flag, I also get a message similar to saurabh's:
running on 2 nodes
distr: one band on 1 nodes, 2 groups
vasp.4.6.28 25Jul05 complex
POSCAR found : 1 types and 50 ions
p0_5651: p4_error: interrupt SIGSEGV: 11
Killed by signal 2.
Any suggestions?
Heath
Posted: Tue Jan 24, 2006 7:51 pm
by tjf
I would suggest getting another MPI library (such as OpenMPI) and building that on your machine, then rebuilding VASP. Or, indeed, rebuild MPICH yourself (I assume you're using a prebuilt library) so that you're sure what's been built and what it's been built with.
Posted: Thu Jun 08, 2006 11:25 pm
by applelinux
I had the same problem when compiling parallel VASP. It was solved by changing FFLAGS=-Mfree -tp k8-64 -i8 to FFLAGS=-Mfree -tp k8-64 -i4. It looks like the default integer size is 4 bytes.
Posted: Tue Jun 13, 2006 6:50 am
by admin
The problem seems to be the length of the integers in the MPICH installation. Please check the byte-length of integer numbers in your MPICH installation. VASP then has to be compiled with the same byte-length for integer numbers.
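As a hedged illustration of why the integer width matters, here is a small Python sketch. It is not taken from VASP or MPICH. It assumes a little-endian machine (as the Opteron is) and a hypothetical two-dimensional process grid, and it only shows how an array of 8-byte integers looks to code that reads the same memory as 4-byte integers. A grid dimension that arrives as 0 is one plausible route to an "Invalid topology" error, although this thread does not confirm the exact mechanism.

```python
import struct

def reinterpret_as_int32(values_int64):
    """Pack the values as little-endian 8-byte integers, then read the first
    len(values) slots back as 4-byte integers, the way a callee built with
    4-byte INTEGERs would interpret the same memory."""
    n = len(values_int64)
    raw = struct.pack(f"<{n}q", *values_int64)
    return struct.unpack(f"<{n}i", raw[: 4 * n])

# Hypothetical process-grid dimensions, as a caller built with -i8 intends them.
dims_sent = (2, 1)
dims_seen = reinterpret_as_int32(dims_sent)
print("sent:", dims_sent, "seen:", dims_seen)  # sent: (2, 1) seen: (2, 0)
```

Under these assumptions the second dimension is read from the upper half of the first 8-byte value, which is zero.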
Posted: Wed Jul 05, 2006 1:18 pm
by gollum
I have the same problem. I have tried both "FFLAGS=-Mfree -tp k8-64 -i8" and "FFLAGS=-Mfree -tp k8-64 -i4", but the problem is not solved. How can I check the byte length of integers?
Posted: Sat Aug 05, 2006 5:20 am
by DavidCGreen
Hi Gollum
I think you just need to look in the right header file(s) for your MPI installation.
(You'll probably need the LAM/MPICH/OpenMPI "development" package installed before you can find it.)
For LAM, use the grep command and look for LAM_SIZEOF_ lines in the lam_config.h header file.
On Fedora Core 3 (64 bit) the file is /usr/include/lam_config.h
On Ubuntu (32 bit Dapper) the file is /usr/lib/lam/include/lam_config.h
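The grep step above can also be sketched as a short Python helper, in case you want the sizes collected in one place. Only the LAM_SIZEOF_ prefix comes from the header described above. The function name is hypothetical, and the sketch assumes the sizes appear as plain "#define NAME number" lines, which may not hold for every installation.

```python
import re

# Matches lines of the form "#define LAM_SIZEOF_<NAME> <number>".
# The exact macro names in a given lam_config.h are not assumed here;
# only the LAM_SIZEOF_ prefix is.
SIZEOF_RE = re.compile(r"^\s*#\s*define\s+(LAM_SIZEOF_\w+)\s+(\d+)")

def find_sizeof_macros(header_text):
    """Return {macro_name: size_in_bytes} for every LAM_SIZEOF_ define found."""
    sizes = {}
    for line in header_text.splitlines():
        match = SIZEOF_RE.match(line)
        if match:
            sizes[match.group(1)] = int(match.group(2))
    return sizes
```

To use it, read the header for your system (for example one of the two paths listed above) into a string and pass it to find_sizeof_macros, then compare the reported integer size with the -i4 or -i8 flag used to build VASP.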
I'm looking to install VASP on a Sun Microsystems V20z (Opteron) cluster, and using it in parallel mode via MPI will be essential.
Cheers
David
[ Edited Sat Aug 05 2006, 07:30AM ]