VASP 4.6 refuses to run on more than one host


karl.vollmer · #1 Post by **karl.vollmer** » Mon Apr 05, 2010 12:34 pm

When running the following on 8-core boxes (which should cause VASP to span 2 physical compute nodes):

mpirun -v -np 16 -machinefile $TMPDIR/machines /share/apps/vasp/4.6/vasp > v.out

I always end up with 16 processes on the first node and 0 on the second. VASP was compiled with GFortran. There are no errors during compilation, and the output file appears to correctly indicate that VASP is running in parallel. This fails both when running mpirun directly and when submitting through SGE (version GE 6.2u4). MPI on the machine has been verified using MPIRING and by transferring 500Mb between multiple nodes without error.
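One way to look at the placement independently of VASP is to launch `hostname` with the same launcher arguments and count the lines per host. The following is only a sketch: the helper name `count_ranks_per_host` is made up for illustration, and it assumes `hostname` is available on every compute node.

```shell
#!/bin/sh
# Hypothetical helper (not part of VASP, MPI, or SGE): reads one hostname
# per line on stdin and prints "<count> <host>" for each distinct host.
count_ranks_per_host() {
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage sketch: same launcher arguments as the failing job, but running
# hostname instead of vasp. Left commented out because it needs the cluster.
# mpirun -np 16 -machinefile "$TMPDIR/machines" hostname | count_ranks_per_host
```

If all 16 ranks really are started on the first node, only one host should show up in the output; an even split across two 8-core boxes would show two hosts with 8 each.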

I haven't worried about any optimizations yet. Step one is to get it running; then I'll worry about making it faster. Any guidance would be appreciated. I've searched through these forums and can't find any references to similar issues.

::Environment info::
output from vasp

Code: Select all

 running on   16 nodes
 distr:  one band on    1 nodes,   16 groups
 vasp.4.6.36 17Feb09 complex

linked libs

Code: Select all

ldd vasp
        liblapack.so.3 => /usr/lib64/liblapack.so.3 (0x00002b1c33292000)
        libblas.so.3 => /usr/lib64/libblas.so.3 (0x00002b1c3399b000)
        libmpi.so.0 => /opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64/libmpi.so.0 (0x00002b1c33bef000)
        libopen-rte.so.0 => /opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64/libopen-rte.so.0 (0x00002b1c33d96000)
        libopen-pal.so.0 => /opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64/libopen-pal.so.0 (0x00002b1c33ee3000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x000000380f600000)
        librt.so.1 => /lib64/librt.so.1 (0x000000380d600000)
        libgfortran.so.1 => /usr/lib64/libgfortran.so.1 (0x00002b1c3405b000)
        libm.so.6 => /lib64/libm.so.6 (0x000000380c600000)
        libdl.so.2 => /lib64/libdl.so.2 (0x000000380ca00000)
        libutil.so.1 => /lib64/libutil.so.1 (0x000000381a400000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000000380ce00000)
        libmpi_f77.so.0 => /opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64/libmpi_f77.so.0 (0x00002b1c342f4000)
        libmpi_f90.so.0 => /opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64/libmpi_f90.so.0 (0x00002b1c34427000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003819c00000)
        libc.so.6 => /lib64/libc.so.6 (0x000000380c200000)
        /lib64/ld-linux-x86-64.so.2 (0x000000380be00000)

makefile

Code: Select all

.SUFFIXES: .inc .f .f90 .F
#-----------------------------------------------------------------------
# Makefile for gf90 compiler
# This makefile has not been tested by the vasp crew.
# It is supplied as is.
#-----------------------------------------------------------------------
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# Mind that some Linux distributions (Suse 6.1) have a bug in
# libm causing small errors in the error-function (total energy
# is therefore wrong by about 1meV/atom). The recommended
# solution is to update libc.
#
# BLAS must be installed on the machine
# there are several options:
# 1) very slow but works:
#   retrieve the lapackage from ftp.netlib.org
#   and compile the blas routines (BLAS/SRC directory)
#   please use g77 or f77 for the compilation. When I tried to
#   use pgf77 or pgf90 for BLAS, VASP hang up when calling
#   ZHEEV  (however this was with lapack 1.1 now I use lapack 2.0)
# 2) most desirable: get an optimized BLAS
#   for a list of optimized BLAS try
#     http://www.kachinatech.com/~hjjou/scilib/opt_blas.html
#
# the two most reliable packages around are presently:
# 3a) Intels own optimised BLAS (PIII, P4, Itanium)
#     http://developer.intel.com/software/products/mkl/
#   this is really excellent when you use Intel CPU's
#
# 3b) or obtain the atlas based BLAS routines
#     http://math-atlas.sourceforge.net/
#   you certainly need atlas on the Athlon, since the  mkl
#   routines are not optimal on the Athlon.
#
#-----------------------------------------------------------------------

# all CPP processed fortran files have the extension .f
SUFFIX=.f
#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
FC=gfortran
# fortran linker
FCL=$(FC)

#-----------------------------------------------------------------------
# whereis CPP ?? (I need CPP, can't use gcc with proper options)
# that's the location of gcc for SUSE 5.3
#
#  CPP_   =  /usr/lib/gcc-lib/i486-linux/2.7.2/cpp -P -C
#
# that's probably the right line for some Red Hat distribution:
#
#  CPP_   =  /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/cpp -P -C
#
#  SUSE 6.X, maybe some Red Hat distributions:

CPP_ =  ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)

#-----------------------------------------------------------------------
# possible options for CPP:
# possible options for CPP:
# NGXhalf             charge density   reduced in X direction
# wNGXhalf            gamma point only reduced in X direction
# avoidalloc          avoid ALLOCATE if possible
# IFC                 work around some IFC bugs
# CACHE_SIZE          1000 for PII,PIII, 5000 for Athlon, 8000 P4
# RPROMU_DGEMV        use DGEMV instead of DGEMM in RPRO (usually  faster)
# RACCMU_DGEMV        use DGEMV instead of DGEMM in RACC (faster on P4)
#  **** definitely use -DRACCMU_DGEMV if you use the mkl library
#-----------------------------------------------------------------------

CPP    = $(CPP_) -DHOST=\"LinuxGfortran\" \
          -Dkind8 -DNGXhalf -DCACHE_SIZE=8000 -DGfortran -Davoidalloc \
          -DRPROMU_DGEMV

#-----------------------------------------------------------------------
# general fortran flags  (there must a trailing blank on this line)
# the -Mx,119,0x200000 is required if you use older pgf90 versions
# on a more recent LINUX installation
# the option will not do any harm on other 3.X pgf90 distributions
#-----------------------------------------------------------------------

FFLAGS =  -ffree-form -ffree-line-length-none

#-----------------------------------------------------------------------
# optimization,
# we have tested whether higher optimisation improves
# the performance, and found no improvements with -O3-5 or -fast
# (even on Athlon system, Athlon specific optimistation worsens performance)
#-----------------------------------------------------------------------

OFLAG  = -O2

OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG  = -g -O0
INLINE = $(OFLAG)
#-----------------------------------------------------------------------
# the following lines specify the position of BLAS  and LAPACK
# what you chose is very system dependent
# P4: VASP works fastest with Intels mkl performance library
# Athlon: Atlas based BLAS are presently the fastest
# P3: no clue
#-----------------------------------------------------------------------

# Atlas based libraries
ATLASHOME= /usr/local/atlas/lib
#BLAS=   -L/usr/local/atlas/lib -lblas
BLAS=   -L$(ATLASHOME)  -lf77blas -latlas

# use specific libraries (default library path points to other libraries)
BLAS= $(ATLASHOME)/libf77blas.a $(ATLASHOME)/libatlas.a

# use the mkl Intel libraries for p4 (www.intel.com)
#BLAS=-L/opt/intel/mkl/lib/32 -lmkl_p4  -lpthread

# LAPACK, simplest use vasp.4.lib/lapack_double
LAPACK= ../vasp.4.lib/lapack_double.o

# use atlas optimized part of lapack
#LAPACK= ../vasp.4.lib/lapack_atlas.o  -llapack -lblas

# use the mkl Intel lapack
#LAPACK= -lmkl_lapack

#LAPACK= -L/usr/local/atlas/lib -llapack

#-----------------------------------------------------------------------

LIB  = -L../vasp.4.lib -ldmy \
     ../vasp.4.lib/linpack_double.o $(LAPACK) \
     $(BLAS)

# options for linking (none required)
LINK    =

#-----------------------------------------------------------------------
# fft libraries:
# VASP.4.5 can use FFTW (http://www.fftw.org)
# since the FFTW is very slow for radices 2^n the fft3dlib is used
# in these cases
# if you use fftw3d you need to insert -lfftw in the LIB line as well
# please do not send us any querries reltated to FFTW (no support)
# if it fails, use fft3dlib
#-----------------------------------------------------------------------

FFT3D   = fft3dfurth.o fft3dlib.o
#FFT3D   = fftw3d+furth.o fft3dlib.o
FC=mpif90
FCL=$(FC)

#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf               charge density   reduced in Z direction
# wNGZhalf              gamma point only reduced in Z direction
# scaLAPACK             use scaLAPACK (usually slower on 100 Mbit Net)
#-----------------------------------------------------------------------

CPP    = $(CPP_) -DMPI  -DHOST=\"LinuxPgi\" \
     -Dkind8 -DNGZhalf -DCACHE_SIZE=8000 -DPGF90 -Davoidalloc -DRPROMU_DGEMV

#-----------------------------------------------------------------------
# location of SCALAPACK
# if you do not use SCALAPACK simply uncomment the line SCA
#-----------------------------------------------------------------------

BLACS=/usr/local/BLACS_lam
SCA_= /usr/local/SCALAPACK_lam

SCA= $(SCA_)/scalapack_LINUX.a $(SCA_)/pblas_LINUX.a $(SCA_)/tools_LINUX.a \
 $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a $(BLACS)/LIB/blacs_MPI-LINUX-0.a $(BLACS)/LIB/blacsF77init_MPI-LINUX-0.a

SCA=

#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------

LIB     = -L../vasp.4.lib -ldmy  \
      ../vasp.4.lib/linpack_double.o $(LAPACK) \
      $(SCA) $(BLAS)

# FFT: only option  fftmpi.o with fft3dlib of Juergen Furthmueller

FFT3D   = fftmpi.o fftmpi_map.o fft3dlib.o

karl.vollmer · #2 Post by **karl.vollmer** » Mon Apr 05, 2010 1:04 pm

Naturally, right after I posted, I figured it out. It looks like it was due to a version change of OpenMPI, which caused it to stop ignoring the old MPICH-style mpirun command in the users' jobs. Confirmed so far with OpenMPI >1.3.4. I'm sure there's something in their changelog that reflects this.
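For anyone hitting something similar: when launcher behaviour changes between MPI versions, it may help to record which MPI installation the binary is linked against and which `mpirun` the job actually picks up. The sketch below is only an illustration; the helper name `mpi_lib_dirs` is invented here, and it assumes `ldd` output in the format shown in the first post.

```shell
#!/bin/sh
# Hypothetical helper: reads ldd output on stdin and prints the directory
# of every libmpi* library it lists, one unique directory per line.
mpi_lib_dirs() {
    awk '$1 ~ /^libmpi/ && $2 == "=>" {
             d = $3
             sub("/[^/]*$", "", d)   # strip the file name, keep the directory
             print d
         }' | sort -u
}

# Usage sketch (commented out because it needs the cluster installation):
# ldd /share/apps/vasp/4.6/vasp | mpi_lib_dirs
# command -v mpirun      # which launcher is first on PATH
# mpirun --version       # which Open MPI release that launcher belongs to
```

Fed the `ldd` listing from the first post, the helper would print `/opt/SUNWhpc/HPC8.2.1/gnu/lib/lib64`, which can then be compared with the location and version of the `mpirun` used by the job.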