running VASP with openmpi in /state/partition1 (scratch)
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
To avoid writing files over NFS, it is customary to run code in a scratch directory that is visible to only one node. However, with VASP I have a problem doing this for a parallel job. At startup my SGE script creates
WORKDIR=/state/partition1/$USER/$JOB_NAME-${JOB_ID%%.*}
as the working directory and runs from there.
1) Many MPI versions require that you create this directory on ALL the other nodes as well, because MPI switches to this directory even when no files are read there. I use Open MPI 1.4.3.
2) Apparently, VASP also requires the same input files to be present on ALL nodes, not just the start-up node (see the sketch after this list).
3) Okay, with that it runs, but it hangs after a few hours.
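For illustration, a minimal sketch of the kind of SGE prologue that steps 1) and 2) imply. This is not my actual script: the $PE_HOSTFILE parsing, passwordless ssh between nodes, and the binary path are assumptions.
#!/bin/bash
#$ -cwd
WORKDIR=/state/partition1/$USER/$JOB_NAME-${JOB_ID%%.*}
# Create the scratch directory and replicate the input on EVERY node of the job
# (assumes passwordless ssh; $PE_HOSTFILE lists one host per line, hostname in column 1).
for host in $(awk '{print $1}' $PE_HOSTFILE | sort -u); do
  ssh $host "mkdir -p $WORKDIR"
  scp INCAR POSCAR POTCAR KPOINTS $host:$WORKDIR/
done
cd $WORKDIR
mpirun -np $NSLOTS /path/to/vasp
# Copy the results back from the master node's scratch to the NFS submit directory.
cp OUTCAR CONTCAR vasprun.xml $SGE_O_WORKDIR/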
If I run the same job from my home directory, which is visible via NFS, it works. The difference is that all nodes then see the _same_ (updated) files in the directory.
Question: is VASP designed such that all nodes need to see the same files (like OUTCAR, CHG, CONTCAR, IBZKPT), or should it be possible to run from a local scratch directory?
David
Last edited by dubbelda on Tue Aug 09, 2011 11:50 am, edited 1 time in total.
- alex
- Hero Member
- Posts: 583
- Joined: Tue Nov 16, 2004 2:21 pm
- License Nr.: 5-67
- Location: Germany
running VASP with openmpi in /state/partition1 (scratch)
Hi David,
To answer your question: try it! At the very least it needs the input files replicated.
Hint: VASP does not write (many) huge files, so you could easily run it via NFS. Just switch off WAVECAR and CHG* writing and you'll end up with only a bunch of mega(!)bytes per (optimisation) run.
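The INCAR tags that switch those files off are (standard tags; adjust to taste):
LWAVE  = .FALSE.   ! do not write WAVECAR
LCHARG = .FALSE.   ! do not write CHGCAR / CHG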
Cheers,
alex
Last edited by alex on Tue Aug 09, 2011 5:34 pm, edited 1 time in total.
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
Thanks Alex, I did try it, and it randomly hangs after a while. However, in longer tests my jobs run from the home directory eventually hang as well, so it is _not_ related to the scratch disk. That part works (as long as you copy the input to all the nodes). But as you say, if VASP does not write large temp files, there is little use in doing this.
I am computing elastic constants on a big system, but after a while (say around 200 of the total 648 steps needed to numerically compute the generalized Hessian matrix) the output stops and nothing happens. Has anybody got experience with this? I am now trying a build without any compiler optimization options. If that still does not work, I will also try MVAPICH. I am running over InfiniBand.
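For context, this kind of run is driven by finite differences in the INCAR; a sketch of the typical tags, which may differ from my exact input:
IBRION = 6   ! finite differences with symmetry, builds the Hessian numerically
ISIF = 3     ! also distort the lattice to obtain the elastic constants
NFREE = 4    ! displacements per degree of freedom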
David
Last edited by dubbelda on Wed Aug 10, 2011 12:17 pm, edited 1 time in total.
- alex
- Hero Member
- Posts: 583
- Joined: Tue Nov 16, 2004 2:21 pm
- License Nr.: 5-67
- Location: Germany
running VASP with openmpi in /state/partition1 (scratch)
Hi David,
This sounds weird. I'm still running my stone-age openmpi 1.2.6 over IB, because I had difficulties with 1.4.3. Which ones, I can't remember.
The randomness suggests, IMO, network problems. Is it a professionally set-up system? Are you sure you've installed the proper drivers and brought the network up accordingly?
Do long optimisations show similar misbehaviour, or is it just the frequency calculation?
Which VASP version are you using?
Cheers,
alex
Last edited by alex on Wed Aug 10, 2011 5:26 pm, edited 1 time in total.
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
Details: VASP 5.2.11, Rocks 5.3, Opteron 6164 HE (24 cores per node), Intel composerxe-2011.4.191 compiler, MKL, ScaLAPACK, and FFTW via MKL. InfiniBand using OFED-1.5.3.
Rocks ships an old gfortran (4.1), and I was unable to get a proper executable using gfortran and ACML. I then tried gfortran 4.4, installed it, same thing. Actually, most things built against ACML crash for me after a while (segmentation faults). MKL seems more stable for me (even though I have AMD Opterons).
Long geometry optimizations are fine. The elastic-constants run takes several days, and I have not yet found out what is causing the hangs. A 'top' shows every core running at 100% CPU, but there is just no output anymore.
Perhaps it is an MPI thing. I will try running over Ethernet instead of InfiniBand and see if that works.
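With Open MPI this can be forced from the mpirun command line, so nothing needs recabling; a sketch, assuming the btl component names of the 1.4 series:
# exclude the InfiniBand byte-transfer layer so Open MPI falls back to TCP:
mpirun --mca btl ^openib -np $NSLOTS /path/to/vasp
# or name the transports explicitly:
mpirun --mca btl tcp,sm,self -np $NSLOTS /path/to/vasp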
David
Last edited by dubbelda on Wed Aug 10, 2011 5:46 pm, edited 1 time in total.
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
I got no hangs when I use (in the INCAR):
LSCALAPACK = .FALSE.
so the problem seems to be related to the Intel ScaLAPACK. But it could still be a VASP or Open MPI issue.
In my makefile:
CPP = $(CPP_) -DMPI -DHOST=\"LinuxIFC\" -DIFC \
-Dkind8 -DNGZhalf -DCACHE_SIZE=8000 -DPGF90 -Davoidalloc \
-DMPI_BLOCK=500 -DscaLAPACK
MKLINCLUDE=/share/apps/intel/mkl/include/fftw
MKLPATH=/share/apps/intel/mkl/lib/intel64
BLAS=-L${MKLPATH} -I${MKLINCLUDE} -I${MKLINCLUDE}/em64t/lp64 -lmkl_blas95_lp64 -Wl,--start-group ${MKLPATH}/libmkl_intel_lp64.a ${MKLPATH}/libmkl_sequential.a ${MKLPATH}/libmkl_core.a -Wl,--end-group -lpthread
LAPACK=-L${MKLPATH} -I${MKLINCLUDE} -I${MKLINCLUDE}/em64t/lp64 -lmkl_lapack95_lp64 -Wl,--start-group ${MKLPATH}/libmkl_intel_lp64.a ${MKLPATH}/libmkl_sequential.a ${MKLPATH}/libmkl_core.a -Wl,--end-group -lpthread
SCA=${MKLPATH}/libmkl_scalapack_lp64.a ${MKLPATH}/libmkl_solver_lp64_sequential.a -Wl,--start-group ${MKLPATH}/libmkl_intel_lp64.a ${MKLPATH}/libmkl_sequential.a ${MKLPATH}/libmkl_core.a ${MKLPATH}/libmkl_blacs_openmpi_lp64.a -Wl,--end-group -lpthread -lpthread -limf -lm
FFT3D = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o /share/apps/intel/mkl/interfaces/fftw3xf/libfftw3xf_intel.a fft3dlib.o
Would this be correct?
David
Last edited by dubbelda on Fri Aug 12, 2011 1:27 pm, edited 1 time in total.
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
D'oh, I think most of the problems are caused by a faulty InfiniBand card in one of the nodes... ibcheckerrors and ibchecknet now show errors for this node.
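For anyone else hunting this kind of fault: the OFED diagnostics I used, plus a third that is also worth running (all from the infiniband-diags package):
ibcheckerrors   # walk the fabric and report ports whose error counters exceed thresholds
ibchecknet      # check port state and connectivity across the whole subnet
ibstat          # show the local HCA's port state and link rate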
David
Last edited by dubbelda on Wed Aug 17, 2011 11:50 pm, edited 1 time in total.
- dubbelda
- Newbie
- Posts: 7
- Joined: Sat Jul 23, 2011 11:49 pm
running VASP with openmpi in /state/partition1 (scratch)
Just wanted to post an update: for me, upgrading from openmpi-1.4.3 (from OFED-1.5.4) to openmpi-1.4.5 solved all my problems:
1) segmentation faults when running on more than 1 node,
2) segmentation faults for certain NPAR/NSIM values,
3) random hangs,
4) empty output-files.
Since this update VASP 5.2.12 has been running great (I am running systems with 700 atoms on Intel Xeon X5675 nodes with lots of memory over InfiniBand, Mellanox MT26428 ConnectX, Linux kernel 2.6.32-220.13.1.el6, Rocks).
Last edited by dubbelda on Sat Apr 28, 2012 1:20 pm, edited 1 time in total.