Parallel Vasp successfully compiled (AMD x86_64, 4 core, OpenMPI, Blas, Intel-Fortran-Comp.)

#1 Post by **the_big_guy** » Sun Apr 25, 2010 10:25 pm

I found Meister Krause's post on his successful build helpful, so I thought I would post info on my successful build as well. Hopefully it will save someone from spending the many, many hours it took me to compile.

system:
- 4 x AMD Opteron(tm) Processor 850
- Arch Linux (x86-64)
- Intel Fortran Compiler 11.1
- GNU C/C++ Compiler 4.5.0
- OpenMPI 1.4.1
- VASP 5.2

steps:
- install the Intel Fortran Compiler (ifort)
- install OpenMPI
- build the FFTW 3.x Fortran wrapper library
- install the BLAS and LAPACK libraries
- build the VASP libraries
- build VASP

ifort
The Intel Fortran Compiler works well, but its installer requires lib32-gcc even when installing the 64-bit version. Without that package the installation fails without printing any error message, so it took me a while to figure out the problem.

OpenMPI
OpenMPI must be compiled with ifort as the Fortran compiler, because gfortran and g95 both seem unable to compile VASP as of now. This should be as easy as substituting 'ifort' for 'gfortran' or 'g95' in the makefile. To keep my system organized I used the PKGBUILD available in the Arch User Repository (AUR) and changed gfortran to ifort.
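
The substitution can be sketched with sed. This is only a minimal sketch, assuming the build file names the compiler literally as gfortran or g95; the helper name use_ifort and the sample file contents are hypothetical, and a real PKGBUILD may select the compiler differently.

```shell
#!/bin/sh
# use_ifort (hypothetical helper): replace every occurrence of gfortran or g95
# with ifort in a build file, keeping the untouched original as FILE.orig.
use_ifort() {
    file=$1
    cp "$file" "$file.orig"
    sed -e 's/gfortran/ifort/g' -e 's/g95/ifort/g' "$file.orig" > "$file"
}

# Demonstration on a throwaway file (contents are illustrative only).
demo=$(mktemp)
printf 'F77=gfortran\nFC=g95\n' > "$demo"
use_ifort "$demo"
cat "$demo"
```

Comparing the edited file with the .orig copy before building makes it easy to confirm that only the compiler name changed.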

fftw 3.x
After many attempts at compiling VASP I stumbled onto this article on Intel's website. I followed all of their advice with one slight deviation: I do not have the Intel C compiler, so I used the GNU compiler instead.

Code: Select all

make libem64t compiler=gnu

The resulting library worked well for me. Note that the article assumes the directory is "/opt/intel/mkl/10.2.0.013/interfaces/fftw3xf", but with the current compiler the default location is actually "/opt/intel/Compiler/11.1/072/mkl/interfaces/fftw3xf". You will have to find yours.
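
One way to locate the wrapper directory on a given machine is a plain find. The sketch below assumes the Intel tools live somewhere under /opt/intel; the helper name find_fftw3xf is hypothetical, and the demonstration builds a throwaway directory tree so that it runs anywhere.

```shell
#!/bin/sh
# find_fftw3xf (hypothetical helper): print every directory named fftw3xf
# below a given root (default: /opt/intel).
find_fftw3xf() {
    root=${1:-/opt/intel}
    find "$root" -type d -name fftw3xf 2>/dev/null
}

# Demonstration on a throwaway tree that mimics the layout described above.
demo=$(mktemp -d)
mkdir -p "$demo/Compiler/11.1/072/mkl/interfaces/fftw3xf"
find_fftw3xf "$demo"
```

On a real installation, calling find_fftw3xf with no argument searches /opt/intel; the make command shown above is then run inside the directory it prints.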

BLAS and LAPACK
Not much for me to do here; I just used the BLAS and LAPACK libraries from the regular Arch Linux repositories.

VASP Libraries
I edited the FC line of the makefile.linux_efc_itanium file slightly to use OpenMPI in conjunction with ifort:

Code: Select all

.SUFFIXES: .inc .f .F

CPP     = gcc -E -P -C $*.F >$*.f
FC=mpiifort

CFLAGS = -O
FFLAGS = -O1 -FI
FREE   =  -FR

DOBJ =  preclib.o timing_.o derrf_.o dclock_.o  diolib.o dlexlib.o drdatab.o


#-----------------------------------------------------------------------
# general rules
#-----------------------------------------------------------------------

libdmy.a: $(DOBJ) lapack_double.o linpack_double.o lapack_atlas.o
	-rm libdmy.a
	ar vq libdmy.a $(DOBJ)

# files which do not require autodouble
lapack_min.o: lapack_min.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_min.f
lapack_double.o: lapack_double.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_double.f
lapack_single.o: lapack_single.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_single.f
lapack_atlas.o: lapack_atlas.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_atlas.f
linpack_double.o: linpack_double.f
	$(FC) $(FFLAGS) $(NOFREE) -c linpack_double.f
linpack_single.o: linpack_single.f
	$(FC) $(FFLAGS) $(NOFREE) -c linpack_single.f

.c.o:
	$(CC) $(CFLAGS) -c $*.c
.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(FREE) $(INCS) -c $*.f
.F.f:
	$(CPP)
.f.o:
	$(FC) $(FFLAGS) $(FREE) $(INCS) -c $*.f

VASP
I once again edited the makefile.linux_efc_itanium file to suit my needs. The changes I made were:
- set the Fortran compiler to mpiifort
- change the Fortran flags line to match the one on Intel's website, as mentioned above
- add -heap-arrays to FFLAGS to avoid segfaults (as per this forum post)
- change the BLAS, LAPACK and FFT3D lines to search the proper locations for the libraries they need
- uncomment a few lines in the MPI section
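
Since the library locations differ from system to system, it can help to confirm that every -L directory named in the makefile actually exists before starting a long build. The following is a rough sketch, not part of the original procedure; the helper name check_lib_dirs and the demonstration paths are hypothetical.

```shell
#!/bin/sh
# check_lib_dirs (hypothetical helper): list every -L directory named in a
# makefile and report whether it exists on this machine.
check_lib_dirs() {
    grep -o -- '-L[^[:space:]]*' "$1" | sed 's/^-L//' | sort -u |
    while read -r dir; do
        if [ -d "$dir" ]; then
            echo "ok      $dir"
        else
            echo "missing $dir"
        fi
    done
}

# Demonstration on a throwaway makefile (paths are illustrative only).
demo=$(mktemp -d)
mkdir "$demo/lib"
printf 'BLAS= -L%s/lib -lmkl_core\nLIB = -L%s/nolib -ldmy\n' "$demo" "$demo" > "$demo/makefile"
check_lib_dirs "$demo/makefile"
```

Relative entries such as -L../vasp.5.lib are resolved from the current directory, so the check is best run from the directory that holds the makefile.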

Code: Select all

.SUFFIXES: .inc .f .f90 .F
SUFFIX=.f90

FFLAGS =  -I/opt/intel/Compiler/11.1/072/mkl/include/fftw -FR -lowercase -assume byterecl -ftz -heap-arrays

#-----------------------------------------------------------------------
# optimization
# -O3 seems best
#-----------------------------------------------------------------------

OFLAG=-O3

OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG  = -FR -O0
INLINE = $(OFLAG)


#-----------------------------------------------------------------------
# the following lines specify the position of BLAS  and LAPACK
#-----------------------------------------------------------------------

BLAS= -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t/ -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread

# use the mkl Intel lapack
LAPACK= -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t/ -lmkl_lapack -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread

#-----------------------------------------------------------------------

LIB  = -L../vasp.5.lib -ldmy \
     ../vasp.5.lib/linpack_double.o $(LAPACK) \
     $(BLAS)

# options for linking (for compiler version 6.X) nothing is required
LINK    =

#=======================================================================
# MPI section
#
# the system we used is an SGI test system, and it is best
# to compile using ifort and adding the option -lmpi during
# linking
#=======================================================================

FC=mpiifort
FCL=$(FC)

#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf               charge density   reduced in Z direction
# wNGZhalf              gamma point only reduced in Z direction
# scaLAPACK             use scaLAPACK (usually slower on 100 Mbit Net)
#-----------------------------------------------------------------------

CPP    = $(CPP_) -DMPI  -DHOST=\"LinuxIFCmkl\" -DIFC \
     -Dkind8 -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \
     -DMPI_BLOCK=8000 \
     -DRPROMU_DGEMV  -DRACCMU_DGEMV

SCA=

#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------

LIB     = -L../vasp.5.lib -ldmy  \
      ../vasp.5.lib/linpack_double.o $(LAPACK) \
      $(SCA) $(BLAS) \
      -lmpi

FFT3D   = fftmpi.o fftmpi_map.o fftw3d.o   fft3dlib.o  /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libfftw3xf_gnu.a

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC=   symmetry.o symlib.o   lattlib.o  random.o

SOURCE=  base.o     mpi.o      smart_allocate.o      xml.o  \
         constant.o jacobi.o   main_mpi.o  scala.o   \
         asa.o      lattice.o  poscar.o   ini.o       xclib.o     xclib_grad.o \
         radial.o   pseudo.o   mgrid.o    gridq.o     ebs.o  \
         mkpoints.o wave.o     wave_mpi.o  wave_high.o  \
         $(BASIC)   nonl.o     nonlr.o    nonl_high.o dfast.o    choleski2.o \
         mix.o      hamil.o    xcgrad.o   xcspin.o    potex1.o   potex2.o  \
         metagga.o constrmag.o cl_shift.o relativistic.o LDApU.o \
         paw_base.o egrad.o    pawsym.o   pawfock.o  pawlhf.o    paw.o   \
         mkpoints_full.o       charge.o   dipol.o    pot.o  \
         dos.o      elf.o      tet.o      tetweight.o hamil_rot.o \
         steep.o    chain.o    dyna.o     sphpro.o    us.o  core_rel.o \
         aedens.o   wavpre.o   wavpre_noio.o broyden.o \
         dynbr.o    rmm-diis.o reader.o   writer.o   tutor.o xml_writer.o \
         brent.o    stufak.o   fileio.o   opergrid.o stepver.o  \
         chgloc.o   fast_aug.o fock.o     mkpoints_change.o sym_grad.o \
         mymath.o   internals.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
         hamil_high.o nmr.o    force.o \
         pead.o     subrot.o   subrot_scf.o pwlhf.o  gw_model.o optreal.o   davidson.o \
         electron.o rot.o  electron_all.o shm.o    pardens.o  paircorrection.o \
         optics.o   constr_cell_relax.o   stm.o    finite_diff.o elpol.o    \
         hamil_lr.o rmm-diis_lr.o  subrot_cluster.o subrot_lr.o \
         lr_helper.o hamil_lrf.o   elinear_response.o ilinear_response.o \
         linear_optics.o linear_response.o   \
         setlocalpp.o  wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
         ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
         ump2.o bse.o acfdt.o chi.o sydmat.o

INC=

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
	rm -f vasp
	$(FCL) -o vasp main.o  $(SOURCE)   $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
	$(FCL) -o makeparam  $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
	$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
	$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
	$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
	$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
	-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one structure is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.inc wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
	$(CPP)
	$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
	$(CPP)
	$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
	$(CPP)
$(SUFFIX).o:
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

# special rules
#-----------------------------------------------------------------------
# these special rules are cumulative (that is once failed
#   in one compiler version, stays in the list forever)
# performance penalties are small however


fft3dlib.o : fft3dlib.F
	$(CPP)
	$(FC) -FR -lowercase -O3 -ip -ftz -c $*$(SUFFIX)
fft3dfurth.o : fft3dfurth.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

radial.o : radial.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

rot.o : rot.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

symlib.o : symlib.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

acfdt.o : acfdt.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

chi.o : chi.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
poscar.o : poscar.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

chi_base.o : chi_base.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

symmetry.o : symmetry.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

pead.o : pead.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

dynbr.o : dynbr.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

electron_all.o : electron_all.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

asa.o : asa.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

broyden.o : broyden.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -ftz -c $*$(SUFFIX)

us.o : us.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)

LDApU.o : LDApU.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -ftz -c $*$(SUFFIX)

Then, to test the parallel build against the serial build, I ran each of the Hands On Example files found here. Every example ran without errors, and the parallel build was faster than the serial build in every case.
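
A comparison of this kind can be sketched with a small timing helper. This is only an illustration, not the exact procedure used for the runs above: the helper name elapsed and the binary names in the comments are assumptions, and sleep stands in for VASP so that the sketch runs anywhere.

```shell
#!/bin/sh
# elapsed (hypothetical helper): run a command and print its wall-clock time
# in whole seconds. The command's own terminal output is discarded.
elapsed() {
    start=$(date +%s)
    "$@" > /dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

# Stand-in command so the sketch runs anywhere:
elapsed sleep 1

# Inside an example directory one might compare (binary names are assumptions):
#   elapsed ./vasp_serial
#   elapsed mpirun -np 4 ./vasp_parallel
```

Whole-second resolution is coarse, so very short examples may show no measurable difference between the two builds.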

Good Luck!

#2 Post by **support_vasp** » Wed Sep 04, 2024 12:23 pm

Hi,

We're sorry that we didn’t answer your question. This does not live up to the quality of support that we aim to provide. The team has since expanded. If we can still help with your problem, please ask again in a new post, linking to this one, and we will answer as quickly as possible.

Best wishes,

VASP
