system:
- 4 x AMD Opteron(tm) Processor 850
- Arch Linux (X86-64)
- Intel Fortran Compiler 11.1
- GNU C/C++ Compiler 4.5.0
- OpenMPI 1.4.1
- VASP 5.2
steps:
- install the Intel Fortran Compiler (ifort)
- install OpenMPI
- build the FFTW 3.x Fortran wrapper library
- install the BLAS and LAPACK libraries
- build the VASP libraries
- build VASP
ifort
The Intel Fortran Compiler works well, but its installer requires lib32-gcc even though I am installing the 64-bit version. Without that package the installation fails silently, with no error message, so it took me a while to track down the problem.
OpenMPI
OpenMPI must be compiled with ifort as the Fortran compiler, because neither gfortran nor g95 seems able to compile VASP as of now. This is as easy as substituting 'ifort' for 'gfortran' or 'g95' in the makefile. To keep my system organized, I used the PKGBUILD available in the Arch Linux User Repository (AUR) and changed gfortran to ifort.
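For anyone who prefers to script that substitution, here is a minimal sketch. It assumes the PKGBUILD names the compiler literally as gfortran; the helper name use_ifort is my own and not part of any tool, so check the resulting file by eye before building.

```shell
# use_ifort FILE: replace every gfortran reference in FILE with ifort.
# The original is kept next to it as FILE.bak in case something goes wrong.
use_ifort() {
    sed -i.bak 's/gfortran/ifort/g' "$1"
}

# Usage (run in the directory holding the downloaded PKGBUILD):
#   use_ifort PKGBUILD
#   makepkg
```

Keeping the .bak copy makes it easy to diff the two files and see exactly which lines changed.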
FFTW 3.x
After many attempts at compiling VASP, I stumbled onto this article on Intel's website. I followed all of its advice with one slight deviation: I do not have the Intel C compiler, so I built the wrapper library with the GNU compiler instead.
make libem64t compiler=gnu
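Before moving on, it may be worth confirming that the build actually produced the static wrapper library, since the VASP makefile later links against it in the FFT3D line. A small sketch follows; the function name check_wrapper is hypothetical, and the path in the usage comment is the one from my FFT3D line, so adjust it to your MKL installation.

```shell
# check_wrapper PATH: succeed only if the file exists and is not empty.
check_wrapper() {
    if [ -s "$1" ]; then
        echo "found: $1"
    else
        echo "not found: $1" >&2
        return 1
    fi
}

# Usage, with the library path used in the FFT3D line of the VASP makefile:
#   check_wrapper /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libfftw3xf_gnu.a
```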
BLAS and LAPACK
Not much to do here: I just used the BLAS and LAPACK libraries from the regular Arch Linux repositories.
VASP Libraries
I edited the FC line of the makefile.linux_efc_itanium file slightly to use OpenMPI in conjunction with ifort:
.SUFFIXES: .inc .f .F
CPP     = gcc -E -P -C $*.F >$*.f
FC=mpiifort
CFLAGS = -O
FFLAGS = -O1 -FI
FREE   =  -FR
DOBJ =  preclib.o timing_.o derrf_.o dclock_.o  diolib.o dlexlib.o drdatab.o
#-----------------------------------------------------------------------
# general rules
#-----------------------------------------------------------------------
libdmy.a: $(DOBJ) lapack_double.o linpack_double.o lapack_atlas.o
	-rm libdmy.a
	ar vq libdmy.a $(DOBJ)
# files which do not require autodouble
lapack_min.o: lapack_min.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_min.f
lapack_double.o: lapack_double.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_double.f
lapack_single.o: lapack_single.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_single.f
lapack_atlas.o: lapack_atlas.f
	$(FC) $(FFLAGS) $(NOFREE) -c lapack_atlas.f
linpack_double.o: linpack_double.f
	$(FC) $(FFLAGS) $(NOFREE) -c linpack_double.f
linpack_single.o: linpack_single.f
	$(FC) $(FFLAGS) $(NOFREE) -c linpack_single.f
.c.o:
	$(CC) $(CFLAGS) -c $*.c
.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(FREE) $(INCS) -c $*.f
.F.f:
	$(CPP)
.f.o:
	$(FC) $(FFLAGS) $(FREE) $(INCS) -c $*.f
VASP
I once again edited the makefile.linux_efc_itanium file to suit my needs. The changes I made were:
- set the Fortran compiler to mpiifort
- change the Fortran flags line to match the one on Intel's website, as mentioned above
- add -heap-arrays to FFLAGS to avoid segfaults (as per this forum post)
- change the BLAS, LAPACK and FFT3D lines to point to the proper locations of the libraries they need
- uncomment a few lines in the MPI section
.SUFFIXES: .inc .f .f90 .F
SUFFIX=.f90
FFLAGS =  -I/opt/intel/Compiler/11.1/072/mkl/include/fftw -FR -lowercase -assume byterecl -ftz -heap-arrays
#-----------------------------------------------------------------------
# optimization
# -O3 seems best
#-----------------------------------------------------------------------
OFLAG=-O3
OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG  = -FR -O0
INLINE = $(OFLAG)
#-----------------------------------------------------------------------
# the following lines specify the position of BLAS  and LAPACK
#-----------------------------------------------------------------------
BLAS= -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t/ -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread
# use the mkl Intel lapack
LAPACK= -L/opt/intel/Compiler/11.1/072/mkl/lib/em64t/ -lmkl_lapack -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread
#-----------------------------------------------------------------------
LIB  = -L../vasp.5.lib -ldmy \
     ../vasp.5.lib/linpack_double.o $(LAPACK) \
     $(BLAS)
# options for linking (for compiler version 6.X) nothing is required
LINK    =
#=======================================================================
# MPI section
#
# the system we used is an SGI test system, and it is best
# to compile using ifort and adding the option -lmpi during
# linking
#=======================================================================
FC=mpiifort
FCL=$(FC)
#-----------------------------------------------------------------------
# additional options for CPP in parallel version (see also above):
# NGZhalf               charge density   reduced in Z direction
# wNGZhalf              gamma point only reduced in Z direction
# scaLAPACK             use scaLAPACK (usually slower on 100 Mbit Net)
#-----------------------------------------------------------------------
CPP    = $(CPP_) -DMPI  -DHOST=\"LinuxIFCmkl\" -DIFC \
     -Dkind8 -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \
     -DMPI_BLOCK=8000 \
     -DRPROMU_DGEMV  -DRACCMU_DGEMV
SCA=
#-----------------------------------------------------------------------
# libraries for mpi
#-----------------------------------------------------------------------
LIB     = -L../vasp.5.lib -ldmy  \
      ../vasp.5.lib/linpack_double.o $(LAPACK) \
      $(SCA) $(BLAS) \
      -lmpi
FFT3D   = fftmpi.o fftmpi_map.o fftw3d.o   fft3dlib.o  /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libfftw3xf_gnu.a
#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC=   symmetry.o symlib.o   lattlib.o  random.o
SOURCE=  base.o     mpi.o      smart_allocate.o      xml.o  \
         constant.o jacobi.o   main_mpi.o  scala.o   \
         asa.o      lattice.o  poscar.o   ini.o       xclib.o     xclib_grad.o \
         radial.o   pseudo.o   mgrid.o    gridq.o     ebs.o  \
         mkpoints.o wave.o     wave_mpi.o  wave_high.o  \
         $(BASIC)   nonl.o     nonlr.o    nonl_high.o dfast.o    choleski2.o \
         mix.o      hamil.o    xcgrad.o   xcspin.o    potex1.o   potex2.o  \
         metagga.o constrmag.o cl_shift.o relativistic.o LDApU.o \
         paw_base.o egrad.o    pawsym.o   pawfock.o  pawlhf.o    paw.o   \
         mkpoints_full.o       charge.o   dipol.o    pot.o  \
         dos.o      elf.o      tet.o      tetweight.o hamil_rot.o \
         steep.o    chain.o    dyna.o     sphpro.o    us.o  core_rel.o \
         aedens.o   wavpre.o   wavpre_noio.o broyden.o \
         dynbr.o    rmm-diis.o reader.o   writer.o   tutor.o xml_writer.o \
         brent.o    stufak.o   fileio.o   opergrid.o stepver.o  \
         chgloc.o   fast_aug.o fock.o     mkpoints_change.o sym_grad.o \
         mymath.o   internals.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
         hamil_high.o nmr.o    force.o \
         pead.o     subrot.o   subrot_scf.o pwlhf.o  gw_model.o optreal.o   davidson.o \
         electron.o rot.o  electron_all.o shm.o    pardens.o  paircorrection.o \
         optics.o   constr_cell_relax.o   stm.o    finite_diff.o elpol.o    \
         hamil_lr.o rmm-diis_lr.o  subrot_cluster.o subrot_lr.o \
         lr_helper.o hamil_lrf.o   elinear_response.o ilinear_response.o \
         linear_optics.o linear_response.o   \
         setlocalpp.o  wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
         ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
         ump2.o bse.o acfdt.o chi.o sydmat.o
INC=
vasp: $(SOURCE) $(FFT3D) $(INC) main.o
	rm -f vasp
	$(FCL) -o vasp main.o  $(SOURCE)   $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
	$(FCL) -o makeparam  $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
	$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
	$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
	$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
	$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)
clean:
	-rm -f *.g *.f *.o *.L *.mod ; touch *.F
main.o: main$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcspin$(SUFFIX)
makeparam.o: makeparam$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c makeparam$(SUFFIX)
makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one structure is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.inc wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F
$(OBJ_HIGH):
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
	$(CPP)
	$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)
fft3dlib_f77.o: fft3dlib_f77.F
	$(CPP)
	$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)
.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
	$(CPP)
$(SUFFIX).o:
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
# special rules
#-----------------------------------------------------------------------
# these special rules are cumulative (that is once failed
#   in one compiler version, stays in the list forever)
# performance penalties are small however
fft3dlib.o : fft3dlib.F
	$(CPP)
	$(FC) -FR -lowercase -O3 -ip -ftz -c $*$(SUFFIX)
fft3dfurth.o : fft3dfurth.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
radial.o : radial.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
rot.o : rot.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
symlib.o : symlib.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
acfdt.o : acfdt.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
chi.o : chi.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
poscar.o : poscar.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
chi_base.o : chi_base.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
symmetry.o : symmetry.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
pead.o : pead.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
dynbr.o : dynbr.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
electron_all.o : electron_all.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
asa.o : asa.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
broyden.o : broyden.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -ftz -c $*$(SUFFIX)
us.o : us.F
	$(CPP)
	$(FC) -FR -lowercase -O1 -ftz -c $*$(SUFFIX)
LDApU.o : LDApU.F
	$(CPP)
	$(FC) -FR -lowercase -O2 -ftz -c $*$(SUFFIX)
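Since a single missed edit in a makefile this long is easy to overlook, a quick grep-based check before running make can save a failed build. This is only a sketch: the function name check_makefile is my own, and the three patterns simply mirror the settings shown above (FC=mpiifort, -heap-arrays in FFLAGS, -DMPI in CPP), so adapt them if your makefile differs.

```shell
# check_makefile FILE: report which of the expected edits are missing.
# Returns 0 if all are present, 1 otherwise.
check_makefile() {
    file=$1
    status=0
    # The MPI compiler wrapper must be selected.
    grep -q '^FC=mpiifort' "$file" || { echo "missing: FC=mpiifort"; status=1; }
    # -heap-arrays must appear on the FFLAGS line.
    grep -q '^FFLAGS.*-heap-arrays' "$file" || { echo "missing: -heap-arrays in FFLAGS"; status=1; }
    # The MPI define must be on an active (uncommented) CPP line.
    grep -q '^CPP.*-DMPI' "$file" || { echo "missing: -DMPI in CPP"; status=1; }
    return $status
}

# Usage (run in the VASP source directory):
#   check_makefile makefile.linux_efc_itanium && make -f makefile.linux_efc_itanium
```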
Good Luck!