My Community

Posted: **Tue Dec 14, 2010 3:47 pm**

Dear All,
Prof. WA Hofer provided the following Makefile for us to use on SciNet.
With the IBM xlf90 Fortran compiler (invoked through the mpxlf90 MPI wrapper), several modules must be compiled at reduced optimization levels (see the special rules at the end of the Makefile).

For posterity, here is the Makefile.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
.SUFFIXES: .inc .f .F
#-----------------------------------------------------------------------
# Makefile for SciNet (IBM P6 cluster with IBM xlf90 compiler)
# using mpxlf90 wrapper for MP interface
# edited by WA Hofer from
# Makefile for IBM p690 (HLRN) parallel
# $Id: makefile.hlrn,v 1.2 2005/11/04 14:42:40 bzfbbk Exp $
# supplied by Bernd Kallies <kallies@zib.de>
#
# NB See special compilation rules for individual modules at bottom
# of the Makefile. These were set by WA Hofer to simulate INTEL
# compiler options (VASP5 works under INTEL but needs special treatment
# to work with mpxlf90).
#
#-----------------------------------------------------------------------

# all CPP processed fortran files have the extension .f
SUFFIX=.f

#-----------------------------------------------------------------------
# fortran compiler and linker
#-----------------------------------------------------------------------
FC = mpxlf90 -qfree=f90
F77 = mpxlf
FCL = $(FC)

#-----------------------------------------------------------------------
# C-preprocessor: define any of the flags given below
# MPI        generate parallel version
# NGZhalf    charge density reduced in Z direction
# wNGZhalf   gamma point only reduced in Z direction
# CACHE_SIZE 5001 for SP3 and Power 3
#            32768 for 550,590,3CT
#            8001 for 595/397 quad word systems
# scaLAPACK  use scaLAPACK
# NB CACHE_SIZE is defined twice in the CPP line below (5001 and 12000)
#-----------------------------------------------------------------------
CPP = /usr/ccs/lib/cpp -P -DHOST=\"SP2/3/4\" -DMPI -DUSE_ZHEEVX -DCACHE_SIZE=5001 -Dessl \
-Dkind8 -DCACHE_SIZE=12000 -DPGF90 -Davoidalloc -DNGZhalf \
$*.F >$*$(SUFFIX)

#-----------------------------------------------------------------------
# general fortran flags, none required
#-----------------------------------------------------------------------
FFLAGS = -qmaxmem=-1 -qarch=auto -qtune=auto -qcache=auto -qinitauto -qcheck -qsave=all
#FFLAGS = -qmaxmem=-1 -qarch=auto -qtune=auto -qcache=auto -qinitauto
#FFLAGS = -qmaxmem=-1 -qarch=auto -qtune=auto -qcache=auto -qinitauto -qsave=all

#-----------------------------------------------------------------------
# optimization:
# optimise for the machine on which the code is compiled
#-----------------------------------------------------------------------
OFLAG_0 = -O0
OFLAG_1 = -O1
OFLAG_2 = -O2 -qstrict
OFLAG_3 = -O3 -qarch=auto -qstrict
OFLAG_4 = -O4 -qstrict -qhot
DEBUG = -g -qfullpath
INCS =
OFLAG = $(OFLAG_3)
#OFLAG = $(DEBUG)
INLINE = $(OFLAG)

pseudo.o: OFLAG = $(OFLAG_0)
paw.o: OFLAG = $(OFLAG_1)

#-----------------------------------------------------------------------
# options for linking
# the following option increases the size of the data frame
#-----------------------------------------------------------------------
# see if removing the next line helps execution
#LINK = -bmaxdata:0x80000000 -bmaxstack:0x10000000
#LIB = -Lvasp.5.lib -ldmy -lessl -L/aws/numerics/lapack -llapack
# add next line from vasp.4.6 Makefile
# WA Hofer prefers to use "static" linking - gives a more stable build
LINK = -q64 -static -bnoquiet
LIB = -L../vasp.5.lib -ldmy ../vasp.5.lib/linpack_double.o -lessl -L/usr/local/lib -llapack
#LIB = -L../vasp.5.lib -ldmy ../vasp.5.lib/linpack_double.o -lessl

#-----------------------------------------------------------------------
# specify 3d-fft to be used with VASP
# fft3dessl is usually fastest on the IBM; however, fft3dfurth comes
# very close and is faster for 2^n
#-----------------------------------------------------------------------
FFT3D = fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o
#FFT3D = fftmpi.o fftmpi_map.o fftw3d.o fft3dlib.o

#-----------------------------------------------------------------------
# general rules and compile lines
#-----------------------------------------------------------------------
BASIC= symmetry.o symlib.o lattlib.o random.o

SOURCE= base.o mpi.o smart_allocate.o xml.o \
constant.o jacobi.o main_mpi.o scala.o \
asa.o lattice.o poscar.o ini.o xclib.o xclib_grad.o \
radial.o pseudo.o mgrid.o gridq.o ebs.o \
mkpoints.o wave.o wave_mpi.o wave_high.o \
$(BASIC) nonl.o nonlr.o nonl_high.o dfast.o choleski2.o \
mix.o hamil.o xcgrad.o xcspin.o potex1.o potex2.o \
constrmag.o cl_shift.o relativistic.o LDApU.o \
paw_base.o metagga.o egrad.o pawsym.o pawfock.o pawlhf.o rhfatm.o paw.o \
mkpoints_full.o charge.o Lebedev-Laikov.o stockholder.o dipol.o pot.o \
dos.o elf.o tet.o tetweight.o hamil_rot.o \
steep.o chain.o dyna.o sphpro.o us.o core_rel.o \
aedens.o wavpre.o wavpre_noio.o broyden.o \
dynbr.o rmm-diis.o reader.o writer.o tutor.o xml_writer.o \
brent.o stufak.o fileio.o opergrid.o stepver.o \
chgloc.o fast_aug.o fock.o mkpoints_change.o sym_grad.o \
mymath.o optengines.o internals.o hessian.o gadget.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \
hamil_high.o nmr.o force.o \
pead.o mlwf.o subrot.o subrot_scf.o pwlhf.o gw_model.o optreal.o davidson.o david_inner.o \
electron.o rot.o electron_all.o shm.o pardens.o paircorrection.o \
optics.o constr_cell_relax.o stm.o finite_diff.o elpol.o \
hamil_lr.o rmm-diis_lr.o subrot_cluster.o subrot_lr.o \
lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o \
linear_optics.o linear_response.o \
setlocalpp.o wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \
ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o local_field.o \
ump2.o bse.o acfdt.o chi.o sydmat.o

INC=

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
	rm -f vasp
	$(FCL) -o vasp main.o $(SOURCE) $(FFT3D) $(LIB) $(LINK)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
	$(FCL) -o makeparam $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
	$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
	$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
	$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
	$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
	-rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG) $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
	$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
	$(FC) $(FFLAGS) $(DEBUG) $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F
#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one structure is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.inc wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
	$(CPP)
	$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
	$(CPP)
	$(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
	$(CPP)
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
	$(CPP)
$(SUFFIX).o:
	$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

#-----------------------------------------------------------------------
# special rules
#-----------------------------------------------------------------------
fft3dfurth.o : fft3dfurth.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

fftw3d.o : fftw3d.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

#radial.o : radial.F
#	$(CPP)
#	$(FC) -O1 -c $*$(SUFFIX)

symlib.o : symlib.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

symmetry.o : symmetry.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

dynbr.o : dynbr.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

broyden.o : broyden.F
	$(CPP)
	$(FC) -O2 -c $*$(SUFFIX)

us.o : us.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

wave.o : wave.F
	$(CPP)
	$(FC) -O0 -c $*$(SUFFIX)

LDApU.o : LDApU.F
	$(CPP)
	$(FC) -O2 -c $*$(SUFFIX)

fftmpi_map.o : fftmpi_map.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

dos.o : dos.F
	$(CPP)
	$(FC) -O2 -qstrict -c $*$(SUFFIX)

electron.o: electron.F
	$(CPP)
	$(FC) -O2 -qstrict -c $*$(SUFFIX)

paw.o: paw.F
	$(CPP)
	$(FC) -O2 -qstrict -c $*$(SUFFIX)

++++++++++++++++++++++++++++++++++++++++++++++++++++++
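
The special rules at the end of the Makefile repeat the same three-line recipe for each de-optimized module. A minimal sketch of a more compact form follows, assuming GNU make (which the target-specific `pseudo.o: OFLAG = ...` lines appear to rely on already). The variable names `OBJ_O0`, `OBJ_O1`, `OBJ_O2` and `OBJ_O2S` are hypothetical and not part of the original Makefile. These rules would replace the explicit per-module rules rather than sit alongside them.

```make
# Sketch only: group the de-optimized modules by optimization level.
# OBJ_O0, OBJ_O1, OBJ_O2 and OBJ_O2S are hypothetical names.
OBJ_O0  = wave.o
OBJ_O1  = fft3dfurth.o fftw3d.o symlib.o symmetry.o dynbr.o us.o fftmpi_map.o
OBJ_O2  = broyden.o LDApU.o
OBJ_O2S = dos.o electron.o paw.o

# Static pattern rules: $* is the stem (module name without extension),
# so $(CPP) and $*$(SUFFIX) expand as they do in the explicit rules.
$(OBJ_O0): %.o: %.F
	$(CPP)
	$(FC) -O0 -c $*$(SUFFIX)

$(OBJ_O1): %.o: %.F
	$(CPP)
	$(FC) -O1 -c $*$(SUFFIX)

$(OBJ_O2): %.o: %.F
	$(CPP)
	$(FC) -O2 -c $*$(SUFFIX)

$(OBJ_O2S): %.o: %.F
	$(CPP)
	$(FC) -O2 -qstrict -c $*$(SUFFIX)
```

As in the original special rules, these recipes do not pass `$(FFLAGS)`, so the general flags are not applied to the listed modules.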

Posted: **Mon Dec 27, 2010 1:26 pm**

Please download the latest version of vasp.5.2.11 (dated Dec 23rd, 2010) from our server or the vasp5-portal. It includes the most recent fixes for bugs that do not show up with the Intel compiler, and it was tested on IBM using mpxlf90. Please have a look at the News posted on the vasp5-portal for further information.

Posted: **Sat Jan 08, 2011 6:34 pm**

Has anyone successfully used vasp5.2.X on a large number of cores (256 or 512) with scaLAPACK on AIX Power6?
I have 5.2.11 compiled with scaLAPACK, but it cannot be used with a large number of cores.

If anyone has a tip, or has successfully compiled with scaLAPACK, please share the Makefile.

Thank you.
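
For reference, a minimal sketch of the two places where the Makefile posted above would typically change to enable scaLAPACK: the `-DscaLAPACK` preprocessor flag listed in its comments, and the link line. `CPP_EXTRA`, `SCA_DIR`, `SCA` and the library file name are hypothetical placeholders, not taken from a tested build. Depending on the scaLAPACK installation, separate BLACS libraries may also need to be linked. This does not by itself address the failure at 256 or 512 cores.

```make
# Sketch only: CPP_EXTRA, SCA_DIR and SCA are hypothetical names,
# and the library path is a placeholder for the local installation.
CPP_EXTRA = -DscaLAPACK

# Original CPP line from the Makefile above, with $(CPP_EXTRA) added.
CPP = /usr/ccs/lib/cpp -P -DHOST=\"SP2/3/4\" -DMPI -DUSE_ZHEEVX -DCACHE_SIZE=5001 -Dessl \
      -Dkind8 -DCACHE_SIZE=12000 -DPGF90 -Davoidalloc -DNGZhalf $(CPP_EXTRA) \
      $*.F >$*$(SUFFIX)

SCA_DIR = /path/to/scalapack
SCA     = $(SCA_DIR)/libscalapack.a

# scaLAPACK is placed ahead of ESSL and LAPACK so its references resolve.
LIB = -L../vasp.5.lib -ldmy ../vasp.5.lib/linpack_double.o \
      $(SCA) -lessl -L/usr/local/lib -llapack
```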

Makefile for vasp.5.2.11 build with IBM mpxlf90 compiler for IBM Power 6 cluster AIX os