Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.
-
richie_fong
- Newbie
- Posts: 3
- Joined: Fri Mar 18, 2022 4:43 pm
#1
Post
by richie_fong » Fri Sep 30, 2022 4:10 am
Hi! I have compiled VASP 6.3.2 with OpenACC + OpenMP using the makefile.include.nvhpc_ompi_mkl_omp_acc shown below, and ran 'make test' successfully. However, when I ran the job script shown below on an HPC allocation with 4 MPI ranks (1 rank per GPU) and 12 OpenMP threads per rank, on a node with a 48-core (24 cores per socket) AMD Milan 7413 and 4x Nvidia A100, it failed with the error message shown in the output file below. The mpirun command I used is given in the job script. I hope to receive some advice on this issue. Thank you!
Job script
mpirun -np 4 --map-by ppr:2:socket:PE=12 --bind-to core \
-x OMP_NUM_THREADS=12 -x OMP_STACKSIZE=512m \
-x OMP_PLACES=cores -x OMP_PROC_BIND=close \
--report-bindings vasp_std
Output file
----------------------------------------------------
OOO PPPP EEEEE N N M M PPPP
O O P P E NN N MM MM P P
O O PPPP EEEEE N N N M M M PPPP -- VERSION
O O P E N NN M M P
OOO P EEEEE N N M M P
----------------------------------------------------
running 4 mpi-ranks, with 12 threads/rank
distrk: each k-point on 1 cores, 4 groups
distr: one band on 1 cores, 1 groups
OpenACC runtime initialized ... 4 GPUs detected
vasp.6.3.2 27Jun22 (build Sep 28 2022 21:15:38) complex
POSCAR found type information on POSCAR LiMnNbO
POSCAR found : 4 types and 64 ions
Reading from existing POTCAR
scaLAPACK will be used selectively (only on CPU)
FATAL ERROR: data in update device clause was not found on device 4: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/std/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 3: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/std/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 1: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/std/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 2: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/std/fock.f90 xc_fock_reader line:567
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[9289,1],3]
Exit code: 1
--------------------------------------------------------------------------
Makefile.include
# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxNV\" \
-DMPI -DMPI_BLOCK=8000 -Duse_collective \
-DscaLAPACK \
-DCACHE_SIZE=4000 \
-Davoidalloc \
-Dvasp6 \
-Duse_bse_te \
-Dtbdyn \
-Dqd_emulate \
-Dfock_dblbuf \
-D_OPENMP \
-D_OPENACC \
-DUSENCCL -DUSENCCLP2P
CPP = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX) > $*$(SUFFIX)
# N.B.: you might need to change the cuda-version here
# to one that comes with your NVIDIA-HPC SDK
FC = mpif90 -acc -gpu=cc60,cc70,cc80,cuda11.7 -mp
FCL = mpif90 -acc -gpu=cc60,cc70,cc80,cuda11.7 -mp -c++libs
FREE = -Mfree
FFLAGS = -Mbackslash -Mlarge_arrays
OFLAG = -fast
DEBUG = -Mfree -O0 -traceback
OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
LLIBS = -cudalib=cublas,cusolver,cufft,nccl -cuda
# Redefine the standard list of O1 and O2 objects
SOURCE_O1 := pade_fit.o
SOURCE_O2 := pead.o
# For what used to be vasp.5.lib
CPP_LIB = $(CPP)
FC_LIB = nvfortran
CC_LIB = nvc -w
CFLAGS_LIB = -O
FFLAGS_LIB = -O1 -Mfixed
FREE_LIB = $(FREE)
OBJECTS_LIB = linpack_double.o
# For the parser library
CXX_PARS = nvc++ --no_warnings
##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##
# When compiling on the target machine itself , change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU ?= -tp host
FFLAGS += $(VASP_TARGET_CPU)
# Specify your NV HPC-SDK installation (mandatory)
#... first try to set it automatically
NVROOT =$(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')
# If the above fails, then NVROOT needs to be set manually
#NVHPC ?= /home/ljfong/VASP632/nvhpc
#NVVERSION = 22.7
#NVROOT = $(NVHPC)/Linux_x86_64/$(NVVERSION)
## Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2
OFLAG_IN = -fast -Mwarperf
SOURCE_IN := nonlr.o
# Software emulation of quadruple precision (mandatory)
QD ?= $(NVROOT)/compilers/extras/qd
LLIBS += -L$(QD)/lib -lqdmod -lqd
INCS += -I$(QD)/include/qd
# Intel MKL for FFTW, BLAS, LAPACK, and scaLAPACK
MKLROOT ?= /cvmfs/soft.computecanada.ca/easybuild/software/2020/Core/imkl/2022.1.0
LLIBS_MKL = -Mmkl -L$(MKLROOT)/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64
INCS += -I$(MKLROOT)/include/fftw
# Use a separate scaLAPACK installation (optional but recommended in combination with OpenMPI)
# Comment out the two lines below if you want to use scaLAPACK from MKL instead
#SCALAPACK_ROOT ?= /path/to/your/scalapack/installation
#LLIBS_MKL = -L$(SCALAPACK_ROOT)/lib -lscalapack -Mmkl
LLIBS += $(LLIBS_MKL)
# HDF5-support (optional but strongly recommended)
CPP_OPTIONS+= -DVASP_HDF5
HDF5_ROOT ?= /home/ljfong/VASP632/hdf5
LLIBS += -L$(HDF5_ROOT)/lib -lhdf5_fortran
INCS += -I$(HDF5_ROOT)/include
# For the VASP-2-Wannier90 interface (optional)
#CPP_OPTIONS += -DVASP2WANNIER90
#WANNIER90_ROOT ?= /home/ljfong/VASP632/wannier/wannier90-3.1.0
#LLIBS += -L$(WANNIER90_ROOT)/lib -lwannier
# For the fftlib library (hardly any benefit for the OpenACC GPU port, especially in combination with MKL's FFTs)
#CPP_OPTIONS+= -Dsysv
#FCL += fftlib.o
#CXX_FFTLIB = nvc++ -mp --no_warnings -std=c++11 -DFFTLIB_USE_MKL -DFFTLIB_THREADSAFE
#INCS_FFTLIB = -I./include -I$(MKLROOT)/include/fftw
#LIBS += fftlib
#LLIBS += -ldl
-
martin.schlipf
- Global Moderator
- Posts: 542
- Joined: Fri Nov 08, 2019 7:18 am
#2
Post
by martin.schlipf » Fri Sep 30, 2022 7:46 am
Did you run your job script on one of the tests in the testsuite or on your own input?
If you have not done so already, please try to reproduce the failure with one of the tests in the testsuite. The README of the testsuite (specifically section 2.2) explains how to run the testsuite with your MPI options. If all the tests in the testsuite pass even with your MPI options, then the failure would be something triggered by your specific input.
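As a minimal sketch of reusing the job-script options for the testsuite: this assumes the testsuite picks up its launch commands from the environment variables VASP_TESTSUITE_EXE_STD, VASP_TESTSUITE_EXE_GAM and VASP_TESTSUITE_EXE_NCL, and that the binaries live in a hypothetical $VASP_BIN directory. Please check section 2.2 of the README for the mechanism your version actually supports.

```shell
#!/bin/sh
# Hypothetical location of the VASP binaries; adjust to your build.
VASP_BIN="${VASP_BIN:-$HOME/vasp.6.3.2/bin}"

# MPI/OpenMP options copied from the job script in the first post.
MPI_OPTS="mpirun -np 4 --map-by ppr:2:socket:PE=12 --bind-to core \
-x OMP_NUM_THREADS=12 -x OMP_STACKSIZE=512m \
-x OMP_PLACES=cores -x OMP_PROC_BIND=close"

# One launch command per VASP flavour (assumed variable names, see above).
export VASP_TESTSUITE_EXE_STD="$MPI_OPTS $VASP_BIN/vasp_std"
export VASP_TESTSUITE_EXE_GAM="$MPI_OPTS $VASP_BIN/vasp_gam"
export VASP_TESTSUITE_EXE_NCL="$MPI_OPTS $VASP_BIN/vasp_ncl"

# Print the commands so they can be compared with the header of testsuite.log.
printf '%s\n' "$VASP_TESTSUITE_EXE_STD" "$VASP_TESTSUITE_EXE_GAM" "$VASP_TESTSUITE_EXE_NCL"
```

With these exported in the same shell, running make test should then use the same rank and thread layout as the failing job.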
Martin Schlipf
VASP developer
by martin.schlipf » Fri Sep 30, 2022 8:13 am
One more thing you can try is suppressing the OpenMP parallelization with export OMP_NUM_THREADS=1.
If you cannot reproduce the failure with any test in the testsuite, please provide a complete set of input files, so that we can try to reproduce it locally.
Martin Schlipf
VASP developer
by richie_fong » Fri Sep 30, 2022 3:18 pm
Thank you for the reply. As suggested, I tried both export OMP_NUM_THREADS=1 and running the testsuite with my MPI options. However, the same issue still occurs. I have attached the testsuite.log.
Lmod is automatically replacing "intel/2020.1.217" with "nvhpc/22.7".
Lmod is automatically replacing "intel/2020.1.217" with "nvhpc/22.7".
==================================================================
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
VASP TESTSUITE SHA:
Reference files have been generated with 4 MPI ranks.
Note that tests might fail if an other number of ranks is used!
Executables and additional INCAR tags used for this test:
VASP_TESTSUITE_EXE_STD="mpirun -np 4 --map-by ppr:2:socket:PE=12 --bind-to core -x OMP_NUM_THREADS=1 -x OMP_STACKSIZE=512m -x OMP_PLACES=cores -x OMP_PROC_BIND=close /home/ljfong/VASP632/vasp.6.3.2/testsuite/../bin/vasp_std"
VASP_TESTSUITE_EXE_NCL="mpirun -np 4 --map-by ppr:2:socket:PE=12 --bind-to core -x OMP_NUM_THREADS=1 -x OMP_STACKSIZE=512m -x OMP_PLACES=cores -x OMP_PROC_BIND=close /home/ljfong/VASP632/vasp.6.3.2/testsuite/../bin/vasp_ncl"
VASP_TESTSUITE_EXE_GAM="mpirun -np 4 --map-by ppr:2:socket:PE=12 --bind-to core -x OMP_NUM_THREADS=1 -x OMP_STACKSIZE=512m -x OMP_PLACES=cores -x OMP_PROC_BIND=close /home/ljfong/VASP632/vasp.6.3.2/testsuite/../bin/vasp_gam"
VASP_TESTSUITE_INCAR_PREPEND=""
VASP_TESTSUITE_REFERENCE=""
VASP_TESTSUITE_SKIP_HYB=""
VASP_TESTSUITE_SKIP_NCL=""
VASP_TESTSUITE_SKIP_SOC=""
VASP_TESTSUITE_SKIP_MD=""
VASP_TESTSUITE_SKIP_TBMD=""
VASP_TESTSUITE_SKIP_RPA=""
VASP_TESTSUITE_SKIP_GW=""
VASP_TESTSUITE_SKIP_ACFDT=""
VASP_TESTSUITE_SKIP_CRPA=""
VASP_TESTSUITE_SKIP_BSE=""
VASP_TESTSUITE_SKIP_NOSYM=""
VASP_TESTSUITE_SKIP_VASP6=""
VASP_TESTSUITE_SKIP_GAMMA=""
VASP_TESTSUITE_SKIP_VASP45="Y"
VASP_TESTSUITE_SKIP_VASP46=""
VASP_TESTSUITE_SKIP_LREAL=""
VASP_TESTSUITE_SKIP_LRESP=""
VASP_TESTSUITE_SKIP_PEAD=""
VASP_TESTSUITE_SKIP_NCORE1=""
VASP_TESTSUITE_SKIP_WAN90=""
VASP_TESTSUITE_SKIP_KOPT=""
VASP_TESTSUITE_SKIP_ML=""
VASP_TESTSUITE_RUN_HYB=""
VASP_TESTSUITE_RUN_NCL=""
VASP_TESTSUITE_RUN_SOC=""
VASP_TESTSUITE_RUN_MD=""
VASP_TESTSUITE_RUN_TBMD=""
VASP_TESTSUITE_RUN_RPA=""
VASP_TESTSUITE_RUN_GW=""
VASP_TESTSUITE_RUN_ACFDT=""
VASP_TESTSUITE_RUN_CRPA=""
VASP_TESTSUITE_RUN_BSE=""
VASP_TESTSUITE_RUN_NOSYM=""
VASP_TESTSUITE_RUN_VASP6=""
VASP_TESTSUITE_RUN_GAMMA=""
VASP_TESTSUITE_RUN_LREAL=""
VASP_TESTSUITE_RUN_LRESP=""
VASP_TESTSUITE_RUN_PEAD=""
VASP_TESTSUITE_RUN_NCORE1=""
VASP_TESTSUITE_RUN_WAN90=""
VASP_TESTSUITE_RUN_KOPT=""
VASP_TESTSUITE_RUN_ML=""
VASP_TESTSUITE_RUN_FAST=""
Executed at: 10_22_09/30/22
==================================================================
------------------------------------------------------------------
CASE: andersen_nve
------------------------------------------------------------------
CASE: andersen_nve
entering run_recipe andersen_nve
andersen_nve step STD
------------------------------------------------------------------
andersen_nve step STD
entering run_vasp_g
----------------------------------------------------
OOO PPPP EEEEE N N M M PPPP
O O P P E NN N MM MM P P
O O PPPP EEEEE N N N M M M PPPP -- VERSION
O O P E N NN M M P
OOO P EEEEE N N M M P
----------------------------------------------------
running 4 mpi-ranks, with 1 threads/rank
distrk: each k-point on 2 cores, 2 groups
distr: one band on 1 cores, 2 groups
OpenACC runtime initialized ... 4 GPUs detected
vasp.6.3.2 27Jun22 (build Sep 28 2022 21:15:38) gamma-only
POSCAR found type information on POSCAR C H
POSCAR found : 2 types and 8 ions
Reading from existing POTCAR
scaLAPACK will be used selectively (only on CPU)
FATAL ERROR: data in update device clause was not found on device 4: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 1: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 2: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 3: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[22846,1],3]
Exit code: 1
--------------------------------------------------------------------------
exiting run_vasp_g
exiting run_recipe andersen_nve
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the energies, please check
-----------------------------------------------------------------------
-40.43139155
-40.43139155
-40.16490800
-40.41852499
-40.41852499
-40.22598800
-40.39746240
-40.39746240
-40.26328500
-40.38589715
-40.38589715
-40.20627900
-40.37920714
-40.37920714
-40.20310400
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files energy_outcar and
energy_outcar.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the forces, please check
---------------------------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files force and force.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the stress tensor is different, please check
---------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files stress and stress.ref
disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
CASE: andersen_nve_constrain
------------------------------------------------------------------
CASE: andersen_nve_constrain
entering run_recipe andersen_nve_constrain
andersen_nve_constrain step STD
------------------------------------------------------------------
andersen_nve_constrain step STD
entering run_vasp_g
----------------------------------------------------
OOO PPPP EEEEE N N M M PPPP
O O P P E NN N MM MM P P
O O PPPP EEEEE N N N M M M PPPP -- VERSION
O O P E N NN M M P
OOO P EEEEE N N M M P
----------------------------------------------------
running 4 mpi-ranks, with 1 threads/rank
distrk: each k-point on 2 cores, 2 groups
distr: one band on 1 cores, 2 groups
OpenACC runtime initialized ... 4 GPUs detected
vasp.6.3.2 27Jun22 (build Sep 28 2022 21:15:38) gamma-only
POSCAR found type information on POSCAR C H
POSCAR found : 2 types and 8 ions
Reading from existing POTCAR
scaLAPACK will be used selectively (only on CPU)
FATAL ERROR: data in update device clause was not found on device 1: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 4: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 2: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 3: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[22640,1],1]
Exit code: 1
--------------------------------------------------------------------------
exiting run_vasp_g
exiting run_recipe andersen_nve_constrain
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the energies, please check
-----------------------------------------------------------------------
-40.43139157
-40.43139157
-40.20288100
-40.42075479
-40.42075479
-40.26420400
-40.41036500
-40.41036500
-40.29119500
-40.40035358
-40.40035358
-40.23681400
-40.39177577
-40.39177577
-40.23577400
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files energy_outcar and
energy_outcar.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the forces, please check
---------------------------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files force and force.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the stress tensor is different, please check
---------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files stress and stress.ref
disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
CASE: andersen_nve_constrain_fixed
------------------------------------------------------------------
CASE: andersen_nve_constrain_fixed
entering run_recipe andersen_nve_constrain_fixed
andersen_nve_constrain_fixed step STD
------------------------------------------------------------------
andersen_nve_constrain_fixed step STD
entering run_vasp_g
----------------------------------------------------
OOO PPPP EEEEE N N M M PPPP
O O P P E NN N MM MM P P
O O PPPP EEEEE N N N M M M PPPP -- VERSION
O O P E N NN M M P
OOO P EEEEE N N M M P
----------------------------------------------------
running 4 mpi-ranks, with 1 threads/rank
distrk: each k-point on 2 cores, 2 groups
distr: one band on 1 cores, 2 groups
OpenACC runtime initialized ... 4 GPUs detected
vasp.6.3.2 27Jun22 (build Sep 28 2022 21:15:38) gamma-only
POSCAR found type information on POSCAR C H
POSCAR found : 2 types and 8 ions
Reading from existing POTCAR
scaLAPACK will be used selectively (only on CPU)
FATAL ERROR: data in update device clause was not found on device 1: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 2: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 4: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
FATAL ERROR: data in update device clause was not found on device 3: name=lexch
file:/home/ljfong/VASP632/vasp.6.3.2/build/gam/fock.f90 xc_fock_reader line:567
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[21019,1],3]
Exit code: 1
--------------------------------------------------------------------------
exiting run_vasp_g
exiting run_recipe andersen_nve_constrain_fixed
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the energies, please check
-----------------------------------------------------------------------
-40.43139157
-40.43139157
-40.23702800
-40.41628138
-40.41628138
-40.24759500
-40.39892731
-40.39892731
-40.23702500
-40.37944239
-40.37944239
-40.20187700
-40.35865076
-40.35865076
-40.19547000
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files energy_outcar and
energy_outcar.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the test yields different results for the forces, please check
---------------------------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files force and force.ref disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
Warning: ieee_inexact is signaling
FORTRAN STOP
ERROR: the stress tensor is different, please check
---------------------------------------------------
---------------------------------------------------------------------------
WARNING: Number of rows and/or columns in files stress and stress.ref
disagree.
Please check! Continuing using the smaller number of columns and/or rows.
---------------------------------------------------------------------------
by martin.schlipf » Mon Oct 03, 2022 9:39 am
Hmm, there doesn't seem to be an obvious flaw in your setup. Perhaps you can try to load the modules very carefully to avoid messages like
Lmod is automatically replacing "intel/2020.1.217" with "nvhpc/22.7".
and make sure that the toolchain is exactly the same during compilation and execution.
Could you provide us with more information about your toolchain, i.e., exactly which versions of the compiler, MPI, and LAPACK/BLAS you are using?
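A small sketch for collecting that information follows. The helper names tool_version and toolchain_report are made up for illustration, and the list of tools is an assumption based on the makefile.include and job script posted above.

```shell
#!/bin/sh
# Print the first non-empty line of "<tool> --version",
# or a marker if the tool is not on PATH.
tool_version() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: %s\n' "$1" "$("$1" --version 2>&1 | sed -n '/./{p;q;}')"
    else
        printf '%s: not found\n' "$1"
    fi
}

# Compiler and MPI wrappers named in the makefile.include and job script.
toolchain_report() {
    for tool in nvfortran nvc mpif90 mpirun; do
        tool_version "$tool"
    done
    # MKLROOT is the variable used in the makefile.include; it may be unset.
    printf 'MKLROOT: %s\n' "${MKLROOT:-unset}"
}

toolchain_report
```

Running this once in the shell used for compilation and once inside the job script, and comparing the two outputs together with the output of module list, should show whether the environments differ.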
Martin Schlipf
VASP developer
by martin.schlipf » Mon Oct 03, 2022 9:53 am
More ideas:
Can you use ldd vasp_std on your VASP executable and check if all the libraries are linked to the paths you expect?
Did you try to build without OpenMP support and does that influence whether you see the error?
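To make the ldd check easier to read, its output can be filtered. This is only a sketch: the function names unresolved_libs and lib_paths are hypothetical, and the sample paths are invented so that the example runs without a VASP binary.

```shell
#!/bin/sh
# Read ldd output on stdin and print only the libraries that could not be resolved.
unresolved_libs() {
    awk '/not found/ { print $1 }'
}

# Read ldd output on stdin and print name and resolved path of every library
# whose name matches the pattern given as first argument (e.g. mkl, cuda, mpi).
lib_paths() {
    awk -v pat="$1" '$2 == "=>" && $1 ~ pat && $3 != "not" { print $1, $3 }'
}

# Captured ldd-style output; the paths are made up for illustration.
sample='    libmpi.so.40 => /opt/openmpi/lib/libmpi.so.40 (0x00007f0000000000)
    libcublas.so.11 => not found
    libc.so.6 => /lib64/libc.so.6 (0x00007f0000200000)'

printf '%s\n' "$sample" | unresolved_libs
printf '%s\n' "$sample" | lib_paths mpi
```

On the real executable this would be used as ldd vasp_std | unresolved_libs, and for example ldd vasp_std | lib_paths mpi to see which MPI installation is picked up at run time.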
Martin Schlipf
VASP developer
by richie_fong » Thu Oct 20, 2022 3:20 pm
I compiled VASP 6.3.2 with OpenACC but without OpenMP using makefile.include.nvhpc_acc, but the same issue occurred when using the GPU, while the test suite works fine on the CPU.
by martin.schlipf » Fri Oct 21, 2022 1:13 pm
Can you try a more basic setup?
module purge
module load nvhpc/22.7
module load fftw/3.3.10
Then try to recompile VASP using makefile.include.nvhpc_acc with as few modifications as possible.
In particular, please do not use flexiblas or the "Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2" part of the file. You can also compile without HDF5 and Wannier90 support for now. You should also not need to specify scalapack explicitly if you use the -Mscalapack flag from the default makefile.include. Finally, please double-check that your path to fftw is correct and that it is compatible with nvfortran.
Once you have checked all of that, rebuild VASP. If that version still fails the tests, please copy the output of your terminal, starting from the line where you type module purge, and attach it.
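One way to capture that terminal output is a small logging wrapper. This is only a sketch: the helper name logged and the file name rebuild.log are made up for illustration, and it assumes that module is available as a shell function in the shell that runs it.

```shell
#!/bin/sh
# Log file name is an arbitrary choice for this sketch.
LOG="${LOG:-rebuild.log}"
: > "$LOG"    # start with an empty log

# Record a command, run it, and append its output and exit status to $LOG.
logged() {
    printf '$ %s\n' "$*" >> "$LOG"
    "$@" >> "$LOG" 2>&1
    status=$?
    printf '[exit status %s]\n' "$status" >> "$LOG"
    return "$status"
}

# Harmless demonstration; in the real session the commands would be the
# module purge / module load lines from this post, followed by the rebuild.
logged echo "toolchain check"
cat "$LOG"
```

In the real session each step would be prefixed with logged, for example logged module purge, so that rebuild.log contains every command together with its output and can be attached as a single file.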
Martin Schlipf
VASP developer