Can not successfully compile VASP in GPU

Questions regarding the compilation of VASP on various platforms: hardware, compilers and libraries, etc.


bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Can not successfully compile VASP in GPU

#1 Post by bhargabkakati » Sun Apr 28, 2024 7:03 am

Dear experts,

I was trying to compile VASP and ran into errors. I tried two different build modes (MPI + OpenMP and OpenMPI + OpenMP + MKL), but neither was successful. I have attached the log file and the makefile.include for each, along with the .bashrc of my system. If you could look at them and share some insight into the errors, that would be great.

Thank You.

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#2 Post by bhargabkakati » Mon Apr 29, 2024 6:10 am

Hello, my system configuration is:

OS: Ubuntu 22.04
CPU: 36 cores
GPU: Nvidia RTX A6000
CUDA Version: 12.3

nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Sep__8_19:17:24_PDT_2023
Cuda compilation tools, release 12.3, V12.3.52
Build cuda_12.3.r12.3/compiler.33281558_0

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#3 Post by fabien_tran1 » Mon Apr 29, 2024 8:36 am

Hi,

Compared to the makefiles provided in the arch directory, your makefiles have been modified. In particular, the -mp flag (for FC and FCL), which enables OpenMP, is missing, while -D_OPENMP is present. For the moment, my suggestion is to add -mp to FC and FCL and recompile (run "make veryclean" first). Please let me know if this helps.
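
As a quick way to confirm the state of the flags before rebuilding, the sketch below checks whether the FC and FCL lines of a makefile.include contain -mp. The function name check_mp_flag is hypothetical and not part of VASP; it only looks at the first FC and FCL assignment it finds and assumes each assignment sits on a single line.

```shell
# Hypothetical helper (not part of VASP): report whether the FC and FCL
# assignments in a makefile.include carry the -mp flag.
# Usage: check_mp_flag path/to/makefile.include
check_mp_flag() {
    local file="$1" var line
    for var in FC FCL; do
        # First line of the form "FC = ..." (also accepts := and +=).
        line=$(grep -m 1 -E "^[[:space:]]*${var}[[:space:]]*[:+]?=" "$file" || true)
        if [ -z "$line" ]; then
            echo "$var: not found"
        elif printf '%s\n' "$line" | grep -Eq -- '(^|[[:space:]])-mp(=[^[:space:]]+)?([[:space:]]|$)'; then
            echo "$var: -mp present"
        else
            echo "$var: -mp missing"
        fi
    done
}
```

Calling check_mp_flag makefile.include, then running "make veryclean" and rebuilding once both lines report -mp, follows the suggestion above.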

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#4 Post by bhargabkakati » Mon Apr 29, 2024 10:33 am

Hello,

Thank you for pointing out that mistake. As you suggested, adding -mp to FC and FCL fixed that error, but now I am getting different errors. At first I thought linking wannier90 was causing them, so I tried compiling without wannier90, but then I got different errors for both MPI+OpenMP and OpenMPI+OpenMP+MKL. I have attached the errors for each compilation (with and without w90).

Thank you

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#5 Post by fabien_tran1 » Mon Apr 29, 2024 11:46 am

Considering first the compilation without MKL (error.nvhpc_omp_acc.without.w90.txt), the crash is due to a problem with the FFT library. Did you solve the FFT problem that you recently had ("VASP executable is still linked to the gnu versions")? See https://www.vasp.at/forum/viewtopic.php ... =15#p26115

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#6 Post by bhargabkakati » Mon Apr 29, 2024 12:24 pm

Hello,

No, I was not able to solve it. After that I wiped my hard drive, installed Ubuntu, and started a fresh compilation. I was able to compile VASP with the "makefile.include.nvhpc_acc" makefile, but I thought it would be better to compile with OpenMP support. I have attached the output of "ldd vasp_ncl" and the makefile.include of that successful compilation. I don't really know how to solve the FFTW issue. I have added the MKL FFTW directory to PATH in my .bashrc (export PATH=/opt/intel/oneapi/mkl/2024.0/include/fftw:$PATH), and I have also installed FFTW separately (export LD_LIBRARY_PATH=/home/cmsgpu/softwares/fftw-install/lib:$LD_LIBRARY_PATH).
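
PATH is only consulted when looking up executables, so the first export above does not influence which FFTW library is linked or loaded; LD_LIBRARY_PATH is the variable the dynamic loader reads. As a rough sketch, the hypothetical helper below (first_dir_with_lib is an illustrative name) walks a colon-separated search path from left to right and prints the first directory that holds a given library. It is only an approximation: the loader also honors RPATH/RUNPATH entries and the ld.so cache, so running ldd on the executable remains the real answer.

```shell
# Hypothetical helper: print the first directory in a colon-separated
# search path that contains a file whose name starts with the given prefix.
# Usage: first_dir_with_lib "$LD_LIBRARY_PATH" libfftw3.so
first_dir_with_lib() {
    local search_path="$1" lib_prefix="$2" dir
    local IFS=':'
    for dir in $search_path; do
        # Skip empty entries such as a leading or trailing colon.
        [ -n "$dir" ] || continue
        if ls "$dir"/"$lib_prefix"* > /dev/null 2>&1; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}
```

For example, first_dir_with_lib "$LD_LIBRARY_PATH" libfftw3.so would show which of the listed directories provides libfftw3 first, if any does.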

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#7 Post by bhargabkakati » Mon Apr 29, 2024 1:37 pm

Hello,

I was able to compile the MPI+OpenMP mode successfully (but still no luck with OpenMPI+OpenMP+MKL). I noticed that -lfftw3_omp had not been added to LLIBS; adding it solved the issue, but I am not able to do a test run with these executables.

When I run mpirun -np 1 /home/cmsgpu/softwares/vasp.6.4.2-mpi_omp/bin/vasp_ncl, I get the following error:
running 1 mpi-ranks, with 1 threads/rank, on 1 nodes
distrk: each k-point on 1 cores, 1 groups

libgomp: TODO
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

Process name: [[26727,1],0]
Exit code: 1

And this is the output of "ldd vasp_ncl":

linux-vdso.so.1 (0x00007fff10dea000)
libqdmod.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/extras/qd/lib/libqdmod.so.0 (0x000071a689000000)
libqd.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/extras/qd/lib/libqd.so.0 (0x000071a688c00000)
liblapack_lp64.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/liblapack_lp64.so.0 (0x000071a687e00000)
libblas_lp64.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libblas_lp64.so.0 (0x000071a685e00000)
libfftw3.so.3 => /lib/x86_64-linux-gnu/libfftw3.so.3 (0x000071a685a00000)
libfftw3_omp.so.3 => /lib/x86_64-linux-gnu/libfftw3_omp.so.3 (0x000071a6892af000)
libhdf5_fortran.so.310 => /home/cmsgpu/softwares/HDF5_nvc_compiler/myhdfstuff/build/HDF_Group/HDF5/1.14.3/lib/libhdf5_fortran.so.310 (0x000071a689261000)
libmpi_usempif08.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libmpi_usempif08.so.40 (0x000071a685600000)
libmpi_usempi_ignore_tkr.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libmpi_usempi_ignore_tkr.so.40 (0x000071a685200000)
libmpi_mpifh.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libmpi_mpifh.so.40 (0x000071a684e00000)
libmpi.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libmpi.so.40 (0x000071a684800000)
libscalapack_lp64.so.2 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libscalapack_lp64.so.2 (0x000071a684000000)
libnvhpcwrapcufft.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvhpcwrapcufft.so (0x000071a683c00000)
libcufft.so.11 => /usr/local/cuda-12.3/lib64/libcufft.so.11 (0x000071a678e00000)
libcusolver.so.11 => /usr/local/cuda-12.3/lib64/libcusolver.so.11 (0x000071a671c00000)
libcudaforwrapnccl.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudaforwrapnccl.so (0x000071a671800000)
libnccl.so.2 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/nccl/lib/libnccl.so.2 (0x000071a660800000)
libcublas.so.12 => /usr/local/cuda-12.3/lib64/libcublas.so.12 (0x000071a65a000000)
libcublasLt.so.12 => /usr/local/cuda-12.3/lib64/libcublasLt.so.12 (0x000071a637000000)
libcudaforwrapblas.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudaforwrapblas.so (0x000071a636c00000)
libcudaforwrapblas117.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudaforwrapblas117.so (0x000071a636800000)
libcudart.so.12 => /usr/local/cuda-12.3/lib64/libcudart.so.12 (0x000071a636400000)
libcudafor_120.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudafor_120.so (0x000071a630400000)
libcudafor.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudafor.so (0x000071a630000000)
libacchost.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libacchost.so (0x000071a62fc00000)
libaccdevaux.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libaccdevaux.so (0x000071a62f800000)
libacccuda.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libacccuda.so (0x000071a62f400000)
libcudadevice.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudadevice.so (0x000071a62f000000)
libcudafor2.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libcudafor2.so (0x000071a62ec00000)
libnvf.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvf.so (0x000071a62e400000)
libnvhpcatm.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvhpcatm.so (0x000071a62e000000)
libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x000071a62dc00000)
libnvomp.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvomp.so (0x000071a62ca00000)
libnvcpumath.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvcpumath.so (0x000071a62c400000)
libnvc.so => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/compilers/lib/libnvc.so (0x000071a62c000000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000071a62bc00000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000071a689237000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000071a688f19000)
libatomic.so.1 => /lib/x86_64-linux-gnu/libatomic.so.1 (0x000071a68922b000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x000071a689226000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000071a688f14000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x000071a688f0f000)
libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x000071a688ec5000)
libhdf5_f90cstub.so.310 => /home/cmsgpu/softwares/HDF5_nvc_compiler/myhdfstuff/build/HDF_Group/HDF5/1.14.3/lib/libhdf5_f90cstub.so.310 (0x000071a688ea3000)
libhdf5.so.310 => /home/cmsgpu/softwares/HDF5_nvc_compiler/myhdfstuff/build/HDF_Group/HDF5/1.14.3/lib/libhdf5.so.310 (0x000071a62b400000)
libopen-rte.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libopen-rte.so.40 (0x000071a62b000000)
libopen-pal.so.40 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libopen-pal.so.40 (0x000071a62aa00000)
libucp.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libucp.so.0 (0x000071a62a600000)
libuct.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libuct.so.0 (0x000071a62a200000)
libucs.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libucs.so.0 (0x000071a629e00000)
libnuma.so.1 => /lib/x86_64-linux-gnu/libnuma.so.1 (0x000071a688e94000)
libucm.so.0 => /opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/lib/libucm.so.0 (0x000071a629a00000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x000071a688e8d000)
libz.so.1 => /usr/local/lib/libz.so.1 (0x000071a688e68000)
/lib64/ld-linux-x86-64.so.2 (0x000071a6892d1000)
libnvJitLink.so.12 => /usr/local/cuda-12.3/lib64/libnvJitLink.so.12 (0x000071a626400000)
libcusparse.so.12 => /usr/local/cuda-12.3/lib64/libcusparse.so.12 (0x000071a616400000)
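
The listing above contains two OpenMP runtimes, libnvomp.so from the NVIDIA HPC SDK and the GNU libgomp.so.1, and libfftw3_omp.so.3 resolves to the system copy under /lib/x86_64-linux-gnu. Whether that mix is what triggers the "libgomp: TODO" message is an assumption here, not something the thread confirms. A small sketch for spotting such a mix: the hypothetical function list_openmp_runtimes reads ldd output on standard input and prints the OpenMP runtime libraries it names (GNU libgomp, NVIDIA libnvomp, Intel libiomp5, LLVM libomp).

```shell
# Hypothetical helper: read `ldd` output on stdin and print the OpenMP
# runtime libraries it lists, one per line, sorted and de-duplicated.
# Usage: ldd vasp_ncl | list_openmp_runtimes
list_openmp_runtimes() {
    # The first field of each ldd line is the library's soname.
    awk '$1 ~ /^lib(gomp|nvomp|iomp5|omp)\.so/ { print $1 }' | sort -u
}
```

If "ldd vasp_ncl | list_openmp_runtimes" prints more than one line, the executable pulls in more than one OpenMP runtime, which may be worth ruling out before looking elsewhere.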

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#8 Post by fabien_tran1 » Mon Apr 29, 2024 1:51 pm

What does the command "which mpirun" return?

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#9 Post by bhargabkakati » Mon Apr 29, 2024 1:53 pm

It returns:
/opt/nvidia/hpc_sdk/Linux_x86_64/24.3/comm_libs/12.3/openmpi4/openmpi-4.1.5/bin/mpirun
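
One thing this path can be compared against is the libmpi.so.40 entry in the ldd output of the executable: if the launcher and the library came from different MPI installations, that mismatch would be a candidate cause. A minimal sketch, with same_mpi_prefix as a hypothetical helper name, that strips the trailing /bin/... and /lib/... components and compares what remains:

```shell
# Hypothetical helper: compare the installation prefix of mpirun with the
# prefix of the libmpi shared library that the executable resolves to.
# Usage: same_mpi_prefix /path/to/bin/mpirun /path/to/lib/libmpi.so.40
same_mpi_prefix() {
    local mpirun_path="$1" libmpi_path="$2"
    # Drop "/bin/<file>" and "/lib/<file>" to obtain the install prefixes.
    local launcher_prefix="${mpirun_path%/bin/*}"
    local library_prefix="${libmpi_path%/lib/*}"
    if [ "$launcher_prefix" = "$library_prefix" ]; then
        echo "same installation: $launcher_prefix"
    else
        echo "different installations: $launcher_prefix vs $library_prefix"
        return 1
    fi
}
```

With the two paths reported in this thread, both prefixes reduce to the openmpi-4.1.5 directory inside the HPC SDK 24.3 tree, so the launcher and the library appear to match.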

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#10 Post by fabien_tran1 » Mon Apr 29, 2024 3:01 pm

I will ask for help from colleagues. Meanwhile you could consider watching the video that is mentioned in another topic (https://www.vasp.at/forum/viewtopic.php?t=19472).

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#11 Post by bhargabkakati » Mon Apr 29, 2024 3:06 pm

Thank you for the suggestion, but I compiled VASP by following that very video. Unlike in the video, however, I am not able to compile successfully.

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#12 Post by fabien_tran1 » Mon Apr 29, 2024 3:26 pm

Are you sure that there is no mistake in the paths that are in your makefile.include? For instance, do the directories
/opt/intel/oneapi/mkl/2024.1/lib/
/opt/intel/oneapi/mkl/2024.1/include/
really exist?

Another question: is an environment module system (https://modules.readthedocs.io/en/latest/#) installed on your machine?
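
The directory check asked about above can be scripted. The sketch below uses a hypothetical helper, check_dirs, which reports each path given to it as present or missing and returns a non-zero status if any is missing.

```shell
# Hypothetical helper: report which of the given directories exist.
# Returns 1 if at least one of them is missing, 0 otherwise.
check_dirs() {
    local dir status=0
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "ok: $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}
```

For the paths in question this would be called as check_dirs /opt/intel/oneapi/mkl/2024.1/lib /opt/intel/oneapi/mkl/2024.1/include.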

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#13 Post by bhargabkakati » Mon Apr 29, 2024 3:52 pm

Hello,

Yes, those paths exist, and no environment module system is installed. I installed everything mentioned at ( https://implant.fs.cvut.cz/vasp-gpu-compilation/ ).

bhargabkakati
Newbie
Posts: 39
Joined: Mon May 29, 2023 8:56 am

Re: Can not successfully compile VASP in GPU

#14 Post by bhargabkakati » Mon Apr 29, 2024 3:55 pm

I'd also like to add that mpirun works fine with Quantum ESPRESSO, wannier90, and VAMPIRE, and also with the makefile.include.nvhpc_acc build of VASP.

fabien_tran1
Global Moderator
Posts: 417
Joined: Mon Sep 13, 2021 11:02 am

Re: Can not successfully compile VASP in GPU

#15 Post by fabien_tran1 » Mon Apr 29, 2024 4:03 pm

Do you have by chance access to an older version of the Nvidia compiler (i.e., older than 24.3)?
