vasp not detecting GPU

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

vasp not detecting GPU

#1 Post by bratinag » Wed Apr 17, 2024 12:31 pm

Hi,
I'm trying to set up VASP on an HPC facility with multiple nodes, each having 8 GPUs (Tesla M60). Because of the NVHPC modules available there (CUDA 11.7), the only version I am able to compile without errors is 6.3.0.

Running VASP using a Slurm script (attached in a zip file) initially produces a series of UCX warnings (visible in the first lines of the attached slurm-301066.out file). After that, the code seems to run uneventfully, albeit very slowly. In addition, no message about the presence of a GPU is printed in the first few output lines. This leads me to believe that the GPUs are not detected by VASP.

The same VASP input set of files (INCAR, POSCAR...) finishes fine and reasonably fast on a local machine equipped with a single RTX 4090 GPU and running vasp 6.4.3.

I have also included a makefile.include, made from the template makefile.include.nvhpc_acc. I have also tried the template makefile.include.nvhpc_omp_acc, with the same results.

Regards.

Gvido

michael_wolloch
Global Moderator
Posts: 109
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp not detecting GPU

#2 Post by michael_wolloch » Wed Apr 17, 2024 12:44 pm

Dear Gvido,

you seem to have forgotten to attach your files...
Which version of the NVIDIA HPC SDK are you using? You can always download the newest version directly from NVIDIA, which might help you get the most recent VASP version (6.4.3) running.

Cheers, Michael

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

Re: vasp not detecting GPU

#3 Post by bratinag » Thu Apr 18, 2024 6:38 am

Ah, sorry, fingers faster than the brain. Attachment included.

The HPC facility uses NVHPC/22.7-CUDA-11.7.0. Upgrading to the latest version is beyond my permissions (everything runs in modules). I'll talk to their system administrator.

Regards.

Gvido

michael_wolloch
Global Moderator
Posts: 109
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp not detecting GPU

#4 Post by michael_wolloch » Thu Apr 18, 2024 9:07 am

Dear Gvido,

The good news is that HPC-SDK 22.7 (>21.2) and CUDA 11.7 (>10.0) are fine for the OpenACC port of VASP. You should be able to compile the current version (6.4.3).

The bad news is that your GPUs seem to be a bit slow for double precision (FP64), which is what is important for VASP. According to a link I found single precision performance is pretty high for a ~10-year-old card at 4.8 TFLOPS, but double precision is only 1/32 of that, at 150 GFLOPS. Of course, FP64 performance should be lower than FP32, but ideally only by a factor of 2, like for the Tesla P100 which is only 1 year younger than the M60.

You should still be able to compile the code for the M60s, however, and check whether you get any speedup compared to running on CPU only.

Note that the compute capability of the M60 is only 5.2, so your makefile.include has to be adapted to generate code for that compute capability:

Code:

CC          = mpicc  -acc -gpu=cc52,cuda11.7
FC          = mpif90 -acc -gpu=cc52,cuda11.7
FCL         = mpif90 -acc -gpu=cc52,cuda11.7 -c++libs
Note that I based these lines on the makefile.include.nvhpc_acc of version 6.4.1, which I think you should be able to build. This is why, compared to your makefile.include, there is the CC line there as well. Note that you can also build for more than one compute capability (cc) at a time, so there is the option to keep the cc60,cc70,cc80 as well, but the M60 needs the cc52 for sure.

If you are willing to spend time on something that will probably show very weak performance, try to compile 6.4.3 with the appropriately modified makefile.include taken from the arch folder, and get back to me if you still have trouble.

Cheers, Michael

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

Re: vasp not detecting GPU

#5 Post by bratinag » Mon Apr 22, 2024 8:10 am

Thanks, Michael.

I'll get on to it today, and let you know.

Regards.

Gvido

michael_wolloch
Global Moderator
Posts: 109
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp not detecting GPU

#6 Post by michael_wolloch » Mon Apr 22, 2024 9:15 am

Dear Gvido,

please also get back to me if it works with the new makefile.include setting, so I can close the thread.

Good luck, Michael

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

Re: vasp not detecting GPU

#7 Post by bratinag » Wed Apr 24, 2024 12:14 pm

Hi Michael,

the compiler complains about cc:

Compile for compute capability X.Y; supported values: 35,50,60,61,62,70,72,75,80,86.

Which one do you suggest I use? 50?

michael_wolloch
Global Moderator
Posts: 109
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp not detecting GPU

#8 Post by michael_wolloch » Wed Apr 24, 2024 12:23 pm

Dear Gvido,

I really should have tried this out. Sorry. Yes, the maximal compute capability of the M60 is 5.2, so if the nearest values the compiler supports are 5.0 and 6.0, you should use 5.0.
Thus:

Code:

CC          = mpicc  -acc -gpu=cc50,cuda11.7
FC          = mpif90 -acc -gpu=cc50,cuda11.7
FCL         = mpif90 -acc -gpu=cc50,cuda11.7 -c++libs
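
As mentioned earlier in the thread, more than one compute capability can be built at a time. A minimal sketch of how that could look together with cc50, assuming your compiler accepts every listed value (the list it printed includes 50, 60, 70 and 80). GPU_TARGETS is a hypothetical helper variable introduced here for illustration; it is not part of the VASP templates:

```makefile
# Hypothetical helper variable (not in the VASP arch templates):
# keeps all GPU targets and the CUDA version in one place.
# cc50 covers the M60 (compute capability 5.2); remove any value
# that your compiler reports as unsupported.
GPU_TARGETS = cc50,cc60,cc70,cc80,cuda11.7

CC          = mpicc  -acc -gpu=$(GPU_TARGETS)
FC          = mpif90 -acc -gpu=$(GPU_TARGETS)
FCL         = mpif90 -acc -gpu=$(GPU_TARGETS) -c++libs
```

If only the M60 nodes matter, the single-target lines above are enough; extra targets mainly increase compile time and binary size.
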
Let me know if this works,

Michael

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

Re: vasp not detecting GPU

#9 Post by bratinag » Wed Apr 24, 2024 12:57 pm

Please ignore this. I had a wrong compiler module loaded. Sorry.

Regards.

Gvido

michael_wolloch
Global Moderator
Posts: 109
Joined: Tue Oct 17, 2023 10:17 am

Re: vasp not detecting GPU

#10 Post by michael_wolloch » Wed Apr 24, 2024 1:49 pm

Hi Gvido,

I have now successfully compiled VASP 6.4.3 with cc=50 and cuda12.3, using NVIDIA HPC SDK version 24.1 and /arch/makefile.include.nvhpc_ompi_mkl_omp_acc.

A quick run of a couple of tests from the fast test suite on a single A30 GPU was successful, so I think you should get this to work on your M60s.

However, the performance will be bad on this hardware, as I alluded to earlier.

Please let me know if I can delete the post quoted below. I am not sure what it refers to, since I also had the compiler problem with cc=52 and had to drop down to cc=50:

bratinag wrote: "Please ignore this. I had a wrong compiler module loaded. Sorry."

Please let me know if you succeed, or have any other issues,
Michael

bratinag
Newbie
Posts: 13
Joined: Tue Sep 13, 2022 12:30 pm

Re: vasp not detecting GPU

#11 Post by bratinag » Thu Apr 25, 2024 6:34 am

Hi Michael,

The compilation of 6.4.3 went fine using the attached makefile.include. The queue on the HPC facility is packed, so I can't update you on the calculation speed yet. I'll let you know as soon as the job runs through, but you can close this ticket. Thank you so much for your help.

Regards.

Gvido