Segfault when cuda_mem.cu frees memory
Posted: Thu Mar 31, 2016 6:04 am
Hi,
I would like to ask the VASP user community whether anybody running VASP on GPUs (a Tesla K20X on a Cray XC30 in my case) has seen a similar issue.
When running the simple CeO2 test from Peter Larsson's test suite on a single core, I get a segmentation fault at line 71 of cuda_mem.cu, where the code tries to free memory with the plain libc free (#else free(*ptr);):
Process 0:
Thread 1 stopped in free from /lib64/libc.so.6 with signal SIGSEGV (Segmentation fault).
Reason/Origin: address not mapped to object (attempt to access invalid address)
I had to modify cuda_mem.cu, because the original version, built with the #define statements for NVREGISTERSELF and NVPINNED, gave a different error, "Failed to free pinned memory!", which is raised in cuda_mem.cu in the lines immediately before line 71.
I use the INCAR provided by the VASP test suite, with NCORE=1 and without defining NPAR.
I would appreciate any advice from the VASP community.
Thanks,
Luca