
NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod

Posted: Fri Oct 25, 2024 6:04 pm
by santanumahapatra

I am trying to install VASP 6.4.3, but I get the following error:
NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod.
I have checked that qdmodule.mod is indeed present in the path in question: /opt/nvidia/hpc_sdk/Linux_x86_64/24.9/compilers/extras/qd/include/qd

I have run make veryclean and repeated the installation multiple times, but the error persists.

My makefile.include follows; any suggestions would be appreciated.

Code:

# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxNV\" \
              -DMPI -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Duse_bse_te \
              -Dtbdyn \
              -Dqd_emulate \
              -Dfock_dblbuf \
              -D_OPENACC \
              -DUSENCCL -DUSENCCLP2P \
              -Duse_shmem

CPP         = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX)  > $*$(SUFFIX)

# N.B.: you might need to change the cuda-version here
#       to one that comes with your NVIDIA-HPC SDK
FC          = mpif90 -acc -gpu=cc70,cuda12.6
FCL         = mpif90 -acc -gpu=cc70,cuda12.6 -c++libs

FREE        = -Mfree

FFLAGS      = -Mbackslash -Mlarge_arrays

OFLAG       = -fast

DEBUG       = -Mfree -O0 -traceback

OBJECTS     = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o

LLIBS       = -cudalib=cublas,cusolver,cufft,nccl -cuda

# Redefine the standard list of O1 and O2 objects
SOURCE_O1  := pade_fit.o
SOURCE_O2  := pead.o

# For what used to be vasp.5.lib
CPP_LIB     = $(CPP)
FC_LIB      = nvfortran
CC_LIB      = nvc -w
CFLAGS_LIB  = -O
FFLAGS_LIB  = -O1 -Mfixed
FREE_LIB    = $(FREE)

OBJECTS_LIB = linpack_double.o getshmem.o

# For the parser library
CXX_PARS    = nvc++ --no_warnings

##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##
# When compiling on the target machine itself, change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU = -tp host
FFLAGS     += $(VASP_TARGET_CPU)

# Specify your NV HPC-SDK installation (mandatory)
#... first try to set it automatically
NVROOT      =$(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')

# If the above fails, then NVROOT needs to be set manually
#NVHPC      ?= /opt/nvidia/hpc_sdk
#NVVERSION   = 21.11
#NVROOT      = $(NVHPC)/Linux_x86_64/$(NVVERSION)

## Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2
OFLAG_IN   = -fast -Mwarperf
SOURCE_IN  := nonlr.o

# Software emulation of quadruple precision (mandatory)
QD          = $(NVROOT)/compilers/extras/qd
LLIBS      += -L$(QD)/lib -lqdmod -lqd
INCS       += -I$(QD)/include/qd

# BLAS (mandatory)
BLAS        = -lblas

# LAPACK (mandatory)
LAPACK      = -llapack

# scaLAPACK (mandatory)
SCALAPACK   = -Mscalapack

LLIBS      += $(SCALAPACK) $(LAPACK) $(BLAS)

# FFTW (mandatory)
FFTW_ROOT   = /usr/local
LLIBS      += -L$(FFTW_ROOT)/lib -lfftw3
INCS       += -I$(FFTW_ROOT)/include

Regards


Re: NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod

Posted: Mon Oct 28, 2024 8:07 am
by jonathan_lahnsteiner2

Dear santanumahapatra,

Please send the full error message printed by your compiler, together with the exact toolchain you are using: the compiler and its version, the MPI library, and so on.
With the current information I cannot tell what is going on.

All the best Jonathan
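
For anyone hitting the same error, the information requested above can be gathered with a short script. This is only a sketch: it assumes the tools are called nvfortran, mpif90 and mpirun (as in the posted makefile.include), and the helper names nvroot_from, report_toolchain and check_qd are made up for illustration. It also repeats the awk expression that the makefile uses to derive NVROOT, so you can see which directory the -I$(QD)/include/qd flag actually ends up pointing at.

```shell
#!/bin/sh
# Mimic the NVROOT auto-detection from makefile.include: cut the
# compiler path at /compilers/bin/nvfortran and keep what precedes it.
nvroot_from() {
    printf '%s\n' "$1" | awk -F /compilers/bin/nvfortran '{ print $1 }'
}

# Print location and version banner of each tool; tools that are not
# on PATH are reported instead of causing an error.
report_toolchain() {
    for tool in nvfortran mpif90 mpirun; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "== $tool: $(command -v "$tool")"
            "$tool" --version 2>&1 | head -n 3
        else
            echo "== $tool: not found on PATH"
        fi
    done
}

# Check whether qdmodule.mod is readable under the given NVROOT.
check_qd() {
    modfile="$1/compilers/extras/qd/include/qd/qdmodule.mod"
    if [ -r "$modfile" ]; then
        echo "found: $modfile"
    else
        echo "missing: $modfile"
    fi
}

report_toolchain
if command -v nvfortran >/dev/null 2>&1; then
    NVROOT=$(nvroot_from "$(command -v nvfortran)")
    echo "NVROOT resolves to: $NVROOT"
    check_qd "$NVROOT"
fi
```

If the nvfortran found on PATH does not live under .../compilers/bin (for example because it is reached through a symlink elsewhere), the awk expression returns the path unchanged and the derived include directory will not exist. In that case NVROOT can be set manually, as the commented-out lines in makefile.include suggest.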


Re: NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod

Posted: Tue Oct 29, 2024 4:06 am
by santanumahapatra

Hello, I have fixed that issue. However, I now have a new problem when trying to run VASP calculations:

Code:

[[13956,1],3]: A high-performance Open MPI point-to-point messaging module
was unable to find any relevant network interfaces:

Module: OpenFabrics (openib)
  Host: 381c00608ee8

Another transport will be used instead, although this may result in
lower performance.

NOTE: You can disable this warning by setting the MCA parameter
btl_base_warn_component_unused to 0.
--------------------------------------------------------------------------
 running    4 mpi-ranks, on    1 nodes
 distrk:  each k-point on    4 cores,    1 groups
 distr:  one band on    1 cores,    4 groups
 OpenACC runtime initialized ...    4 GPUs detected
[381c00608ee8:00031] 3 more processes have sent help message help-mpi-btl-base.txt / btl:no-nics
[381c00608ee8:00031] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
 -----------------------------------------------------------------------------
|                     _     ____    _    _    _____     _                     |
|                    | |   |  _ \  | |  | |  / ____|   | |                    |
|                    | |   | |_) | | |  | | | |  __    | |                    |
|                    |_|   |  _ <  | |  | | | | |_ |   |_|                    |
|                     _    | |_) | | |__| | | |__| |    _                     |
|                    (_)   |____/   \____/   \_____|   (_)                    |
|                                                                             |
|     internal error in: mpi.F  at line: 903                                  |
|                                                                             |
|     M_init_nccl: Error in ncclCommInitRank                                  |
|                                                                             |
|     If you are not a developer, you should not encounter this problem.      |
|     Please submit a bug report.                                             |
|                                                                             |
 -----------------------------------------------------------------------------

Re: NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod

Posted: Tue Oct 29, 2024 4:35 pm
by jonathan_lahnsteiner2

Dear santanumahapatra,

You could try recompiling VASP without NCCL to see whether this error goes away. To do so, remove -DUSENCCL from your makefile.include. This is likely to solve the problem, because the error occurs during NCCL initialization.

All the Best Jonathan
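
One possible way to make that change without editing the file by hand is sketched below. The function name strip_usenccl is hypothetical. It removes only the exact token -DUSENCCL and keeps a backup copy; the companion flag -DUSENCCLP2P from the posted file is left untouched, since the reply names only -DUSENCCL and this thread does not say whether the other flag should be dropped as well.

```shell
#!/bin/sh
# Remove the exact token -DUSENCCL from the given makefile.include.
# The original is kept as <file>.bak. -DUSENCCLP2P is a different
# token and is deliberately not matched.
strip_usenccl() {
    file=$1
    cp "$file" "$file.bak"
    sed -E 's/(^|[[:space:]])-DUSENCCL([[:space:]]|$)/\1\2/g' \
        "$file.bak" > "$file"
}

# Usage, from the VASP root directory (not executed here):
#   strip_usenccl makefile.include
#   make veryclean && make std
```

A full rebuild after make veryclean is the safe choice here, because the flag is a preprocessor option and affects every source file that tests for it.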


Re: NVFORTRAN-F-0004-Unable to open MODULE file qdmodule.mod

Posted: Wed Oct 30, 2024 11:02 am
by jonathan_lahnsteiner2

Dear santanumahapatra,

This issue has also been reported by other users in this post. To help us resolve it, could you analyze the problem further? That would be a great help to us. You could try setting the environment variable NCCL_DEBUG to WARN or TRACE, with VASP still compiled with -DUSENCCL. Rerunning VASP should then produce output that gives further insight into the issue. Another option would be to print the ncclRes variable in M_init_nccl to see which error code it returns, but that would require recompiling the code.

All the best Jonathan
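
The NCCL_DEBUG suggestion could be tried roughly as follows. This is a minimal sketch: run_with_nccl_debug is a hypothetical helper, and the executable name vasp_std and the rank count of 4 in the usage line are assumptions taken from the run log above, not something the thread specifies.

```shell
#!/bin/sh
# Run a command with NCCL_DEBUG set for that command only, and write
# its stdout and stderr to a log file.
# Arguments: <level> <logfile> <command> [args...]
# <level> is the NCCL_DEBUG value, e.g. WARN or TRACE.
run_with_nccl_debug() {
    level=$1
    logfile=$2
    shift 2
    NCCL_DEBUG=$level "$@" > "$logfile" 2>&1
}

# Usage (not executed here; adjust the launcher and binary to your setup):
#   run_with_nccl_debug WARN nccl_debug.log mpirun -np 4 vasp_std
```

The resulting log file can then be attached to a reply. On a single node the variable is inherited by the locally launched ranks; for multi-node runs with Open MPI it may need to be forwarded explicitly, for example with mpirun -x NCCL_DEBUG.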