Very simple query on hybrid parallelization
-
- Jr. Member
- Posts: 81
- Joined: Wed Sep 28, 2011 4:15 pm
- License Nr.: 5-1441
- Location: Germany
Very simple query on hybrid parallelization
Does the scGW method work with explicit OpenMP + MPI parallelization? Hoping for a quick response from the developers. This might help me overcome the massive memory requirements, at the cost of some performance.
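For context, a minimal INCAR sketch of the kind of run I mean (the values are placeholders for illustration, not my production settings):

ALGO      = scGW0      ! partially self-consistent GW
NBANDS    = 288
ENCUTGW   = 200        ! eV
NOMEGA    = 50         ! frequency points (placeholder)
LSPECTRAL = .TRUE.
NELM      = 4          ! number of GW iterations

The memory pressure in such runs is what makes a hybrid MPI + OpenMP (or shared-memory) setup attractive to me.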
-
- Administrator
- Posts: 2921
- Joined: Tue Aug 03, 2004 8:18 am
- License Nr.: 458
Re: Very simple query on hybrid parallelization
Hybrid parallelization (HP) of the GW code is under development.
In the current version the HP is not working.
Hint #1: Memory sharing can be tuned via the precompiler flag
-Duse_shmem together with NCSHMEM = integer (integer .le. cores-per-node, or integer .eq. cores-per-socket)
Hint #2: One can perform memory-demanding calculations at a supercomputing center.
Many of them in Germany have VASP installed.
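To illustrate hint #1 (the exact mechanism can differ between releases, so please check the documentation of your version): -Duse_shmem is a compile-time flag in the CPP_OPTIONS of makefile.include, while the integer is set at run time, e.g. in the INCAR:

# makefile.include (compile time)
CPP_OPTIONS = ... -Duse_shmem ...

# INCAR (run time), e.g. for a 24-core node with 2 sockets
NCSHMEM = 12    ! cores per socket (or 24 for cores per node)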
-
- Jr. Member
- Posts: 81
- Joined: Wed Sep 28, 2011 4:15 pm
- License Nr.: 5-1441
- Location: Germany
Re: Very simple query on hybrid parallelization
Thank you for the reply. I am already on the JURECA system at Juelich in Germany, on which I installed VASP with the help of the admins' support. The partially self-consistent GW calculations for the slab systems I want to treat (50 atoms, 108 occupied bands, 288 total bands, ENCUTGW = 200 eV) require > 50 GB per core when using 288 MPI tasks, no k-point parallelization, and LSPECTRAL = .TRUE. As you know already, LSPECTRAL = .FALSE. is very slow and does not get me anywhere within 24 hours for this system.
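To put rough numbers on it (assuming the standard JURECA compute nodes with 24 cores and 128 GB each, which is what I am allocated; please correct me if my arithmetic is off):

288 tasks x 50 GB/task      = 14400 GB in total
24 tasks/node x 50 GB/task  = 1200 GB per fully packed node (vs. ~128 GB available)
128 GB / 50 GB per task     ~ 2 tasks per node at most with the current footprint

so spreading the same number of ranks over ever more nodes quickly becomes wasteful, which is why I am looking at hybrid parallelization.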
Yesterday I nevertheless compiled a hybrid version of VASP with the flags FC = mpif90 -openmp -openmp-report2. With LSPECTRAL = .TRUE. it is already performing updates of chi_q(r,r') and seems to be running in hybrid mode. I will have to check whether the results match those of pure MPI jobs for some smaller systems that are still tractable with pure MPI.
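For reference, the SLURM script I use to launch the hybrid binary (a sketch assuming 24-core JURECA nodes; node count, time limit, and binary name are just my own choices):

#!/bin/bash
#SBATCH --nodes=12
#SBATCH --ntasks-per-node=4     # MPI ranks per node
#SBATCH --cpus-per-task=6       # OpenMP threads per rank (4 x 6 = 24 cores)
#SBATCH --time=24:00:00
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
srun ./vasp_std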
-
- Jr. Member
- Posts: 81
- Joined: Wed Sep 28, 2011 4:15 pm
- License Nr.: 5-1441
- Location: Germany
Re: Very simple query on hybrid parallelization
1) So, there is no difference when running VASP in hybrid mode, because there is no line in the source code where OpenMP is active. I also tested the implicit OpenMP parallelization of a VASP binary that was compiled against the Intel MKL library. There was still no difference in performance between 4 pure MPI processes and 4 MPI processes with MKL_NUM_THREADS = 6 (over a total of 24 physical cores). The GW method performs a lot of BLAS operations during its run, but I still did not see any effect of using the following settings (as recommended on the Intel MKL page https://software.intel.com/en-us/articl ... lications/ ):
export MKL_NUM_THREADS=1
export MKL_DOMAIN_NUM_THREADS="MKL_BLAS=6"
export OMP_NUM_THREADS=1
export MKL_DYNAMIC="TRUE"
It would be very kind if the VASP developers could shed some light on this. Even if I do not compile the code for explicit hybrid parallelization, the threaded Intel MKL routines should still work, right? Any ideas on how I could modify my compilation to make this work?
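For completeness, this is the comparison I made (the launcher and binary name are simply what I use on our 24-core nodes; treat the commands as a sketch):

# run A: 4 MPI ranks, single-threaded MKL
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1
srun -n 4 ./vasp_std

# run B: 4 MPI ranks, 6 MKL threads each (same 24 cores in total)
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=6
srun -n 4 -c 6 ./vasp_std

Both runs give essentially the same timings for the GW steps.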
2) In my compilation I use
CPP_OPTIONS= -DMPI -DHOST=\"IFC91_ompi\" -DIFC \
-DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \
-DMPI_BLOCK=65536 -DscaLAPACK -Duse_collective \
-DnoAugXCmeta -Duse_bse_te \
-Duse_shmem -Dtbdyn -DVASP2WANNIER90
As you can see, the -Duse_shmem flag is already there. Did you mean another way of specifying it? Since my system has 24 cores per node, I tried specifying the following:
-Duse_shmem 24
-Duse_shmem NCSHMEM = 24
However, I always get the following error during compilation:
--------------------------------------------------------------------------------------
fpp: fatal: Usage: fpp [-flags]... [filein [fileout]]
gmake[2]: *** [base.f90] Error 1
gmake[2]: Leaving directory `/homea/jhpc36/jhpc3601/software/VASP/vasp/vasp.5.4.1_wannier1.2_hybrid/build/std'
cp: cannot stat ‘vasp’: No such file or directory
gmake[1]: *** [all] Error 1
--------------------------------------------------------------------------------------
Maybe I am specifying it with the wrong syntax; my own reading of the error is sketched below. It would be very kind if you could answer these two points and give some hints.
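What I suspect happens (please correct me if I misread it): the extra token after -Duse_shmem is passed straight on to fpp, which takes it for an additional file-name argument and aborts with the usage message above. The standard preprocessor form for a valued macro has no spaces, i.e. hypothetically something like

CPP_OPTIONS = ... -Duse_shmem -DNCSHMEM=24 ...   # -DNAME=value, no spaces; NCSHMEM=24 is only my guess

or does the integer not belong in the makefile at all, but rather in the INCAR at run time?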
Thanks and Best Regards
-ask