OOM during Wannier90 AMN computation

francesco_martinelli
Newbie
Posts: 2
Joined: Fri Feb 16, 2024 12:58 pm

#1 Post by francesco_martinelli » Tue Nov 05, 2024 8:05 am

Hi all,

I'm trying to produce the input files for a Wannierization, starting from a converged SOC calculation.
Unfortunately, the calculation runs out of memory (OOM) while computing the AMN projections.

Below are the INCAR and the calculation output up to the point where it stops.
It would be great if someone could help me solve this problem.

Best,
Francesco

Code:

Workflow INCAR

#General tags
PREC = Accurate
ENCUT = 600
EDIFF = 1E-8
LORBMOM = TRUE
#KPAR = 4
#NCORE = 8
GGA = PE
ISMEAR = 0
SIGMA = 0.01
NELM = 300
LORBIT = 11
NBANDS = 128
LDAUPRINT = 2
LMAXMIX = 4
LASPH = TRUE

#DOS
NEDOS = 2001

#Magnetism
LSORBIT = TRUE
ISYM = -1
SAXIS = 0 0 1
ISPIN = 2
LNONCOLLINEAR = TRUE
RWIGS = 1.9 1.0 1.5 0.4
MAGMOM = 6*0 3*0 0.0 0.0 0 18*0

#Wannier90
LWANNIER90 = TRUE
NUM_WANN = 6
ICHARG = 11
LCHARG = FALSE

WANNIER90_WIN = "
exclude_bands = 1-76,83-128
guiding_centres = T

begin projections
Re:dxz,dyz,dxy
end projections

dis_win_min = 3
dis_win_max = 7
write_hr = true
write_u_matrices = true
write_xyz = true

num_iter = 0
conv_tol = 1E-9
conv_window = 10
dis_num_iter = 0
dis_conv_tol = 1E-9
dis_conv_window = 10
bands_plot = true

begin kpoint_path
G  .0    .0    .0     X  .5    .0    .5
X  .5    .0    .5     W  .5    .25   .75
W  .5    .25   .75    L  .5    .5    .5
L  .5    .5    .5     G  .0    .0    .0
G  .0    .0    .0     K  .375  .375  .75
end kpoint_path
bands_num_points 40"

Code:

 running  128 mpi-ranks, with    1 threads/rank, on    1 nodes
 distrk:  each k-point on  128 cores,    1 groups
 distr:  one band on    1 cores,  128 groups
 vasp.6.4.2 20Jul23 (build Jul 22 2024 14:06:31) complex

 POSCAR found type information on POSCAR BaMgReO
 POSCAR found :  4 types and      10 ions
 Reading from existing POTCAR
 scaLAPACK will be used
 Reading from existing POTCAR
 LDA part: xc-table for Pade appr. of Perdew
 WARNING: stress and forces are not correct
 POSCAR, INCAR and KPOINTS ok, starting setup
 FFT: planning ... GRIDC
 FFT: planning ... GRID_SOFT
 FFT: planning ... GRID
 WAVECAR not read
 reading imaginary part of occupancies ...
 charge-density read from file: unknown
 reading imaginary part of occupancies ...
 magnetization density read from file 1
 reading imaginary part of occupancies ...
 magnetization density read from file 2
 reading imaginary part of occupancies ...
 magnetization density read from file 3
 entering main loop
       N       E                     dE             d eps       ncg     rms          rms(c)
DAV:   1     0.311775800494E+03    0.31178E+03   -0.52590E+04131072   0.145E+03
DAV:   2    -0.695837669671E+02   -0.38136E+03   -0.38127E+03196608   0.208E+02
DAV:   3    -0.727009882628E+02   -0.31172E+01   -0.31172E+01131072   0.276E+01
DAV:   4    -0.727166305005E+02   -0.15642E-01   -0.15642E-01262144   0.354E+00
DAV:   5    -0.727166532600E+02   -0.22759E-04   -0.22759E-04131072   0.752E-02
DAV:   6    -0.727166534213E+02   -0.16125E-06   -0.16112E-06262144   0.743E-03
DAV:   7    -0.727166534224E+02   -0.10905E-08   -0.11155E-08131072   0.349E-04
 Calling wannier_setup of wannier90 in library mode
 Wannier90 mode
 Computing MMN (overlap matrix elements)
 Computing AMN (projections onto localized orbitals)

henrique_miranda
Global Moderator
Posts: 505
Joined: Mon Nov 04, 2019 12:41 pm

Re: OOM during Wannier90 AMN computation

#2 Post by henrique_miranda » Wed Nov 06, 2024 9:15 am

Hi Francesco,

My suggestion would be to start with a calculation that runs quickly and does not run out of memory:

  1. Use the default ENCUT, which is chosen based on the maximum value of ENMAX in the POTCAR file.
  2. Reduce the number of k-points (I can't see how many you are currently using because you did not share the KPOINTS file).
  3. Reduce the number of bands and MPI ranks but still reserve the full node. I see you are manually setting NBANDS=128 but then excluding most of those bands from the Wannierization procedure, so perhaps you can reduce the number of bands. If you run fewer MPI ranks while still reserving the full node, the amount of memory available per MPI rank increases.

Once you have a calculation that runs, then you can gradually increase KPOINTS and ENCUT and compare the results with the calculation that ran.
My suspicion is that you can still get very accurate results with lower ENCUT and KPOINTS. If that is not the case, then you need to increase the amount of memory per MPI rank as I mentioned in point 3.
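
To make point 3 concrete, here is a rough sketch of how the memory available per MPI rank grows when fewer ranks are started on a fully reserved node. The 256 GB node size is only an assumption for illustration (the actual memory of the node was not given in this thread), and `memory_per_rank_gb` is a hypothetical helper, not something provided by VASP.

```python
def memory_per_rank_gb(node_memory_gb: float, ranks_per_node: int) -> float:
    """Upper bound on the memory each MPI rank can use on one node."""
    if ranks_per_node < 1:
        raise ValueError("ranks_per_node must be at least 1")
    return node_memory_gb / ranks_per_node


# ASSUMPTION: 256 GB per node, chosen only to illustrate the scaling.
NODE_MEMORY_GB = 256.0

for ranks in (128, 64, 32):
    per_rank = memory_per_rank_gb(NODE_MEMORY_GB, ranks)
    print(f"{ranks:4d} ranks on the node -> {per_rank:.1f} GB per rank")
```

The real limit is lower because the operating system and MPI buffers also need memory, so treat the numbers as an upper bound.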

Let me know if this helps.


francesco_martinelli
Newbie
Posts: 2
Joined: Fri Feb 16, 2024 12:58 pm

Re: OOM during Wannier90 AMN computation

#3 Post by francesco_martinelli » Wed Nov 13, 2024 10:23 am

Hi Henrique,

First of all, thank you for your help. The interface now runs correctly (with fewer bands, a lower cutoff, etc.), but I'd like to share the .win file generated during the calculation that ran out of memory.
In the 'generated automatically by VASP' section it printed num_bands = 128 even though exclude_bands = 1-76,83-128 was set above. If I repeat the same calculation on a different machine with more memory, the OOM does not occur and num_bands is set to 6.
Is this purely a memory-related issue, or could it be caused by a different or faulty VASP build on the machine that gives the OOM?
Thank you in advance for your reply.

Code:

exclude_bands = 1-76,83-128
guiding_centres = T

begin projections
Re:dxz,dyz,dxy
end projections

dis_win_min = 3
dis_win_max = 7
write_hr = true
write_u_matrices = true
write_xyz = true

num_iter = 0
conv_tol = 1E-9
conv_window = 10
dis_num_iter = 0
dis_conv_tol = 1E-9
dis_conv_window = 10
bands_plot = true

begin kpoint_path
G  .0    .0    .0     X  .5    .0    .5
X  .5    .0    .5     W  .5    .25   .75
W  .5    .25   .75    L  .5    .5    .5
L  .5    .5    .5     G  .0    .0    .0
G  .0    .0    .0     K  .375  .375  .75
end kpoint_path
bands_num_points 40
# This part was generated automatically by VASP
num_bands = 128
num_wann = 6
spinors = .true.
begin unit_cell_cart
     4.0401000     0.0000000     4.0401000
     4.0401000     4.0401000     0.0000000
     0.0000000     4.0401000     4.0401000
end unit_cell_cart
begin atoms_cart
Ba       2.0200500     2.0200500     2.0200500
Ba       6.0601500     6.0601500     6.0601500
Mg       4.0401000     4.0401000     4.0401000
Re       0.0000000     0.0000000     0.0000000
O        1.9260773     4.0401000     4.0401000
O        4.0401000     1.9260773     4.0401000
O        4.0401000     4.0401000     1.9260773
O        6.1541227     4.0401000     4.0401000
O        4.0401000     6.1541227     4.0401000
O        4.0401000     4.0401000     6.1541227
end atoms_cart
mp_grid =     8     8     8
begin kpoints
      0.000000000000      0.000000000000      0.000000000000
      0.125000000000      0.000000000000      0.000000000000
      0.250000000000      0.000000000000      0.000000000000
      0.375000000000     -0.000000000000     -0.000000000000
      ...
      end kpoints
Last edited by francesco_martinelli on Wed Nov 13, 2024 10:25 am, edited 1 time in total.

henrique_miranda
Global Moderator
Posts: 505
Joined: Mon Nov 04, 2019 12:41 pm

Re: OOM during Wannier90 AMN computation

#4 Post by henrique_miranda » Mon Nov 18, 2024 9:42 am

Good to hear that you were able to run your calculations!
Starting any calculation with low ENCUT and KPOINTS settings is one of the simplest and most important pieces of advice I can give.
Once you obtain the final result you intend to get, in your case Wannier functions, then you can easily run new calculations with larger ENCUT and KPOINTS and check if these final results change.
In some cases, you might find that instead of using a large ENCUT or KPOINTS 'to be safe' and spending more computational resources, you can get very accurate results with a lower ENCUT or KPOINTS.
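
As a minimal sketch of such a check, the hypothetical helper below takes one scalar result per setting (for example a hopping element taken from the Wannier Hamiltonian) and returns the cheapest setting that agrees with the most expensive run within a tolerance. The function name, the tolerance, and the numbers in the example are placeholders for illustration, not results for this system.

```python
def smallest_converged(results: dict, tol: float) -> float:
    """Return the smallest setting such that it and every larger setting
    agree with the most expensive run to within `tol`."""
    if not results:
        raise ValueError("results must not be empty")
    reference_setting = max(results)
    reference = results[reference_setting]
    converged = reference_setting
    # Walk down from the most expensive run and stop at the first deviation.
    for setting in sorted(results, reverse=True):
        if abs(results[setting] - reference) > tol:
            break
        converged = setting
    return converged


# PLACEHOLDER data: ENCUT (eV) -> some monitored quantity (eV).
example = {400: 1.250, 500: 1.212, 600: 1.210}
print(smallest_converged(example, tol=0.005))
```

The same function works for a k-point scan if the keys are, for instance, the number of k-points along each direction.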

The short answer to your question about num_bands: that can happen when the code is stopped abruptly.

The long answer is more technical:
In the interface between VASP and wannier90 there are two stages, setup and execution.
In the setup step, wannier90 reads the wannier90.win file and returns some of the data it contains, such as the indices of the excluded bands, the projections, etc.
This wannier90.win file is created by VASP before the setup call.
There is a small chicken-and-egg problem here: to know the number of bands (num_bands), we first need to read which bands to exclude (exclude_bands) from the wannier90.win file.
To solve this, we first write a wannier90.win with num_bands = NBANDS. Then, before the execution step, we rewrite it with num_bands = NBANDS - num_exclude_bands, where num_exclude_bands is the total number of excluded bands, which we read in the setup step.
Note that wannier90 only needs to use num_bands in the execution step.

Now, it is between the setup and execution steps that VASP computes the AMN and MMN data and writes them to the AMN and MMN files.
If the code goes OOM there, then the wannier90.win file will not be updated with the correct num_bands.
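
If you want to check a wannier90.win file for this yourself, here is a small sketch that applies the num_bands = NBANDS - num_exclude_bands rule described above. `count_excluded` and `check_num_bands` are hypothetical helpers written for this post, not part of VASP or wannier90, and the parser only handles comma-separated entries and ranges like the ones in your file.

```python
import re


def count_excluded(spec: str) -> int:
    """Count the bands in an exclude_bands list such as '1-76,83-128'."""
    bands = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-"))
            bands.update(range(first, last + 1))
        else:
            bands.add(int(part))
    return len(bands)


def check_num_bands(win_text: str, nbands: int) -> str:
    """Classify num_bands in a .win file as 'final', 'stale' or 'unexpected'."""
    excl = re.search(r"^\s*exclude_bands\s*[=:]?\s*(.+)$", win_text, re.MULTILINE)
    numb = re.search(r"^\s*num_bands\s*[=:]?\s*(\d+)", win_text, re.MULTILINE)
    if numb is None:
        raise ValueError("num_bands not found")
    n_excluded = count_excluded(excl.group(1)) if excl else 0
    num_bands = int(numb.group(1))
    if num_bands == nbands - n_excluded:
        return "final"       # rewritten before the execution step
    if num_bands == nbands:
        return "stale"       # still the value written before the setup call
    return "unexpected"


win = "exclude_bands = 1-76,83-128\nnum_bands = 128\nnum_wann = 6\n"
print(count_excluded("1-76,83-128"), check_num_bands(win, nbands=128))
```

With your settings, 122 of the 128 bands are excluded, so a file from a run that reached the execution step should show num_bands = 6, which matches what you saw on the machine with more memory.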

I hope this answers your question!


francesco_martinelli
Newbie
Posts: 2
Joined: Fri Feb 16, 2024 12:58 pm

Re: OOM during Wannier90 AMN computation

#5 Post by francesco_martinelli » Mon Nov 18, 2024 9:58 am

Great! Thank you for your explanation!

