Running long ML-AIMD trajectories; memory limitations

Queries about input and output files, running specific calculations, etc.



kirk3141
Newbie
Posts: 10
Joined: Mon Oct 24, 2022 3:31 pm

Running long ML-AIMD trajectories; memory limitations

#1 Post by kirk3141 » Mon Mar 03, 2025 11:02 pm

Hi

I have a decent VASP MLFF for LiCoO2, built from about 2000 reference configurations, and I plan to use it to run some long MD trajectories at temperature. As a comparison, I'd like to do the same thing with ML-AIMD with training left on, but allow VASP to take up to 1000 (or more) MD steps between DFT calculations, so that the DFT "ground truth" remains available in case the MLFF error crosses the threshold. Training my MLFF already consumed a lot of memory (which could probably be improved, given that I am new at this), but if I run ML-AIMD for a long time, say ~300 ps or more, with training on so that DFT is available when needed, I assume I'm likely to run out of memory eventually. Does anyone have suggestions for keeping memory under control in this situation?

Thanks much
Kirk


ferenc_karsai
Global Moderator
Posts: 530
Joined: Mon Nov 04, 2019 12:44 pm

Re: Running long ML-AIMD trajectories; memory limitations

#2 Post by ferenc_karsai » Tue Mar 04, 2025 9:22 am

I would not do production calculations with training switched on:

1) If sampling of data and refitting can happen during your production calculation, it is equivalent to changing your Hamiltonian during the production run. From a statistical-physics point of view this is a no-go.
2) You would have to use on-the-fly training. This mode (ML_MODE=train) uses a different algorithm than the force-field-only mode (ML_MODE=run after refitting with ML_MODE=refit) and is 20-100 times slower.

I would suggest running pure force-field calculations with ML_MODE=run and enabling monitoring of the extrapolation of the force field via the spilling factor. Usually it is enough to evaluate the error every 20-100 MD steps, since the atoms do not move much in between; for that, set something like ML_IERR=50 (note that in the upcoming release ML_IERR will be renamed to ML_ESTBLOCK). The spilling factor is calculated for every atom; it approaches 0 when the force field is interpolating for the current structure, and it rather quickly approaches 1 when the force field has to extrapolate. So you would monitor for spikes during your trajectory and can then manually add the structures that showed strong spiking to your training set and refit.
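A minimal INCAR fragment for this kind of production run might look like the sketch below. ML_LMLFF, ML_MODE, and ML_IERR are the tags discussed above; the value of 50 is just the example interval suggested here, and other MD tags (IBRION, NSW, POTIM, thermostat settings) are assumed to be set as usual for your system.

```
ML_LMLFF = .TRUE.   ! enable machine-learned force fields
ML_MODE  = run      ! force-field-only mode, no on-the-fly training
ML_IERR  = 50       ! evaluate the spilling factor every 50 MD steps
```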


kirk3141
Newbie
Posts: 10
Joined: Mon Oct 24, 2022 3:31 pm

Re: Running long ML-AIMD trajectories; memory limitations

#3 Post by kirk3141 » Tue Mar 04, 2025 10:12 pm

Ok thank you so much for your very helpful perspective.

Also, just to confirm: if the FF is interpolating between reference configurations for the current structure, the spilling factor approaches 0, but if it is extrapolating, it quickly approaches 1?

Thank you!


ferenc_karsai
Global Moderator
Posts: 530
Joined: Mon Nov 04, 2019 12:44 pm

Re: Running long ML-AIMD trajectories; memory limitations

#4 Post by ferenc_karsai » Wed Mar 05, 2025 11:21 am

Yes, if the current structure is more or less "contained" among the local reference configurations (i.e. the force field is interpolating), the spilling factor approaches 0, and if it has to extrapolate, it approaches 1.

Please mind that the spilling factor depends (quite strongly) on the element types and the structure, so absolute values are hard to compare between systems. It's best to monitor the change of the spilling factor during a calculation.
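As a rough post-processing sketch, spikes relative to a running baseline can be flagged like this. The function name, window size, and threshold factor are illustrative assumptions, not part of VASP; it assumes you have already extracted one spilling-factor value per monitored step into a plain list of floats.

```python
from statistics import median

def find_spikes(spilling, window=20, factor=5.0, floor=1e-4):
    """Return the indices of monitored steps whose spilling factor jumps
    above `factor` times the median of the preceding `window` values.
    A small `floor` prevents division-like blow-ups when the baseline
    is essentially zero (a well-interpolating force field).
    Note: window/factor/floor are ad-hoc choices, not VASP defaults."""
    spikes = []
    for i in range(window, len(spilling)):
        baseline = max(median(spilling[i - window:i]), floor)
        if spilling[i] > factor * baseline:
            spikes.append(i)
    return spikes
```

For example, a flat trajectory with one sudden excursion, `find_spikes([0.001] * 30 + [0.5] + [0.001] * 10)`, flags step 30 as a candidate structure to add to the training set before refitting.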


kirk3141
Newbie
Posts: 10
Joined: Mon Oct 24, 2022 3:31 pm

Re: Running long ML-AIMD trajectories; memory limitations

#5 Post by kirk3141 » Wed Mar 05, 2025 10:03 pm

Ok, excellent, will do ... thank you!

