-
ivan.kondov@kit.edu
- Newbie

- Posts: 5
- Joined: Tue Dec 17, 2019 3:08 pm
#1
Post
by ivan.kondov@kit.edu » Fri Mar 28, 2025 1:46 pm
I would like to merge two or more ML_AB files coming from training runs of different systems with the same two elements. I strictly followed the approach suggested here [1] using ML_MODE = select. I performed all the described manipulations on the two original ML_AB files and concatenated them into one file. After running with ML_MODE = select, only the configurations of the first original ML_AB file were processed, and the ones from the second original ML_AB file were skipped. The newly produced ML_ABN file includes only the configurations from the first original file.
Maybe I have overlooked some detail not mentioned in the wiki?
[1] https://www.vasp.at/wiki/index.php/ML_A ... L_AB_files
#3
by ivan.kondov@kit.edu » Mon Mar 31, 2025 12:16 pm
Thank you for your prompt answer! I have attached a minimal working example. I omitted only the POTCAR files for license reasons.
-
marie-therese.huebsch
- Full Member

- Posts: 242
- Joined: Tue Jan 19, 2021 12:01 am
#4
Post
by marie-therese.huebsch » Wed Apr 02, 2025 1:41 pm
I am confused because it seems you did not do any of the manipulations. When I look at combined/ML_AB in your example, there are 3 problems.
The configurations need to be enumerated continuously. That is, listing the configuration header lines of the file
should yield
Code:
Configuration num. 1
Configuration num. 2
Configuration num. 3
Configuration num. 4
Configuration num. 5
Configuration num. 6
in your case. But it yields
Code:
Configuration num. 1
Configuration num. 2
Configuration num. 3
Configuration num. 1
Configuration num. 2
Configuration num. 3
in the file you uploaded.
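A check of this kind can be sketched in Python as follows. This is only an illustration, not an official VASP tool: the helper names `configuration_numbers` and `is_continuously_enumerated` are hypothetical, and the pattern assumes that each structure section starts with a line of the form `Configuration num. n`, possibly padded with whitespace.

```python
import re
from typing import Iterable, List

# Assumption: a structure section starts with a line such as
# "Configuration num. 4", with arbitrary surrounding whitespace.
CONFIG_LINE = re.compile(r"^\s*Configuration num\.\s+(\d+)\s*$")


def configuration_numbers(lines: Iterable[str]) -> List[int]:
    """Collect the configuration numbers in the order they appear."""
    numbers = []
    for line in lines:
        match = CONFIG_LINE.match(line)
        if match:
            numbers.append(int(match.group(1)))
    return numbers


def is_continuously_enumerated(numbers: List[int]) -> bool:
    """True if the numbers are exactly 1, 2, ..., len(numbers)."""
    return numbers == list(range(1, len(numbers) + 1))
```

Because the function accepts any iterable of lines, an open file object can be passed directly, so a large ML_AB file does not have to be loaded into memory at once.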
Did you upload the wrong example? Anyway, please let me know if applying the suggested changes solves the issue.
Best regards,
Marie-Therese
#5
by ivan.kondov@kit.edu » Thu Apr 03, 2025 9:17 am
Thank you!
It is the right example. I have followed exactly this text [1]:
"The lists of local reference configurations cannot be easily merged (renumbering would be required). Instead, it is recommended to recalculate them using ML_MODE = select. However, to start with a valid ML_AB file first manually set The numbers of basis sets per atom type to 1 for each species. Also, set the block Basis set for X with dummy value 1 1 for each species. After running with ML_MODE = select the output ML_ABN will contain the selected new local reference configurations for the combined training data."
I will try to do these additional manipulations, but even if they work I cannot imagine editing my production ML_AB files by hand because they are huge.
Would it not be easier to have a script from "first hand" that does the merge, instead of editing extremely large files with complex syntax?
#6
by marie-therese.huebsch » Mon Apr 07, 2025 9:07 am
Dear Ivan,
Sorry for the inconvenience. Currently, there is no in-house script that we can offer. It is important, though, to follow all the details provided in the section on merging ML_AB files:
Multiple ML_AB files may be merged by hand, keeping the following restrictions and tips in mind:
The training structure data can be simply concatenated, i.e., by just adding more structure sections starting with Configuration num. n at the end of the file. However, the structure numbering needs to be updated in such a way that they are enumerated continuously starting from 1.
We strongly advise to group structures with the same number of elements and atoms per element in the training data together, otherwise the code will automatically reorder the data, such that those are sticking together. If one relies on the automatic reordering it will not be possible to easily "diff" the input ML_AB file and its corresponding ML_ABN output file.
The header must be adjusted to reflect the combined number of element types, the maximum number of atoms, etc.
The lists of local reference configurations cannot be easily merged (renumbering would be required). Instead, it is recommended to recalculate them using ML_MODE = select. However, to start with a valid ML_AB file first manually set The numbers of basis sets per atom type to 1 for each species. Also, set the block Basis set for X with dummy value 1 1 for each species. After running with ML_MODE = select the output ML_ABN will contain the selected new local reference configurations for the combined training data.
Tip: If calculations for ML_MODE = select are too time consuming using the default settings it is useful to increase ML_MCONF_NEW to values around 10-16 and set ML_CDOUB = 4. This often accelerates the calculations by a factor of 2-4.
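To illustrate only the renumbering step from the list above, here is a minimal Python sketch. It is not an official VASP script, and it covers just one of the required manipulations: the header and the basis-set blocks still have to be adjusted separately as described. The function name `renumber_configurations` is hypothetical, and the sketch assumes that every structure section starts with a line of the form `Configuration num. n` and that replacing the number in place (without preserving column alignment) is acceptable to the parser.

```python
import re
from typing import Iterable, Iterator

# Assumption: a structure section starts with a line such as
# "Configuration num. 4", with arbitrary surrounding whitespace.
CONFIG_LINE = re.compile(r"^(\s*Configuration num\.\s+)(\d+)(\s*)$")


def renumber_configurations(lines: Iterable[str]) -> Iterator[str]:
    """Yield the lines unchanged, except that configuration header
    lines are renumbered continuously starting from 1."""
    counter = 0
    for line in lines:
        match = CONFIG_LINE.match(line)
        if match:
            counter += 1
            # Keep the original prefix and trailing whitespace (including
            # the newline); only the number itself is replaced.
            yield f"{match.group(1)}{counter}{match.group(3)}"
        else:
            yield line
```

The generator processes one line at a time, so it can be applied to very large files by passing an open input file and writing the result with `writelines` on an output file. Comparing the output against the input with a diff tool is a reasonable sanity check that only the numbering lines have changed.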
We are working on a solution to combine ML_AB files as a feature in VASP. This will be included in a future release. I am afraid I cannot immediately provide a solution for you. In principle, it is relatively straightforward to apply the description if the two ML_AB files you want to merge are known. (It is just hard to write a robust script that can handle all possible cases and provide good error handling.)
Best regards,
Marie-Therese