Finetuning example
Warning
Finetuning is currently only available for the PET architecture.
This is a simple example of fine-tuning PET-MAD (or any PET model) that can be used as a template for fine-tuning with metatrain.
Fine-tuning a pretrained model allows you to obtain a model better suited to your specific system. You need to provide a dataset of structures that have been evaluated at a higher reference level of theory, usually DFT. Fine-tuning a universal model such as PET-MAD yields reasonable performance even when little training data is available.
Fine-tuning requires passing a pre-trained model checkpoint to the mtt train command and setting the targets corresponding to the new level of theory in the options.yaml file.
To obtain a pretrained model, you can download a PET-MAD checkpoint from Hugging Face:
wget https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt
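If you prefer to fetch the checkpoint from a script, the huggingface_hub Python package offers an equivalent download. This is a minimal sketch, assuming the repository layout implied by the URL above (repository lab-cosmo/pet-mad, file models/pet-mad-v1.1.0.ckpt, revision v1.1.0):

from huggingface_hub import hf_hub_download

# download pet-mad-v1.1.0.ckpt into the local Hugging Face cache and get its path
checkpoint_path = hf_hub_download(
    repo_id="lab-cosmo/pet-mad",
    filename="models/pet-mad-v1.1.0.ckpt",
    revision="v1.1.0",
)
print(checkpoint_path)  # use this path as the finetune read_from entry in options.yaml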
Next, we set up the options.yaml file. We specify the fine-tuning method in the finetune block of the training options of the architecture. Here, the basic full option is chosen, which fine-tunes all weights of the model.
All available fine-tuning methods are listed in the advanced concepts section Fine-tuning, which discusses implementation details, options and recommended use cases. Other fine-tuning methods can be substituted in this example simply by changing the finetune block, as sketched below.
Furthermore, you need to specify the checkpoint that you want to fine-tune in the read_from option.
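As an illustration of such a substitution, a replacement finetune block might look like the following sketch. The lora method name and its rank and alpha options are assumptions made here for illustration; consult the Fine-tuning section for the method names and options your metatrain version actually supports.

finetune:
  # hypothetical substitute for the full method used in this example;
  # the method name and config keys are assumptions, check the Fine-tuning section
  method: lora
  read_from: path/to/checkpoint.ckpt
  config:
    rank: 4      # assumed low-rank adapter dimension
    alpha: 0.5   # assumed scaling factor for the adapter weights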
Training on a new level of theory is a common use case for transfer learning. A simple options.yaml file for this task could look like this:
architecture:
  name: pet
  training:
    num_epochs: 1000
    learning_rate: 1e-5
    finetune:
      method: full
      read_from: path/to/checkpoint.ckpt

training_set:
  systems:
    read_from: dataset.xyz
    reader: ase
    length_unit: angstrom
  targets:
    energy:
      quantity: energy
      read_from: dataset.xyz
      reader: ase
      key: energy
      unit: eV
      forces:
        read_from: dataset.xyz
        reader: ase
        key: forces
      stress:
        read_from: dataset.xyz
        reader: ase
        key: stress

test_set: 0.1
validation_set: 0.1
In this example, we specified generic but reasonable num_epochs
and learning_rate
parameters. The learning_rate
is chosen to be relatively low to stabilise
training.
Warning
Note that in targets we use the PET-MAD energy head. This means that no new head is created for the reference energies provided in your dataset, which can lead to poor performance if they differ from the energies used in pretraining (a different level of theory, or a different electronic-structure code). In that case it is recommended to create a new energy target for the new level of theory.
Find more about this in Transfer-Learning.
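As a rough sketch of what such a separate target could look like in options.yaml, assuming that metatrain's mtt:: prefix for custom target names applies here (the target name mtt::new_energy and the key are purely illustrative; see Transfer-Learning for the recommended setup):

targets:
  mtt::new_energy:          # hypothetical target name for the new level of theory
    quantity: energy
    read_from: dataset.xyz
    reader: ase
    key: energy             # assumed key under which the new reference energies are stored
    unit: eV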
We assume that the fine-tuning dataset dataset.xyz stores the energies in the energy key of the info dictionary of each structure. Additionally, forces and stresses should be provided under corresponding keys, which you specify in the options.yaml file under targets.
Further information on specifying targets can be found in Customize a Dataset Configuration.
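Such a dataset can be written with ASE. The following is a minimal sketch, assuming your structures live in a file called my_structures.xyz and with placeholder values standing in for your actual reference results:

import numpy as np
import ase.io

# read the structures you want to attach reference data to (assumed file name)
frames = ase.io.read("my_structures.xyz", index=":")

for atoms in frames:
    # replace the placeholders with your reference DFT energies, forces and stresses
    atoms.info["energy"] = 0.0                            # total energy in eV
    atoms.new_array("forces", np.zeros((len(atoms), 3)))  # per-atom forces in eV/angstrom
    atoms.info["stress"] = np.zeros((3, 3))               # stress tensor

# the extended-XYZ format keeps the info/arrays entries under the keys used in options.yaml
ase.io.write("dataset.xyz", frames)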
Note
It is important that the length_unit is set to angstrom and the energy unit is eV in order to match the units PET-MAD was trained on. If your dataset has different energy units, it is necessary to convert them to eV before fine-tuning.
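As an example of such a conversion, here is a minimal sketch assuming the original data are stored in Hartree and Hartree/Bohr (ase.units provides the conversion factors; adapt the factors and keys to your own dataset):

import ase.io
from ase.units import Bohr, Hartree

frames = ase.io.read("dataset_hartree.xyz", index=":")  # assumed input file name

for atoms in frames:
    atoms.info["energy"] *= Hartree                      # Hartree -> eV
    if "forces" in atoms.arrays:
        atoms.arrays["forces"] *= Hartree / Bohr         # Hartree/Bohr -> eV/angstrom
    if "stress" in atoms.info:
        atoms.info["stress"] *= Hartree / Bohr**3        # Hartree/Bohr^3 -> eV/angstrom^3

ase.io.write("dataset.xyz", frames)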
After setting up your options.yaml file, fine-tuning can then simply be run via mtt train options.yaml.
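Putting it all together, the complete command-line workflow for this example is (make sure the finetune read_from entry in options.yaml points at the downloaded checkpoint file):

# download the pre-trained PET-MAD checkpoint
wget https://huggingface.co/lab-cosmo/pet-mad/resolve/v1.1.0/models/pet-mad-v1.1.0.ckpt
# fine-tune it on dataset.xyz using the options file described above
mtt train options.yaml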
Further fine-tuning examples can be found in the AtomisticCookbook.