Generating and training an LLPR-derived shallow ensemble model
This tutorial demonstrates how to generate and, optionally, further train an LLPR-derived shallow ensemble model with metatrain. Building on the LLPR approach, this more advanced technique allows (1) generating a last-layer ensemble model from an LLPR model and (2) gradient-based tuning of the ensemble weights with a negative log-likelihood (NLL) loss, which often improves uncertainty estimates at the cost of additional training.
We first train a baseline model without uncertainties:
device: cpu
base_precision: 64
seed: 42

architecture:
  name: soap_bpnn
  training:
    batch_size: 10
    num_epochs: 10
    learning_rate: 0.01

# Section defining the parameters for system and target data
training_set:
  systems: qm9_reduced_100.xyz
  targets:
    energy:
      key: U0
      unit: hartree # very important to run simulations

validation_set: 0.1
test_set: 0.0
Then we create an LLPR ensemble model. This involves creating the LLPR model itself, which is very cheap (a single pass through the training data without backpropagation), and then sampling last-layer ensemble weights from the LLPR covariance (extremely cheap), as explained in https://arxiv.org/html/2403.02251v1. Specifying num_ensemble_members enables this sampling step in addition to the basic LLPR model.
device: cpu
base_precision: 64
seed: 42

architecture:
  name: llpr
  model:
    num_ensemble_members: {energy: 32} # This enables the LLPR-sampled ensemble
  training:
    model_checkpoint: model.ckpt
    batch_size: 4

# Section defining the parameters for system and target data
training_set:
  systems: qm9_reduced_100.xyz
  targets:
    energy:
      key: U0
      unit: hartree

validation_set: 0.1
test_set: 0.0
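As a rough illustration of the sampling step described above (this is not metatrain's actual implementation; all names, shapes, and values below are made up), ensemble weights can be drawn from a Gaussian centered on the trained last-layer weights, with a covariance given by the scaled inverse of the accumulated last-layer feature covariance:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical sizes: n_features last-layer features, n_members ensemble members
n_features, n_members = 8, 32

# Stand-ins for the trained last-layer weights and the last-layer features
# collected during the single LLPR pass over the training data
w_map = rng.normal(size=n_features)
features = rng.normal(size=(100, n_features))

# Accumulated feature covariance F^T F (regularized for invertibility)
inv_cov = features.T @ features + 1e-6 * np.eye(n_features)

# Sample w_k ~ N(w_map, alpha^2 * (F^T F)^-1); alpha is a calibration factor
alpha = 1.0
cov = alpha**2 * np.linalg.inv(inv_cov)
w_ensemble = rng.multivariate_normal(w_map, cov, size=n_members)

# Each member is then just a different linear head over the same features,
# so the whole ensemble costs barely more than a single forward pass
preds = features @ w_ensemble.T  # shape (n_samples, n_members)
```

The key point is that no additional training is needed: the spread of the sampled heads directly encodes the LLPR uncertainty.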
In addition, you can decide to perform further backpropagation-based training on the resulting shallow ensemble, which is more expensive but can lead to better uncertainty estimates. This is done by setting num_epochs to the number of epochs that you want to train for.
device: cpu
base_precision: 64
seed: 42

architecture:
  name: llpr
  model:
    num_ensemble_members: {energy: 32} # This enables the LLPR-sampled ensemble
  training:
    model_checkpoint: model.ckpt
    batch_size: 4
    num_epochs: 5 # This enables further backpropagation training of the LLPR ensemble

# Section defining the parameters for system and target data
training_set:
  systems: qm9_reduced_100.xyz
  targets:
    energy:
      key: U0
      unit: hartree

validation_set: 0.1
test_set: 0.0
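For intuition, the NLL loss used in this further-training step scores each prediction by the Gaussian likelihood of the target under the ensemble mean and variance, so the optimizer is rewarded for well-calibrated spreads, not just accurate means. A minimal numpy sketch (the actual loss in metatrain may differ in details such as variance floors or extra regularization):

```python
import numpy as np


def gaussian_nll(ensemble_preds, targets, eps=1e-12):
    """Per-sample Gaussian negative log-likelihood from ensemble statistics.

    ensemble_preds: array of shape (n_samples, n_members)
    targets: array of shape (n_samples,)
    """
    mean = ensemble_preds.mean(axis=1)
    var = ensemble_preds.var(axis=1) + eps  # eps guards against zero variance
    # NLL of target y under N(mean, var), up to per-sample terms
    return 0.5 * (np.log(2.0 * np.pi * var) + (targets - mean) ** 2 / var)
```

Minimizing this loss over the ensemble weights shrinks the variance where the mean is accurate and inflates it where errors are large.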
You can train these models yourself with the following code:
import subprocess
import ase.io
import numpy as np
from metatomic.torch import ModelOutput
from metatomic.torch.ase_calculator import MetatomicCalculator
We first train the baseline model without uncertainties, then train the LLPR ensemble models.
# Here, we run training as a subprocess. In practice, you would run this from
# the command line, e.g., ``mtt train options-model.yaml -o model.pt``.
print("Training baseline model...")
subprocess.run(["mtt", "train", "options-model.yaml", "-o", "model.pt"], check=True)
print("Training LLPR ensemble model...")
subprocess.run(
    ["mtt", "train", "options-llpr-ensemble.yaml", "-o", "model-llpr-ens.pt"],
    check=True,
)

print("Training LLPR ensemble model with further backpropagation...")
subprocess.run(
    ["mtt", "train", "options-llpr-ensemble-train.yaml", "-o", "model-llpr-ens-tr.pt"],
    check=True,
)
Training baseline model...
Training LLPR ensemble model...
Training LLPR ensemble model with further backpropagation...
CompletedProcess(args=['mtt', 'train', 'options-llpr-ensemble-train.yaml', '-o', 'model-llpr-ens-tr.pt'], returncode=0)
You can now use the uncertainties from the LLPR, as well as the ensemble model, as follows.
# Load some test structures
structures = ase.io.read("ethanol_reduced_100.xyz", ":5")
# Load the LLPR ensemble model (sampled, without further NLL training)
calc = MetatomicCalculator("model-llpr-ens.pt", extensions_directory="extensions/")
# Get predictions with both ensemble and analytical uncertainties
# (note that all these quantities are also available per-atom with ``per_atom=True``)
predictions = calc.run_model(
    structures,
    {
        "energy": ModelOutput(per_atom=False),
        "energy_uncertainty": ModelOutput(per_atom=False),  # LLPR analytical
        "energy_ensemble": ModelOutput(per_atom=False),  # ensemble predictions
    },
)

energies = predictions["energy"].block().values.squeeze().cpu().numpy()
llpr_uncertainties = (
    predictions["energy_uncertainty"].block().values.squeeze().cpu().numpy()
)
ensemble_predictions = (
    predictions["energy_ensemble"].block().values.squeeze().cpu().numpy()
)
# Calculate ensemble mean and standard deviation
ensemble_mean = np.mean(ensemble_predictions, axis=1)
ensemble_std = np.std(ensemble_predictions, axis=1)
print(f"Energies: {energies}")
print(f"LLPR analytical uncertainties: {llpr_uncertainties}")
print(f"Ensemble mean: {ensemble_mean}")
print(f"Ensemble std: {ensemble_std}")
Energies: [-154.98422258 -154.98419238 -154.98523923 -154.98111695 -154.97973745]
LLPR analytical uncertainties: [0.00816542 0.00742433 0.00558033 0.00600885 0.0057837 ]
Ensemble mean: [-154.98422258 -154.98419238 -154.98523923 -154.98111695 -154.97973745]
Ensemble std: [0.00774784 0.00755035 0.00529778 0.00613911 0.00533268]
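As the printed values suggest, the ensemble standard deviation closely tracks the analytical LLPR uncertainty; with a finite number of members (32 here), the ensemble std is itself a noisy estimator of that uncertainty, so some scatter is expected. A quick consistency check, using the numbers printed above, might look like:

```python
import numpy as np

# Values copied from the output above
llpr = np.array([0.00816542, 0.00742433, 0.00558033, 0.00600885, 0.0057837])
ens = np.array([0.00774784, 0.00755035, 0.00529778, 0.00613911, 0.00533268])

# Relative deviation between the two uncertainty estimates per structure;
# agreement within a few tens of percent is typical for 32 members
rel_dev = np.abs(ens - llpr) / llpr
print(f"Max relative deviation: {rel_dev.max():.3f}")
```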
Total running time of the script: (0 minutes 35.874 seconds)