LLPR
The LLPR architecture is a “wrapper” architecture that enables cheap uncertainty
quantification (UQ) via the last-layer prediction rigidity (LLPR) approach proposed
by Bigi et al. [1] It is compatible with the following metatrain models constructed
from NN-based architectures: PET and SOAP-BPNN. The implementation of the LLPR as a
separate architecture within metatrain allows users to compute uncertainties without
dealing with the fine details of the LLPR implementation.
Installation
To install the LLPR wrapper architecture, run the following from the root of the repository:
    pip install metatrain[llpr]

This will install the architecture along with the necessary dependencies.
Default Hyperparameters
The default hyperparameters for the LLPR model are:
    architecture:
      name: llpr
      model:
        ensembles:
          means: {}
          num_members: {}
      training:
        model_checkpoint: null
        batch_size: 16
        regularizer: null
Under training, the following hyperparameters are defined:
model_checkpoint
: This should provide the checkpoint of the model for which the user wants to perform UQ based on the LLPR approach. Note that the model architecture must expose its last-layer features under the convention defined by metatrain.

batch_size
: This defines the batch size used in the computation of the last-layer features, the covariance matrix, etc.

regularizer
: This is the regularizer value \(\varsigma\) used in applying Eq. 24 of Bigi et al. [1]:

\[\sigma^2_\star = \alpha^2 \boldsymbol{\mathrm{f}}^{\mathrm{T}}_\star (\boldsymbol{\mathrm{F}}^{\mathrm{T}} \boldsymbol{\mathrm{F}} + \varsigma^2 \boldsymbol{\mathrm{I}})^{-1} \boldsymbol{\mathrm{f}}_\star\]

If set to null, the internal routine will determine the smallest regularizer value that guarantees numerical stability of the matrix inversion. Note that the training routine of the LLPR wrapper model also finds the ideal global calibration factor \(\alpha\).
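As an illustration of how these options fit together, below is a minimal sketch of the architecture section of an options file that wraps an already-trained checkpoint with the LLPR architecture. The checkpoint path is a placeholder, and the remaining parts of the options file (dataset and target definitions) follow the usual metatrain conventions and are omitted here.

    architecture:
      name: llpr
      training:
        # placeholder path to the checkpoint of the trained PET or SOAP-BPNN model
        model_checkpoint: model.ckpt
        # batch size used when accumulating last-layer features and the covariance matrix
        batch_size: 16
        # null lets the internal routine pick the smallest numerically stable value
        regularizer: null

Running the metatrain training routine with such an options file should then compute the covariance matrix and the global calibration factor \(\alpha\) for the wrapped model, as described above.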
To perform uncertainty propagation, one can also generate an ensemble of weights
from the calibrated inverse covariance matrix of the LLPR formalism. To access this
feature within the architecture, one can set the following hyperparameters under
ensembles in the model section:
means
: This accepts a dictionary of targets and the names of their corresponding last-layer weights. For example, in the case of an energy target trained with the default energy key in a PET model, the following could be the set of weights to provide:

    means:
      energy:
        - node_last_layers.energy.0.energy___0.weight
        - node_last_layers.energy.1.energy___0.weight
        - edge_last_layers.energy.0.energy___0.weight
        - edge_last_layers.energy.1.energy___0.weight
num_members
: This is a dictionary of targets and the corresponding number of ensemble members to sample. Note that a sufficiently large number of members (more than 16) is required for robust uncertainty propagation (e.g. num_members: {energy: 128}).
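Putting the two together, a sketch of a full ensembles section of the architecture options, reusing the example weight names and member count from above, could look as follows:

    architecture:
      name: llpr
      model:
        ensembles:
          # last-layer weight names for each target
          means:
            energy:
              - node_last_layers.energy.0.energy___0.weight
              - node_last_layers.energy.1.energy___0.weight
              - edge_last_layers.energy.0.energy___0.weight
              - edge_last_layers.energy.1.energy___0.weight
          # number of ensemble members to sample per target
          num_members:
            energy: 128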