Classifier (Experimental)

The Classifier architecture is an experimental “wrapper” architecture that enables classification tasks on top of pre-trained atomistic models. It takes a pre-trained checkpoint, freezes its backbone, and trains a small multi-layer perceptron (MLP) on top of the features extracted from the backbone.

The model extracts per-atom features from the frozen backbone, averages them into a system-level representation, and passes this representation through the MLP for classification. Targets should be provided as class-probability vectors; both one-hot encodings (e.g., [1.0, 0.0, 0.0]) and soft/fractional targets (e.g., [0.7, 0.2, 0.1]) are supported. The loss function is a standard cross-entropy loss for classification.
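
To make the data flow concrete, here is a minimal, illustrative sketch in plain PyTorch of the pooling, MLP head, and loss described above. It is not the metatrain implementation; the sizes (5 atoms, 128 backbone features, 3 classes) and the SiLU activation are arbitrary choices for illustration.

import torch

# Per-atom features from a frozen backbone (random placeholders here:
# 5 atoms with 128 features each).
per_atom_features = torch.randn(5, 128)

# Average over atoms to obtain a single system-level representation.
system_features = per_atom_features.mean(dim=0)

# Small MLP head mapping the pooled features to 3 class logits.
mlp = torch.nn.Sequential(
    torch.nn.Linear(128, 64),
    torch.nn.SiLU(),
    torch.nn.Linear(64, 64),
    torch.nn.SiLU(),
    torch.nn.Linear(64, 3),
)
logits = mlp(system_features)

# Cross-entropy loss against class-probability targets; both one-hot and
# soft/fractional vectors work here.
target = torch.tensor([0.7, 0.2, 0.1])
loss = torch.nn.functional.cross_entropy(logits.unsqueeze(0), target.unsqueeze(0))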

The last entry of hidden_sizes can be set to a small value if the goal is to extract features for low-dimensional visualization and/or collective variables.

Installation

To install this architecture along with the metatrain package, run:

pip install metatrain[classifier]

where the square brackets indicate that you want to install the optional dependencies required by the classifier architecture.

Default Hyperparameters

All hyperparameters used by the classifier architecture are described further down this page. However, here is a yaml file containing all the default hyperparameters, which can be a convenient starting point for creating your own hyperparameter files:

architecture:
  name: experimental.classifier
  model:
    hidden_sizes:
    - 64
    - 64
    feature_layer_index: -1
  training:
    batch_size: 32
    num_epochs: 100
    learning_rate: 0.001
    warmup_fraction: 0.1
    model_checkpoint: null
    log_interval: 1
    checkpoint_interval: 100
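
Assuming these hyperparameters are saved as part of a full options.yaml file (which also defines the training systems and the classification targets, as described in the general metatrain documentation), training can then be launched with the metatrain command-line interface:

mtt train options.yaml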

Model hyperparameters

The parameters that go under the architecture.model section of the config file are the following:

ModelHypers.hidden_sizes: list[int] = [64, 64]

List of hidden layer sizes for the MLP. For example, [64, 32] creates a 2-layer MLP with 64 and 32 neurons respectively. The last hidden size should be set to a small number (generally one or two) if the goal is to extract collective variables.

ModelHypers.feature_layer_index: int = -1

Index of the MLP layer to be mapped to the ‘features’ output. Can be negative to index from the end. Default is -1 (the output layer).
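
As an illustration of these two options, the following hypothetical model section builds a three-layer MLP whose last hidden layer has only two neurons, which is a typical setup for extracting two-dimensional features for visualization or collective variables:

architecture:
  name: experimental.classifier
  model:
    # Three-layer MLP; the small last hidden layer yields two-dimensional
    # features suitable for visualization or collective variables.
    hidden_sizes:
    - 64
    - 64
    - 2
    # Negative values index layers from the end; adjust this value if the
    # 'features' output should be taken from a different layer.
    feature_layer_index: -1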

Trainer hyperparameters

The parameters that go under the architecture.training section of the config file are the following:

TrainerHypers.batch_size: int = 32

Batch size for training.

TrainerHypers.num_epochs: int = 100

Number of training epochs.

TrainerHypers.learning_rate: float = 0.001

Learning rate for the optimizer.

TrainerHypers.warmup_fraction: float = 0.1

Fraction of total training steps used for learning rate warmup. The learning rate increases linearly from 0 to the base learning rate during this period, then follows a cosine annealing schedule.
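
The following is an illustrative sketch (plain Python, not the metatrain implementation) of the schedule described above: a linear ramp over the warmup fraction of the steps, followed by cosine annealing. Annealing to zero is an assumption made for the sketch.

import math

def lr_at_step(step, total_steps, base_lr=1e-3, warmup_fraction=0.1):
    """Sketch: linear warmup, then cosine annealing (to zero, by assumption)."""
    warmup_steps = int(warmup_fraction * total_steps)
    if step < warmup_steps:
        # Linear increase from 0 to the base learning rate.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))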

TrainerHypers.model_checkpoint: str | None = None

Path to the pre-trained model checkpoint. This checkpoint’s backbone will be frozen and used for feature extraction.
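
For example, with a hypothetical checkpoint path, the training section of the options file would contain:

architecture:
  name: experimental.classifier
  training:
    # Hypothetical path; point this at the checkpoint of the pre-trained
    # model whose frozen backbone should provide the features.
    model_checkpoint: pretrained-model.ckpt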

TrainerHypers.log_interval: int = 1

Interval for logging training progress (in epochs).

TrainerHypers.checkpoint_interval: int = 100

Interval for saving checkpoints during training (in epochs).
