Labels#

class metatensor.torch.Labels(names: str | List[str] | Tuple[str, ...], values: Tensor)[source]#

A set of labels carrying metadata associated with a TensorMap.

The metadata can be thought of as a list of tuples, where each value in a tuple also has an associated dimension name. In practice, the dimension names are stored separately from the values, and the values are stored in a 2-dimensional array of integers with the shape (n_entries, n_dimensions). Each row/entry in this array is unique, and the rows are often (but not always) sorted in lexicographic order.
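
The data model described above can be illustrated without the library. The following is a minimal pure-Python sketch, not the metatensor implementation: plain tuples stand in for the rows of the values array, and the variable names are chosen for illustration only.

```python
# Dimension names are kept separately from the values.
names = ["structure", "atom", "species_center"]

# One tuple per entry; together they form an (n_entries, n_dimensions) table.
rows = [(0, 1, 8), (0, 2, 1), (0, 5, 1)]

# Every row must have one value per dimension.
has_valid_shape = all(len(row) == len(names) for row in rows)

# Each row/entry is unique.
is_unique = len(set(rows)) == len(rows)

# Python compares tuples lexicographically, so this checks the sort order.
is_sorted = rows == sorted(rows)

# The "list of named tuples" view of the same metadata.
as_named = [dict(zip(names, row)) for row in rows]
print(as_named[0])
```

Here the rows happen to be sorted; as noted above, this is common but not guaranteed for Labels in general.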

See also

The pure Python version of this class metatensor.Labels, and the differences between TorchScript and Python API for metatensor.

>>> import torch
>>> from metatensor.torch import Labels
>>> labels = Labels(
...     names=["structure", "atom", "species_center"],
...     values=torch.tensor([(0, 1, 8), (0, 2, 1), (0, 5, 1)]),
... )
>>> print(labels)
Labels(
    structure  atom  species_center
        0       1          8
        0       2          1
        0       5          1
)
>>> labels.names
['structure', 'atom', 'species_center']
>>> labels.values
tensor([[0, 1, 8],
        [0, 2, 1],
        [0, 5, 1]], dtype=torch.int32)

It is possible to create a view inside a Labels, selecting only a subset of columns/dimensions:

>>> # single dimension
>>> view = labels.view("atom")
>>> view.names
['atom']
>>> view.values
tensor([[1],
        [2],
        [5]], dtype=torch.int32)
>>> # multiple dimensions
>>> view = labels.view(["atom", "structure"])
>>> view.names
['atom', 'structure']
>>> view.values
tensor([[1, 0],
        [2, 0],
        [5, 0]], dtype=torch.int32)
>>> view.is_view()
True
>>> # we can convert a view back to a full, owned Labels
>>> owned_labels = view.to_owned()
>>> owned_labels.is_view()
False

One can also iterate over the entries of a Labels, or index it directly to get a specific entry:

>>> entry = labels[0]  # or labels.entry(0)
>>> entry.names
['structure', 'atom', 'species_center']
>>> entry.values
tensor([0, 1, 8], dtype=torch.int32)
>>> for entry in labels:
...     print(entry)
...
LabelsEntry(structure=0, atom=1, species_center=8)
LabelsEntry(structure=0, atom=2, species_center=1)
LabelsEntry(structure=0, atom=5, species_center=1)

Or get all the values associated with a given dimension/column name:

>>> labels.column("atom")
tensor([1, 2, 5], dtype=torch.int32)
>>> labels["atom"]  # alternative syntax for the above
tensor([1, 2, 5], dtype=torch.int32)

Labels can be checked for equality:

>>> owned_labels == labels
False
>>> labels == labels
True

Finally, it is possible to check if a value is inside (non-view) labels, and get the corresponding position:

>>> labels.position([0, 2, 1])
1
>>> print(labels.position([0, 2, 4]))
None
>>> (0, 2, 4) in labels
False
>>> labels[2] in labels
True
Parameters:
  • names (str | List[str] | Tuple[str, ...]) – names of the dimensions in the new labels. A single string is transformed into a list with one element, i.e. names="a" is the same as names=["a"].

  • values (Tensor) – values of the labels; this must be a 2-dimensional array of integers.

property names: List[str]#

names of the dimensions for these Labels

property values: Tensor#

Values associated with each dimension of the Labels, stored as a 2-dimensional tensor of 32-bit integers.

Warning

The values should be treated as immutable/read-only (we would like to enforce this automatically, but PyTorch cannot mark a torch.Tensor as immutable).

Any modification to this tensor can break the underlying data structure, or make it out of sync with the values.

static single() Labels[source]#

Create Labels to use when there is no relevant metadata and only one entry in the corresponding dimension (e.g. keys when a tensor map contains a single block).

Return type:

Labels

static empty(names: str | List[str] | Tuple[str, ...]) Labels[source]#

Create Labels with given names but no values.

Parameters:

names (str | List[str] | Tuple[str, ...]) – names of the dimensions in the new labels. A single string is transformed into a list with one element, i.e. names="a" is the same as names=["a"].

Return type:

Labels

static range(name: str, end: int) Labels[source]#

Create Labels with a single dimension using the given name and values in the [0, end) range.

Parameters:
  • name (str) – name of the single dimension in the new labels.

  • end (int) – end of the range for labels

Return type:

Labels

>>> from metatensor.torch import Labels
>>> labels = Labels.range("dummy", 7)
>>> labels.names
['dummy']
>>> labels.values
tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5],
        [6]], dtype=torch.int32)
__len__() int[source]#

number of entries in these labels

Return type:

int

__getitem__(dimension: str) Tensor[source]#
__getitem__(index: int) LabelsEntry

When indexing with a string, get the values for the corresponding dimension as a 1-dimensional array (i.e. Labels.column()).

When indexing with an integer, get the corresponding row/labels entry (i.e. Labels.entry()).

See also Labels.view() to extract the values associated with multiple columns/dimensions.
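
The two indexing modes can be sketched as a dispatch on the type of the key. TinyLabels below is a hypothetical pure-Python stand-in written for illustration; it is not the metatensor class and returns plain lists and tuples instead of tensors and LabelsEntry objects.

```python
from typing import List, Tuple, Union


class TinyLabels:
    """Hypothetical stand-in for Labels, for illustration only."""

    def __init__(self, names: List[str], rows: List[Tuple[int, ...]]):
        self.names = names
        self.rows = rows

    def __getitem__(self, key: Union[str, int]):
        if isinstance(key, str):
            # String key: return one column as a flat list of values.
            column = self.names.index(key)
            return [row[column] for row in self.rows]
        # Integer key: return one row, i.e. a single entry.
        return self.rows[key]


labels = TinyLabels(
    ["structure", "atom", "species_center"],
    [(0, 1, 8), (0, 2, 1), (0, 5, 1)],
)
atoms = labels["atom"]  # column lookup: [1, 2, 5]
first = labels[0]       # row lookup: (0, 1, 8)
print(atoms, first)
```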

__contains__(entry: LabelsEntry | Tensor | List[int] | Tuple[int, ...]) bool[source]#

check if these Labels contain the given entry

Parameters:

entry (LabelsEntry | Tensor | List[int] | Tuple[int, ...]) –

Return type:

bool

__eq__(other: Labels) bool[source]#

check if two sets of labels are equal (same dimension names and same values)

Parameters:

other (Labels) –

Return type:

bool

__ne__(other: Labels) bool[source]#

check if two sets of labels are not equal (different dimension names or different values)

Parameters:

other (Labels) –

Return type:

bool

static load(path: str) Labels[source]#

Load a serialized Labels from the file at path, this is equivalent to metatensor.torch.load_labels().

Parameters:

path (str) – Path of the file containing saved Labels

Return type:

Labels

Warning

PyTorch can execute static functions (like this one) coming from a TorchScript extension, but fails when trying to save code calling this function with torch.jit.save(), giving the following error:

Failed to downcast a Function to a GraphFunction

This issue is reported as PyTorch#115639. In the meantime, you should use metatensor.torch.load_labels() instead of this function to save your code to TorchScript.

static load_buffer(buffer: Tensor) Labels[source]#

Load a serialized Labels from an in-memory buffer, this is equivalent to metatensor.torch.load_labels_buffer().

Parameters:

buffer (Tensor) – torch Tensor representing an in-memory buffer

Return type:

Labels

Warning

PyTorch can execute static functions (like this one) coming from a TorchScript extension, but fails when trying to save code calling this function with torch.jit.save(), giving the following error:

Failed to downcast a Function to a GraphFunction

This issue is reported as PyTorch#115639. In the meantime, you should use metatensor.torch.load_labels_buffer() instead of this function to save your code to TorchScript.

save(path: str)[source]#

Save these Labels to a file, this is equivalent to metatensor.torch.save().

Parameters:

path (str) – Path of the file. If the file already exists, it will be overwritten

save_buffer() Tensor[source]#

Save these Labels to an in-memory buffer, this is equivalent to metatensor.torch.save_buffer().

Return type:

Tensor

append(name: str, values: Tensor) Labels[source]#

Append a new dimension to the end of the Labels.

Parameters:
  • name (str) – name of the new dimension

  • values (Tensor) – 1D array of values for the new dimension

Return type:

Labels

>>> import torch
>>> from metatensor.torch import Labels
>>> label = Labels("foo", torch.tensor([[42]]))
>>> print(label)
Labels(
    foo
    42
)
>>> print(label.append(name="bar", values=torch.tensor([10])))
Labels(
    foo  bar
    42   10
)
insert(index: int, name: str, values: Tensor) Labels[source]#

Insert a new dimension before index in the Labels.

Parameters:
  • index (int) – index before which the new dimension is inserted

  • name (str) – name of the new dimension

  • values (Tensor) – 1D array of values for the new dimension

Return type:

Labels

>>> import torch
>>> from metatensor.torch import Labels
>>> label = Labels("foo", torch.tensor([[42]]))
>>> print(label)
Labels(
    foo
    42
)
>>> print(label.insert(0, name="bar", values=torch.tensor([10])))
Labels(
    bar  foo
    10   42
)
permute(dimensions_indexes: List[int]) Labels[source]#

Permute dimensions according to dimensions_indexes in the Labels.

Parameters:

dimensions_indexes (List[int]) – desired ordering of the dimensions

Raises:
  • ValueError – if the length of dimensions_indexes does not match the number of dimensions in the Labels

  • ValueError – if duplicate values are present in dimensions_indexes

Return type:

Labels

>>> import torch
>>> from metatensor.torch import Labels
>>> label = Labels(["foo", "bar", "baz"], torch.tensor([[42, 10, 3]]))
>>> print(label)
Labels(
    foo  bar  baz
    42   10    3
)
>>> print(label.permute([2, 0, 1]))
Labels(
    baz  foo  bar
     3   42   10
)
remove(name: str) Labels[source]#

Remove name from the dimensions of the Labels.

Removal is only possible if every entry of the resulting Labels is still unique.

Parameters:

name (str) – name to be removed

Raises:

ValueError – if the name is not present.

Return type:

Labels

>>> import torch
>>> from metatensor.torch import Labels
>>> label = Labels(["foo", "bar"], torch.tensor([[42, 10]]))
>>> print(label)
Labels(
    foo  bar
    42   10
)
>>> print(label.remove(name="bar"))
Labels(
    foo
    42
)

If the entries of the new Labels are not unique, an error is raised.

>>> label = Labels(["foo", "bar"], torch.tensor([[42, 10], [42, 11]]))
>>> print(label)
Labels(
    foo  bar
    42   10
    42   11
)
>>> try:
...     label.remove(name="bar")
... except RuntimeError as e:
...     print(e)
...
invalid parameter: can not have the same label value multiple time: [42] is already present at position 0
rename(old: str, new: str) Labels[source]#

Rename the old dimension to new in the Labels.

Parameters:
  • old (str) – name to be replaced

  • new (str) – name after the replacement

Raises:

ValueError – if old is not present.

Return type:

Labels

>>> import torch
>>> from metatensor.torch import Labels
>>> label = Labels("foo", torch.tensor([[42]]))
>>> print(label)
Labels(
    foo
    42
)
>>> print(label.rename("foo", "bar"))
Labels(
    bar
    42
)
to(device: str | device) Labels[source]#

move the values for these Labels to the given device

Parameters:

device (str | device) –

Return type:

Labels

position(entry: LabelsEntry | Tensor | List[int] | Tuple[int, ...]) int | None[source]#

Get the position of the given entry in this set of Labels, or None if the entry is not present in the labels.

Parameters:

entry (LabelsEntry | Tensor | List[int] | Tuple[int, ...]) –

Return type:

int | None
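
The behaviour of position() can be mimicked in pure Python. This is a sketch under the assumption that entries are looked up through a hash map built once from the unique rows; the helper names build_index and position are hypothetical, and the library may implement the lookup differently.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Entry = Tuple[int, ...]


def build_index(rows: List[Entry]) -> Dict[Entry, int]:
    """Map each unique row to its position (hypothetical helper)."""
    return {row: i for i, row in enumerate(rows)}


def position(index: Dict[Entry, int], entry: Sequence[int]) -> Optional[int]:
    """Return the position of `entry`, or None if it is not present."""
    return index.get(tuple(entry))


rows = [(0, 1, 8), (0, 2, 1), (0, 5, 1)]
index = build_index(rows)

found = position(index, [0, 2, 1])    # 1
missing = position(index, [0, 2, 4])  # None
print(found, missing)
```

Because every row is unique, each entry maps to exactly one position, which is what makes a dictionary suitable here.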

union(other: Labels) Labels[source]#

Take the union of these Labels with other.

If you want to know where entries in self and other end up in the union, you can use Labels.union_and_mapping().

Parameters:

other (Labels) –

Return type:

Labels

union_and_mapping(other: Labels) Tuple[Labels, Tensor, Tensor][source]#

Take the union of these Labels with other.

This function also returns the position in the union where each entry of the input Labels ended up.

Returns:

Tuple containing the union, a torch.Tensor containing the position in the union of the entries from self, and a torch.Tensor containing the position in the union of the entries from other.

Parameters:

other (Labels) –

Return type:

Tuple[Labels, Tensor, Tensor]
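
The meaning of the two mappings can be sketched in pure Python. This is not the library implementation: lists of tuples stand in for Labels, and the sketch assumes that the entries of the first set keep their order and that new entries from the second set are appended after them.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, ...]


def union_and_mapping(
    first: List[Entry], second: List[Entry]
) -> Tuple[List[Entry], List[int], List[int]]:
    """Union of two lists of unique entries, plus where each input entry landed."""
    positions: Dict[Entry, int] = {}  # entry -> position in the union
    union: List[Entry] = []
    for entry in first + second:
        if entry not in positions:
            positions[entry] = len(union)
            union.append(entry)

    first_mapping = [positions[entry] for entry in first]
    second_mapping = [positions[entry] for entry in second]
    return union, first_mapping, second_mapping


first = [(0, 1), (1, 2), (0, 3)]
second = [(0, 3), (1, 3), (1, 2)]
union, first_mapping, second_mapping = union_and_mapping(first, second)

print(union)           # [(0, 1), (1, 2), (0, 3), (1, 3)]
print(first_mapping)   # [0, 1, 2]
print(second_mapping)  # [2, 3, 1]
```

Reading the result: second_mapping[0] == 2 means that the first entry of second, (0, 3), sits at position 2 of the union.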

intersection(other: Labels) Labels[source]#

Take the intersection of these Labels with other.

If you want to know where entries in self and other end up in the intersection, you can use Labels.intersection_and_mapping().

Parameters:

other (Labels) –

Return type:

Labels

intersection_and_mapping(other: Labels) Tuple[Labels, Tensor, Tensor][source]#

Take the intersection of these Labels with other.

This function also returns the position in the intersection where each entry of the input Labels ended up.

Returns:

Tuple containing the intersection, a torch.Tensor containing the position in the intersection of the entries from self, and a torch.Tensor containing the position in the intersection of the entries from other. If entries in self or other are not used in the output, the mapping for them is set to -1.

Parameters:

other (Labels) –

Return type:

Tuple[Labels, Tensor, Tensor]
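
The -1 convention for unused entries can be sketched in pure Python. As before, this is not the library implementation: lists of tuples stand in for Labels, and the sketch assumes that the intersection keeps the order of the entries in the first set.

```python
from typing import Dict, List, Tuple

Entry = Tuple[int, ...]


def intersection_and_mapping(
    first: List[Entry], second: List[Entry]
) -> Tuple[List[Entry], List[int], List[int]]:
    """Intersection of two lists of unique entries, plus input-to-output mappings."""
    in_second = set(second)
    positions: Dict[Entry, int] = {}  # entry -> position in the intersection
    intersection: List[Entry] = []
    for entry in first:
        if entry in in_second:
            positions[entry] = len(intersection)
            intersection.append(entry)

    # Entries that are not part of the intersection are mapped to -1.
    first_mapping = [positions.get(entry, -1) for entry in first]
    second_mapping = [positions.get(entry, -1) for entry in second]
    return intersection, first_mapping, second_mapping


first = [(0, 1), (1, 2), (0, 3)]
second = [(0, 3), (1, 3), (1, 2)]
common, first_mapping, second_mapping = intersection_and_mapping(first, second)

print(common)          # [(1, 2), (0, 3)]
print(first_mapping)   # [-1, 0, 1]
print(second_mapping)  # [1, -1, 0]
```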

print(max_entries: int, indent: int) str[source]#

print these Labels to a string

Parameters:
  • max_entries (int) – how many entries to print, use -1 to print everything

  • indent (int) – indent the output by indent spaces

Return type:

str

entry(index: int) LabelsEntry[source]#

get a single entry in these labels, see also Labels.__getitem__()

Parameters:

index (int) –

Return type:

LabelsEntry

column(dimension: str) Tensor[source]#

Get the values associated with a single dimension in these labels (i.e. a single column of Labels.values) as a 1-dimensional array.

See also

Labels.__getitem__() as the main way to use this function

Labels.view() to access multiple columns simultaneously

Parameters:

dimension (str) –

Return type:

Tensor

view(dimensions: str | List[str] | Tuple[str, ...]) Labels[source]#

get a view for the specified columns in these labels, see also Labels.__getitem__()

Parameters:

dimensions (str | List[str] | Tuple[str, ...]) –

Return type:

Labels

is_view() bool[source]#

are these labels a view inside another set of labels?

A view is created with Labels.__getitem__() or Labels.view(), and does not implement Labels.position() or Labels.__contains__().

Return type:

bool

to_owned() Labels[source]#

convert a view to owned labels, which implement the full API

Return type:

Labels

class metatensor.torch.LabelsEntry[source]#

A single entry (i.e. row) in a set of Labels.

The main way to create a LabelsEntry is to index a Labels or iterate over it.

>>> import torch
>>> from metatensor.torch import Labels
>>> labels = Labels(
...     names=["structure", "atom", "species_center"],
...     values=torch.tensor([(0, 1, 8), (0, 2, 1), (0, 5, 1)]),
... )
>>> entry = labels[0]  # or labels.entry(0)
>>> entry.names
['structure', 'atom', 'species_center']
>>> entry.values
tensor([0, 1, 8], dtype=torch.int32)

Warning

Due to limitations in TorchScript, the LabelsEntry implementation of __hash__ uses the default Python one, returning the id() of the object. If you want to use LabelsEntry as keys in a dictionary, convert them to a tuple first (tuple(entry)), or to a string (str(entry)), since TorchScript does not support tuples as dictionary keys anyway.
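
The practical consequence of an identity-based hash can be shown with a small pure-Python sketch. EntryLike is a hypothetical stand-in that compares by value but hashes by identity, mirroring the behaviour described in this warning; it is not the real LabelsEntry class.

```python
class EntryLike:
    """Hypothetical stand-in: equality by value, hash by object identity."""

    def __init__(self, values):
        self.values = list(values)

    def __iter__(self):
        return iter(self.values)

    def __eq__(self, other):
        return self.values == other.values

    # Keep the default, identity-based hash.
    __hash__ = object.__hash__


a = EntryLike([0, 1, 8])
b = EntryLike([0, 1, 8])  # same values as `a`, but a different object

# Keyed by object: `b` is not found even though a == b.
by_object = {a: "first"}
found_by_object = b in by_object

# Keyed by tuple: lookups depend only on the values.
by_tuple = {tuple(a): "first"}
found_by_tuple = tuple(b) in by_tuple

print(found_by_object, found_by_tuple)  # False True
```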

property names: List[str]#

names of the dimensions for this Labels entry

property values: Tensor#

Values associated with each dimension of this LabelsEntry, stored as 32-bit integers.

Warning

The values should be treated as immutable/read-only (we would like to enforce this automatically, but PyTorch cannot mark a torch.Tensor as immutable).

Any modification to this tensor can break the underlying data structure, or make it out of sync with the values.

print() str[source]#

print this entry as a named tuple (i.e. (key_1=value_1, key_2=value_2))

Return type:

str

__len__() int[source]#

number of dimensions in this labels entry

Return type:

int

__getitem__(dimension: str | int) int[source]#

get the value associated with the dimension in this entry

Parameters:

dimension (str | int) –

Return type:

int

__eq__(other: LabelsEntry) bool[source]#

check if self and other are equal (same dimensions/names and same values)

Parameters:

other (LabelsEntry) –

Return type:

bool

__ne__(other: LabelsEntry) bool[source]#

check if self and other are not equal (different dimensions/names or different values)

Parameters:

other (LabelsEntry) –

Return type:

bool