Labels

class metatensor.Labels(names: str | Sequence[str], values: ndarray, assume_unique: bool = False)

A set of labels carrying metadata associated with a TensorMap.

The metadata can be thought of as a list of tuples, where each value in the tuple also has an associated dimension name. In practice, the dimension names are stored separately from the values, and the values are stored in a 2-dimensional array of integers with the shape (n_entries, n_dimensions). Each row/entry in this array is unique, and the rows are often (but not always) sorted in lexicographic order.

>>> from metatensor import Labels
>>> import numpy as np
>>> labels = Labels(
...     names=["system", "atom", "center_type"],
...     values=np.array([(0, 1, 8), (0, 2, 1), (0, 5, 1)]),
... )
>>> labels
Labels(
    system  atom  center_type
      0      1         8
      0      2         1
      0      5         1
)
>>> labels.names
['system', 'atom', 'center_type']
>>> print(labels.values)
[[0 1 8]
 [0 2 1]
 [0 5 1]]
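
The layout described above can be mimicked in plain Python to make the two properties of the values array concrete. This is only an illustrative sketch: `names` and `rows` are hypothetical stand-ins for `Labels.names` and `Labels.values`, and nothing here calls metatensor.

```python
# Plain-Python stand-in for the example above (hypothetical variable names,
# not metatensor objects).
names = ["system", "atom", "center_type"]
rows = [(0, 1, 8), (0, 2, 1), (0, 5, 1)]

# The values have the shape (n_entries, n_dimensions).
shape = (len(rows), len(names))

# Every row/entry must be unique.
all_unique = len(set(rows)) == len(rows)

# Python tuples compare lexicographically, so comparing against the sorted
# list checks whether the rows are in lexicographic order.
is_sorted = rows == sorted(rows)

print(shape, all_unique, is_sorted)
```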

It is possible to create a view inside a Labels, selecting a subset of columns/dimensions:

>>> # single dimension
>>> view = labels.view("atom")
>>> view.names
['atom']
>>> print(view.values)
[[1]
 [2]
 [5]]
>>> # multiple dimensions
>>> view = labels.view(["atom", "system"])
>>> view.names
['atom', 'system']
>>> print(view.values)
[[1 0]
 [2 0]
 [5 0]]
>>> view.is_view()
True
>>> # we can convert a view back to a full, owned Labels
>>> owned_labels = view.to_owned()
>>> owned_labels.is_view()
False

One can also iterate over labels entries, or directly index the Labels to get a specific entry:

>>> entry = labels[0]  # or labels.entry(0)
>>> entry.names
['system', 'atom', 'center_type']
>>> print(entry.values)
[0 1 8]
>>> for entry in labels:
...     print(entry)
LabelsEntry(system=0, atom=1, center_type=8)
LabelsEntry(system=0, atom=2, center_type=1)
LabelsEntry(system=0, atom=5, center_type=1)

Or get all the values associated with a given dimension/column name:

>>> print(labels.column("atom"))
[1 2 5]
>>> print(labels["atom"])  # alternative syntax for the above
[1 2 5]

Labels can be checked for equality:

>>> owned_labels == labels
False
>>> labels == labels
True

Finally, it is possible to check if a value is inside (non-view) labels, and get the corresponding position:

>>> labels.position([0, 2, 1])
1
>>> print(labels.position([0, 2, 4]))
None
>>> (0, 2, 4) in labels
False
>>> labels[2] in labels
True

Parameters:

names (str | Sequence[str]) – names of the dimensions in the new labels. A single string is transformed into a list with one element, i.e. names="a" is the same as names=["a"].
values (ndarray) – values of the labels, this needs to be a 2-dimensional array of integers.
assume_unique (bool) – skip uniqueness checks inside metatensor. This should only be set to True if you can ensure that label entries are already unique, either by construction or because you checked.
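
Before passing assume_unique=True, one way to confirm that the entries really are unique is to check the rows up front. The helper below is a hypothetical plain-Python sketch (`has_unique_rows` is not part of metatensor) that works on nested lists or tuples of integers.

```python
def has_unique_rows(values):
    """Return True if no row appears more than once in `values`."""
    seen = set()
    for row in values:
        key = tuple(row)  # rows must be hashable to be stored in a set
        if key in seen:
            return False
        seen.add(key)
    return True


unique_rows = [[0, 1, 8], [0, 2, 1], [0, 5, 1]]
duplicated_rows = [[0, 1, 8], [0, 1, 8]]

print(has_unique_rows(unique_rows))
print(has_unique_rows(duplicated_rows))
```

Only when such a check succeeds (or uniqueness is guaranteed by construction) is it safe to skip the checks done inside metatensor.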

static single() → Labels

Create Labels to use when there is no relevant metadata and only one entry in the corresponding dimension (e.g. keys when a tensor map contains a single block).

Return type:: Labels

static empty(names: str | Sequence[str]) → Labels

Create Labels with given names but no values.

Parameters:: names (str | Sequence[str]) – names of the dimensions in the new labels. A single string is transformed into a list with one element, i.e. names="a" is the same as names=["a"].
Return type:: Labels

static range(name: str, end: int) → Labels

Create Labels with a single dimension using the given name and values in the [0, end) range.

Parameters:

name (str) – name of the single dimension in the new labels.
end (int) – end of the range for labels

Return type:

Labels

>>> from metatensor import Labels
>>> labels = Labels.range("dummy", 7)
>>> labels.names
['dummy']
>>> print(labels.values)
[[0]
 [1]
 [2]
 [3]
 [4]
 [5]
 [6]]

__len__() → int

number of entries in these labels

Return type:: int

__getitem__(dimension: str) → ndarray

__getitem__(index: int) → LabelsEntry

When indexing with a string, get the values for the corresponding dimension as a 1-dimensional array (i.e. Labels.column()).

When indexing with an integer, get the corresponding row/labels entry (i.e. Labels.entry()).

See also Labels.view() to extract the values associated with multiple columns/dimensions.

__contains__(entry: LabelsEntry | Sequence[int]) → bool

check if these Labels contain the given entry

Parameters:: entry (LabelsEntry | Sequence[int])
Return type:: bool

__eq__(other: Labels) → bool

check if two sets of labels are equal (same dimension names and same values)

Parameters:: other (Labels)
Return type:: bool

__ne__(other: Labels) → bool

check if two sets of labels are not equal (different dimension names or different values)

Parameters:: other (Labels)
Return type:: bool

static load(file: str | Path | BinaryIO) → Labels

Load a serialized Labels from a file, calling metatensor.load_labels().

Parameters:: file (str | Path | BinaryIO) – file path or file object to load from
Return type:: Labels

static load_buffer(buffer: bytes | bytearray | memoryview) → Labels

Load a serialized Labels from a buffer, calling metatensor.io.load_labels_buffer().

Parameters:: buffer (bytes | bytearray | memoryview) – in-memory buffer containing the data
Return type:: Labels

save(file: str | Path | BinaryIO)

Save these Labels to a file, calling metatensor.save().

Parameters:: file (str | Path | BinaryIO) – file path or file object to save to

save_buffer() → memoryview

Save these Labels to an in-memory buffer, calling metatensor.io.save_buffer().

Return type:: memoryview

property names: List[str]: names of the dimensions for these Labels

property values: ndarray: values associated with each dimension of the Labels, stored as a 2-dimensional array of 32-bit integers

append(name: str, values: ndarray) → Labels

Append a new dimension to the end of the Labels.

Parameters:

name (str) – name of the new dimension
values (ndarray) – 1D array of values for the new dimension

Return type:

Labels

>>> import numpy as np
>>> from metatensor import Labels
>>> label = Labels("foo", np.array([[42]]))
>>> label
Labels(
    foo
    42
)
>>> label.append(name="bar", values=np.array([10]))
Labels(
    foo  bar
    42   10
)

insert(index: int, name: str, values: ndarray) → Labels

Insert a new dimension before index in the Labels.

Parameters:

index (int) – index before which the new dimension is inserted
name (str) – name of the new dimension
values (ndarray) – 1D array of values for the new dimension

Return type:

Labels

>>> import numpy as np
>>> from metatensor import Labels
>>> label = Labels("foo", np.array([[42]]))
>>> label
Labels(
    foo
    42
)
>>> label.insert(0, name="bar", values=np.array([10]))
Labels(
    bar  foo
    10   42
)
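
The effect of insert (and of append, which behaves like inserting at the last position) can be sketched in plain Python. `insert_dimension` is a hypothetical helper working on a list of names and a list of row tuples, not metatensor's implementation; the length check is an assumption that one value is expected per entry.

```python
def insert_dimension(names, rows, index, name, values):
    """Return new (names, rows) with a dimension inserted before `index`."""
    if len(values) != len(rows):
        # Assumption: one value per existing entry is required.
        raise ValueError("expected one value per entry")

    new_names = names[:index] + [name] + names[index:]
    new_rows = [
        row[:index] + (value,) + row[index:] for row, value in zip(rows, values)
    ]
    return new_names, new_rows


names, rows = ["foo"], [(42,)]

# Same layout as label.insert(0, name="bar", values=...) in the example above.
inserted = insert_dimension(names, rows, 0, "bar", [10])

# Inserting at len(names) places the new dimension last, like append.
appended = insert_dimension(names, rows, len(names), "bar", [10])

print(inserted)
print(appended)
```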

permute(dimensions_indexes: List[int]) → Labels

Permute dimensions according to dimensions_indexes in the Labels.

Parameters:

dimensions_indexes (List[int]) – desired ordering of the dimensions

Raises:

ValueError – if the length of dimensions_indexes does not match the number of dimensions in the Labels
ValueError – if duplicate values are present in dimensions_indexes

Return type:

Labels

>>> import numpy as np
>>> from metatensor import Labels
>>> label = Labels(["foo", "bar", "baz"], np.array([[42, 10, 3]]))
>>> label
Labels(
    foo  bar  baz
    42   10    3
)
>>> label.permute([2, 0, 1])
Labels(
    baz  foo  bar
     3   42   10
)
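
A plain-Python sketch of the reordering, including the two error conditions listed above, may help clarify what dimensions_indexes means: position i of the result holds the dimension that was at dimensions_indexes[i]. `permute_dimensions` is a hypothetical helper, not the library's code.

```python
def permute_dimensions(names, rows, dimensions_indexes):
    """Reorder the columns of `rows` (and `names`) following `dimensions_indexes`."""
    if len(dimensions_indexes) != len(names):
        raise ValueError("dimensions_indexes must list every dimension once")
    if len(set(dimensions_indexes)) != len(dimensions_indexes):
        raise ValueError("dimensions_indexes contains duplicate values")

    new_names = [names[i] for i in dimensions_indexes]
    new_rows = [tuple(row[i] for i in dimensions_indexes) for row in rows]
    return new_names, new_rows


permuted = permute_dimensions(["foo", "bar", "baz"], [(42, 10, 3)], [2, 0, 1])
print(permuted)
```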

remove(name: str) → Labels

Remove name from the dimensions of the Labels.

Removal can only be performed if the entries of the resulting Labels are still unique.

Parameters:: name (str) – name to be removed
Raises:: ValueError – if the name is not present.
Return type:: Labels

>>> import numpy as np
>>> from metatensor import Labels
>>> label = Labels(["foo", "bar"], np.array([[42, 10]]))
>>> label
Labels(
    foo  bar
    42   10
)
>>> label.remove(name="bar")
Labels(
    foo
    42
)

If the new Labels is not unique an error is raised.

>>> from metatensor import MetatensorError
>>> label = Labels(["foo", "bar"], np.array([[42, 10], [42, 11]]))
>>> label
Labels(
    foo  bar
    42   10
    42   11
)
>>> try:
...     label.remove(name="bar")
... except MetatensorError as e:
...     print(e)
invalid parameter: can not have the same label entry multiple times: [42] is already present
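
Why removal can fail is easy to see in a plain-Python sketch: dropping a column may make two previously distinct rows identical. `remove_dimension` below is a hypothetical helper, and it raises ValueError in both failure cases for simplicity, whereas the example above shows metatensor reporting duplicates through MetatensorError.

```python
def remove_dimension(names, rows, name):
    """Return new (names, rows) without the column called `name`."""
    if name not in names:
        raise ValueError(f"'{name}' is not a dimension of these labels")

    column = names.index(name)
    new_names = names[:column] + names[column + 1:]
    new_rows = [row[:column] + row[column + 1:] for row in rows]

    # Dropping a column can collapse distinct rows into duplicates.
    if len(set(new_rows)) != len(new_rows):
        raise ValueError("removing this dimension would create duplicate entries")
    return new_names, new_rows


removed = remove_dimension(["foo", "bar"], [(42, 10)], "bar")
print(removed)
```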

rename(old: str, new: str) → Labels

Rename the old dimension to new in the Labels.

Parameters:

old (str) – name to be replaced
new (str) – name after the replacement

Raises:

ValueError – if old is not present.

Return type:

Labels

>>> import numpy as np
>>> from metatensor import Labels
>>> label = Labels("foo", np.array([[42]]))
>>> label
Labels(
    foo
    42
)
>>> label.rename("foo", "bar")
Labels(
    bar
    42
)

to(device, non_blocking=False) → Labels

Move the values for these Labels to the given device. non_blocking is ignored.

In the Python version of metatensor, this returns the original labels without change. This function is defined for compatibility with the TorchScript version of metatensor.

Return type:: Labels

property device: str

Get the device of these Labels.

This exists for compatibility with the TorchScript API, and always returns "cpu" when called.

position(entry: LabelsEntry | Sequence[int]) → int | None

Get the position of the given entry in this set of Labels, or None if the entry is not present in the labels.

Parameters:: entry (LabelsEntry | Sequence[int])
Return type:: int | None
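
The lookup semantics can be sketched with a dictionary from entry to row index. This is a plain-Python illustration with hypothetical names (`rows`, `positions`, `position`, `contains`); it says nothing about how metatensor stores or searches entries internally.

```python
rows = [(0, 1, 8), (0, 2, 1), (0, 5, 1)]

# Map each (unique) entry to its row index.
positions = {row: index for index, row in enumerate(rows)}


def position(entry):
    """Return the index of `entry` in `rows`, or None if it is absent."""
    return positions.get(tuple(entry))


def contains(entry):
    """Membership test expressed through `position`."""
    return position(entry) is not None


print(position([0, 2, 1]))
print(position([0, 2, 4]))
print(contains((0, 5, 1)))
```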

difference(other: Labels) → Labels

Take the set difference of these Labels with other.

If you want to know where entries in self end up in the difference, you can use Labels.difference_and_mapping().

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(
...     names=["a", "b"], values=np.array([[0, 1], [1, 3], [0, 3], [2, 2]])
... )
>>> second = Labels(
...     names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2], [2, 1]])
... )
>>> first.difference(second)
Labels(
    a  b
    0  1
    2  2
)

Parameters:: other (Labels)
Return type:: Labels

difference_and_mapping(other: Labels) → Tuple[Labels, ndarray]

Take the set difference of these Labels with other.

This function also returns the position in the difference where each entry of the input Labels ended up.

Returns:: Tuple containing the difference, and a numpy.ndarray containing the position in the difference of the entries from self. As in the example below, entries of self that are not part of the difference are mapped to -1.
Parameters:: other (Labels)
Return type:: Tuple[Labels, ndarray]

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(
...     names=["a", "b"], values=np.array([[0, 1], [1, 3], [0, 3], [2, 2]])
... )
>>> second = Labels(
...     names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2], [2, 1]])
... )
>>> difference, mapping_1 = first.difference_and_mapping(second)
>>> difference
Labels(
    a  b
    0  1
    2  2
)
>>> print(mapping_1)
[ 0 -1 -1  1]
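
To make the meaning of the mapping concrete, here is a plain-Python reference sketch that reproduces the output above. `sketch_difference_and_mapping` is a hypothetical function operating on lists of tuples; keeping entries in the order of the first input is an assumption chosen to match this example.

```python
def sketch_difference_and_mapping(first, second):
    """Entries of `first` not in `second`, plus where each entry of `first` went."""
    in_second = set(second)
    difference = []
    mapping = []
    for row in first:
        if row in in_second:
            mapping.append(-1)  # dropped from the difference
        else:
            mapping.append(len(difference))  # index it will have in the output
            difference.append(row)
    return difference, mapping


first = [(0, 1), (1, 3), (0, 3), (2, 2)]
second = [(0, 3), (1, 3), (1, 2), (2, 1)]
difference, mapping_1 = sketch_difference_and_mapping(first, second)
print(difference, mapping_1)
```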

union(other: Labels) → Labels

Take the union of these Labels with other.

If you want to know where entries in self and other end up in the union, you can use Labels.union_and_mapping().

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(names=["a", "b"], values=np.array([[0, 1], [1, 2], [0, 3]]))
>>> second = Labels(names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2]]))
>>> first.union(second)
Labels(
    a  b
    0  1
    1  2
    0  3
    1  3
)

Parameters:: other (Labels)
Return type:: Labels

union_and_mapping(other: Labels) → Tuple[Labels, ndarray, ndarray]

Take the union of these Labels with other.

This function also returns the position in the union where each entry of the input Labels ended up.

Returns:: Tuple containing the union, a numpy.ndarray containing the position in the union of the entries from self, and a numpy.ndarray containing the position in the union of the entries from other.
Parameters:: other (Labels)
Return type:: Tuple[Labels, ndarray, ndarray]

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(names=["a", "b"], values=np.array([[0, 1], [1, 2], [0, 3]]))
>>> second = Labels(names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2]]))
>>> union, mapping_1, mapping_2 = first.union_and_mapping(second)
>>> union
Labels(
    a  b
    0  1
    1  2
    0  3
    1  3
)
>>> print(mapping_1)
[0 1 2]
>>> print(mapping_2)
[2 3 1]
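
The two mappings can be read as "row i of the input is row mapping[i] of the union". The plain-Python sketch below reproduces the example under that reading; `sketch_union_and_mapping` is hypothetical, and the output order (entries of the first input, then new entries of the second) is assumed from the example rather than taken from the library's specification.

```python
def sketch_union_and_mapping(first, second):
    """Union of two lists of unique tuples, plus the position of every input row."""
    positions = {}
    union = []

    def place(row):
        # Add the row to the union the first time it is seen.
        if row not in positions:
            positions[row] = len(union)
            union.append(row)
        return positions[row]

    mapping_1 = [place(row) for row in first]
    mapping_2 = [place(row) for row in second]
    return union, mapping_1, mapping_2


first = [(0, 1), (1, 2), (0, 3)]
second = [(0, 3), (1, 3), (1, 2)]
union, mapping_1, mapping_2 = sketch_union_and_mapping(first, second)
print(union, mapping_1, mapping_2)
```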

intersection(other: Labels) → Labels

Take the intersection of these Labels with other.

If you want to know where entries in self and other end up in the intersection, you can use Labels.intersection_and_mapping().

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(names=["a", "b"], values=np.array([[0, 1], [1, 2], [0, 3]]))
>>> second = Labels(names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2]]))
>>> first.intersection(second)
Labels(
    a  b
    1  2
    0  3
)

Parameters:: other (Labels)
Return type:: Labels

intersection_and_mapping(other: Labels) → Tuple[Labels, ndarray, ndarray]

Take the intersection of these Labels with other.

This function also returns the position in the intersection where each entry of the input Labels ended up.

Returns:: Tuple containing the intersection, a numpy.ndarray containing the position in the intersection of the entries from self, and a numpy.ndarray containing the position in the intersection of the entries from other. If entries in self or other are not used in the output, the mapping for them is set to -1.
Parameters:: other (Labels)
Return type:: Tuple[Labels, ndarray, ndarray]

>>> import numpy as np
>>> from metatensor import Labels
>>> first = Labels(names=["a", "b"], values=np.array([[0, 1], [1, 2], [0, 3]]))
>>> second = Labels(names=["a", "b"], values=np.array([[0, 3], [1, 3], [1, 2]]))
>>> intersection, mapping_1, mapping_2 = first.intersection_and_mapping(second)
>>> intersection
Labels(
    a  b
    1  2
    0  3
)
>>> print(mapping_1)
[-1  0  1]
>>> print(mapping_2)
[ 1 -1  0]
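
A plain-Python reference sketch shows how the -1 markers arise for entries present in only one input. `sketch_intersection_and_mapping` is a hypothetical function on lists of tuples; it keeps shared entries in the order of the first input, which is an assumption that happens to match the example above.

```python
def sketch_intersection_and_mapping(first, second):
    """Entries present in both inputs, plus the position of every input row."""
    in_second = set(second)
    positions = {}
    intersection = []

    mapping_1 = []
    for row in first:
        if row in in_second:
            positions[row] = len(intersection)
            intersection.append(row)
            mapping_1.append(positions[row])
        else:
            mapping_1.append(-1)  # not used in the output

    # Rows of `second` missing from the intersection also map to -1.
    mapping_2 = [positions.get(row, -1) for row in second]
    return intersection, mapping_1, mapping_2


first = [(0, 1), (1, 2), (0, 3)]
second = [(0, 3), (1, 3), (1, 2)]
intersection, mapping_1, mapping_2 = sketch_intersection_and_mapping(first, second)
print(intersection, mapping_1, mapping_2)
```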

select(selection: Labels) → ndarray

Select entries in these Labels that match the selection.

The selection’s names must be a subset of the names of these labels.

All entries in these Labels that match one of the entries in the selection on all of the selection’s dimensions will be picked. Any entry in the selection that is not in these Labels will be ignored.

>>> import numpy as np
>>> from metatensor import Labels
>>> labels = Labels(
...     names=["a", "b"],
...     values=np.array([[0, 1], [1, 2], [0, 3], [1, 1], [2, 4]]),
... )
>>> selection = Labels(names=["a"], values=np.array([[0], [2], [5]]))
>>> print(labels.select(selection))
[0 2 4]

Parameters:: selection (Labels) – description of the entries to select
Returns:: 1-dimensional ndarray containing the integer indices of selected entries
Return type:: ndarray
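
The matching rule can be sketched in plain Python: project every entry onto the selection's dimensions and keep the indices whose projection appears in the selection. `sketch_select` is a hypothetical helper on lists of names and tuples, and returning indices in the order of these labels is an assumption of the sketch.

```python
def sketch_select(names, rows, selection_names, selection_rows):
    """Indices of `rows` whose projection on `selection_names` is selected."""
    # names.index raises ValueError if a selection name is not a dimension,
    # reflecting the requirement that the selection's names be a subset.
    columns = [names.index(name) for name in selection_names]
    wanted = set(selection_rows)
    return [
        index
        for index, row in enumerate(rows)
        if tuple(row[column] for column in columns) in wanted
    ]


names = ["a", "b"]
rows = [(0, 1), (1, 2), (0, 3), (1, 1), (2, 4)]

# The selection entry (5,) matches nothing and is silently ignored.
selected = sketch_select(names, rows, ["a"], [(0,), (2,), (5,)])
print(selected)
```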

print(max_entries: int, indent: int = 0) → str

print these Labels to a string

Parameters:

max_entries (int) – how many entries to print, use -1 to print everything
indent (int) – indent the output by indent spaces

Return type:

str

entry(index: int) → LabelsEntry

Get a single entry/row in these labels.