TensorMap

class metatensor.TensorMap(keys: Labels, blocks: Sequence[TensorBlock])

TensorMap is the main user-facing class of this library. It can store any kind of data used in atomistic machine learning, and is organized like a Python dict: each key is associated with one block of data.

A tensor map contains a list of TensorBlock, each one associated with a key. Blocks can either be accessed one by one with the TensorMap.block() function, or by iterating over the tensor map itself:

for block in tensor:
    ...

The corresponding keys can be included in the loop by using the items() method of a TensorMap:

for key, block in tensor.items():
    ...

A tensor map provides functions to move some of these keys to the samples or properties labels of the blocks, moving from a sparse representation of the data to a dense one.

Parameters:
  • keys (Labels) – keys associated with each block

  • blocks (Sequence[TensorBlock]) – set of blocks containing the actual data

__getitem__(selection) TensorBlock[source]

This is equivalent to self.block(selection)

Return type:

TensorBlock

block(selection: None | int | Labels | LabelsEntry | Dict[str, int] = None, **kwargs) TensorBlock[source]

Get the single block in this TensorMap matching the selection.

When selection is an int, this is equivalent to TensorMap.block_by_id().

When selection is a Labels, LabelsEntry or Dict[str, int], this function finds the key in this TensorMap with the same values as selection for the dimensions/names contained in the selection (which can be a subset of the dimensions of the keys), and returns the corresponding block. This performs a lookup in the keys, so it is slower than TensorMap.block_by_id(), but it is more convenient when the position of the block is not known.

If selection is None, the selection can be passed as keyword arguments, which will be converted to a Dict[str, int].

Parameters:

selection (None | int | Labels | LabelsEntry | Dict[str, int]) – description of the block to extract

Return type:

TensorBlock

>>> import numpy as np
>>> from metatensor import TensorMap, TensorBlock, Labels
>>> keys = Labels(["key_1", "key_2"], np.array([[0, 0], [6, 8]]))
>>> block_1 = TensorBlock(
...     values=np.full((3, 5), 1.0),
...     samples=Labels.range("sample", 3),
...     components=[],
...     properties=Labels.range("property", 5),
... )
>>> block_2 = TensorBlock(
...     values=np.full((5, 3), 2.0),
...     samples=Labels.range("sample", 5),
...     components=[],
...     properties=Labels.range("property", 3),
... )
>>> tensor = TensorMap(keys, [block_1, block_2])
>>> # numeric index selection, this gives a block by its position
>>> block = tensor.block(0)
>>> block
TensorBlock
    samples (3): ['sample']
    components (): []
    properties (5): ['property']
    gradients: None
>>> # This is the first block
>>> print(block.values.mean())
1.0
>>> # use a single key entry (i.e. LabelsEntry) for the selection
>>> print(tensor.block(tensor.keys[0]).values.mean())
1.0
>>> # Labels with a single entry selection
>>> labels = Labels(names=["key_1", "key_2"], values=np.array([[6, 8]]))
>>> print(tensor.block(labels).values.mean())
2.0
>>> # keyword arguments selection
>>> print(tensor.block(key_1=0, key_2=0).values.mean())
1.0
>>> # dictionary selection
>>> print(tensor.block({"key_1": 6, "key_2": 8}).values.mean())
2.0

block_by_id(index: int) TensorBlock[source]

Get the block at index in this TensorMap.

Parameters:

index (int) – index of the block to retrieve

Return type:

TensorBlock

blocks(selection: None | Sequence[int] | int | Labels | LabelsEntry | Dict[str, int] = None, **kwargs) List[TensorBlock][source]

Get the blocks in this TensorMap matching the selection.

When selection is None (the default), all blocks are returned.

When selection is an int, this is equivalent to TensorMap.block_by_id(); and for a Sequence[int] this is equivalent to TensorMap.blocks_by_id().

When selection is a Labels, LabelsEntry or Dict[str, int], this function finds the keys in this TensorMap with the same values as selection for the dimensions/names contained in the selection (which can be a subset of the dimensions of the keys), and returns the corresponding blocks. This performs a lookup in the keys, so it is slower than TensorMap.blocks_by_id(), but it is more convenient when the position of the blocks is not known.

If selection is None, the selection can be passed as keyword arguments, which will be converted to a Dict[str, int].

Parameters:

selection (None | Sequence[int] | int | Labels | LabelsEntry | Dict[str, int]) – description of the blocks to extract

Return type:

List[TensorBlock]
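The key lookup described for blocks() can be pictured with a small plain-Python model. This is only a sketch of the matching rule (compare the dimensions named in the selection, ignore the others), not metatensor's implementation; the helper name matching_positions and the example keys are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def matching_positions(
    names: Sequence[str],
    keys: Sequence[Tuple[int, ...]],
    selection: Dict[str, int],
) -> List[int]:
    """Return the positions of all keys that agree with ``selection``.

    ``selection`` may name only a subset of the key dimensions; dimensions
    it does not mention are ignored when comparing. This is a hypothetical
    helper used for illustration, not part of metatensor.
    """
    unknown = set(selection) - set(names)
    if unknown:
        raise ValueError(f"unknown key dimensions: {sorted(unknown)}")

    # columns of the keys that the selection constrains
    columns = [list(names).index(name) for name in selection]
    wanted = tuple(selection.values())
    return [
        position
        for position, key in enumerate(keys)
        if tuple(key[column] for column in columns) == wanted
    ]


names = ["key_1", "key_2"]
keys = [(0, 0), (0, 8), (6, 8)]

# a partial selection can match several keys
print(matching_positions(names, keys, {"key_2": 8}))  # [1, 2]
# a full selection matches at most one key
print(matching_positions(names, keys, {"key_1": 6, "key_2": 8}))  # [2]
# an empty selection matches every key
print(matching_positions(names, keys, {}))  # [0, 1, 2]
```

Once the positions are known, fetching the blocks is a plain positional access, which is why the by-id functions avoid the cost of the lookup.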

blocks_by_id(indices: Sequence[int]) List[TensorBlock][source]

Get the blocks with the given indices in this TensorMap.

Parameters:

indices (Sequence[int]) – indices of the blocks to retrieve

Return type:

List[TensorBlock]

property component_names: List[str]

names of the components dimensions for all blocks in this TensorMap

components_to_properties(dimensions: str | Sequence[str]) TensorMap[source]

Move the given dimensions from the component labels to the property labels for each block.

Parameters:

dimensions (str | Sequence[str]) – name of the component dimensions to move to the properties

Return type:

TensorMap
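To make the effect of moving components to properties concrete, here is a plain-Python model for a single block with one component dimension. It assumes the moved component becomes the leading (slowest-varying) part of each new property entry; that ordering, the function below and the example values are illustrative assumptions, not taken from this page.

```python
from typing import List, Sequence, Tuple


def components_to_properties(
    values: List[List[List[float]]],
    components: Sequence[int],
    properties: Sequence[int],
) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """Flatten (samples, components, properties) data to (samples, new_properties).

    Assumption for this sketch: new property entries are (component, property)
    pairs, with the component varying slowest.
    """
    new_properties = [(c, p) for c in components for p in properties]
    flat = [
        [entry for component_row in sample for entry in component_row]
        for sample in values
    ]
    return new_properties, flat


components = [0, 1, 2]  # e.g. three Cartesian directions
properties = [10, 20]
# 2 samples, 3 components, 2 properties
values = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]],
]

new_properties, flat = components_to_properties(values, components, properties)
print(new_properties)  # [(0, 10), (0, 20), (1, 10), (1, 20), (2, 10), (2, 20)]
print(flat[0])  # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```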

copy() TensorMap[source]

Get a deep copy of this TensorMap, including all the data and metadata

Return type:

TensorMap

property device: str | torch.device

get the device of all the arrays stored inside this TensorMap

property dtype: numpy.dtype | torch.dtype

get the dtype of all the arrays stored inside this TensorMap

get_info(key: str) str | None[source]

Get the info (i.e. global metadata) with the given key for this TensorMap.

Parameters:

key (str) – key of the info to retrieve

Returns:

value of the info, or None if the info does not exist

Return type:

str | None

info() Dict[str, str][source]

Get all the key/value info pairs stored in this TensorMap.

Return type:

Dict[str, str]

items()[source]

get an iterator over (key, block) pairs in this TensorMap

property keys: Labels

The set of keys labeling the blocks in this tensor map.

keys_to_properties(keys_to_move: str | Sequence[str] | Labels, *, fill_value=0.0, sort_samples=True) TensorMap[source]

Merge blocks along the properties direction, adding keys_to_move at the beginning of the properties labels dimensions.

This function will remove keys_to_move from the keys, and find all blocks with the same remaining keys values. Then it will merge these blocks along the properties direction (i.e. do a horizontal concatenation).

If keys_to_move is given as strings, then the new property labels will only contain entries from the existing blocks. For example, merging a block with key a=0 and properties p=1, 2 with a block with key a=2 and properties p=1, 3 will produce a block with properties a, p = (0, 1), (0, 2), (2, 1), (2, 3).
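The a=0 / a=2 example above can be reproduced with a plain-Python model of the merge. This is a sketch only: blocks are modeled as dictionaries, a single key dimension is moved, samples are kept sorted, and samples missing from a block are padded with fill_value. The function merge_along_properties and the numeric values are hypothetical.

```python
from typing import Dict, List

Block = Dict[str, list]


def merge_along_properties(blocks: Dict[int, Block], fill_value: float = 0.0) -> Block:
    """Horizontally concatenate blocks, prefixing properties with the key value.

    ``blocks`` maps the value of the moved key dimension to a block.
    This models the documented behavior; it is not metatensor's code.
    """
    # union of all samples, sorted (the sort_samples=True behavior)
    samples = sorted({s for block in blocks.values() for s in block["samples"]})
    properties: list = []
    values: List[List[float]] = [[] for _ in samples]

    for key_value, block in blocks.items():
        # the moved key goes at the beginning of each property entry
        properties.extend((key_value, p) for p in block["properties"])
        row_of = dict(zip(block["samples"], block["values"]))
        width = len(block["properties"])
        for i, sample in enumerate(samples):
            # pad samples that do not exist in this block
            values[i].extend(row_of.get(sample, [fill_value] * width))

    return {"samples": samples, "properties": properties, "values": values}


blocks = {
    0: {"samples": [0, 1], "properties": [1, 2], "values": [[1.0, 2.0], [3.0, 4.0]]},
    2: {"samples": [1, 2], "properties": [1, 3], "values": [[5.0, 6.0], [7.0, 8.0]]},
}

merged = merge_along_properties(blocks)
print(merged["properties"])  # [(0, 1), (0, 2), (2, 1), (2, 3)]
print(merged["values"])
# [[1.0, 2.0, 0.0, 0.0], [3.0, 4.0, 5.0, 6.0], [0.0, 0.0, 7.0, 8.0]]
```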

If keys_to_move is a set of Labels and it is empty (len(keys_to_move) == 0), the Labels.names will be used as if they were passed directly.

Finally, if keys_to_move is a non-empty set of Labels, the new properties labels will contain all of the entries of keys_to_move (regardless of the values taken by keys_to_move.names in the merged blocks’ keys) followed by the existing properties labels. For example, using a=2, 3 in keys_to_move, blocks with properties p=1, 2 will result in a, p = (2, 1), (2, 2), (3, 1), (3, 2). If there are no values (no block/missing sample) for a given property in the merged block, then the value will be set to the fill_value.

When using a non-empty Labels for keys_to_move, the properties labels of all the merged blocks must take the same values.
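The non-empty Labels case can be modeled in the same spirit. The sketch below assumes a single moved dimension a, blocks that share the same samples and the same properties, and pads requested entries that have no block with fill_value. The function merge_requested and its arguments are hypothetical names used for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def merge_requested(
    blocks: Dict[int, List[List[float]]],
    requested: Sequence[int],
    properties: Sequence[int],
    n_samples: int,
    fill_value: float = 0.0,
) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """Build the merged properties from ``requested`` instead of the existing keys.

    ``blocks`` maps a key value to the rows of the corresponding block.
    A model of the documented behavior, not metatensor's implementation.
    """
    # every requested entry appears, followed by the existing property values
    new_properties = [(a, p) for a in requested for p in properties]
    filler = [[fill_value] * len(properties) for _ in range(n_samples)]

    values: List[List[float]] = [[] for _ in range(n_samples)]
    for a in requested:
        # requested entries without a block are filled with fill_value
        rows = blocks.get(a, filler)
        for i in range(n_samples):
            values[i].extend(rows[i])
    return new_properties, values


# only a=2 has a block; a=3 is requested but absent
blocks = {2: [[1.0, 2.0], [3.0, 4.0]]}
new_properties, values = merge_requested(
    blocks, requested=[2, 3], properties=[1, 2], n_samples=2
)
print(new_properties)  # [(2, 1), (2, 2), (3, 1), (3, 2)]
print(values)  # [[1.0, 2.0, 0.0, 0.0], [3.0, 4.0, 0.0, 0.0]]
```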

The order of the samples in the merged blocks is controlled by sort_samples. If sort_samples is True, samples are re-ordered to keep them lexicographically sorted. Otherwise they are kept in the order in which they appear in the blocks.

Parameters:
  • keys_to_move (str | Sequence[str] | Labels) – description of the keys to move

  • fill_value – scalar value used to fill missing entries in the merged blocks. Defaults to 0.0.

  • sort_samples – whether to sort the merged samples or keep them in the order in which they appear in the original blocks

Returns:

a new TensorMap with merged blocks

Return type:

TensorMap

Note

The fill_value also applies to gradient blocks. If using NaN, gradient arrays for missing entries will also contain NaN.

keys_to_samples(keys_to_move: str | Sequence[str], *, fill_value=0.0, sort_samples=True) TensorMap[source]

Merge blocks along the samples axis, adding keys_to_move to the end of the samples labels dimensions.

This function will remove keys_to_move from the keys, and find all blocks with the same remaining keys values. It will then merge these blocks along the samples direction (i.e. do a vertical concatenation), adding keys_to_move to the end of the samples labels dimensions. The values taken by keys_to_move in the new samples labels will be the values of these dimensions in the merged blocks’ keys.

If keys_to_move is a set of Labels, it must be empty (keys_to_move.values.shape[0] == 0), and only the Labels.names will be used.

The order of the samples is controlled by sort_samples. If sort_samples is True, samples are re-ordered to keep them lexicographically sorted. Otherwise they are kept in the order in which they appear in the blocks.

If the blocks to merge have different property labels, the resulting block will have the union of all property labels, and values will be padded with the fill_value.
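The vertical merge can be sketched in plain Python as well. The model below moves a single key dimension to the end of the sample labels, pads properties missing from a block with fill_value, and optionally sorts the samples. Returning the union of properties in sorted order is an assumption of this sketch, and merge_along_samples is a hypothetical name.

```python
from typing import Dict, List

Block = Dict[str, list]


def merge_along_samples(
    blocks: Dict[int, Block],
    fill_value: float = 0.0,
    sort_samples: bool = True,
) -> Block:
    """Vertically concatenate blocks, appending the key value to each sample.

    ``blocks`` maps the value of the moved key dimension to a block.
    A model of the documented behavior, not metatensor's implementation.
    """
    # union of the property values of all blocks (sorted: an assumption)
    properties = sorted({p for block in blocks.values() for p in block["properties"]})

    rows = []
    for key_value, block in blocks.items():
        column_of = {p: j for j, p in enumerate(block["properties"])}
        for sample, row in zip(block["samples"], block["values"]):
            padded = [
                row[column_of[p]] if p in column_of else fill_value
                for p in properties
            ]
            # the moved key is added at the end of the sample entry
            rows.append(((sample, key_value), padded))

    if sort_samples:
        rows.sort(key=lambda item: item[0])  # lexicographic order on samples

    return {
        "samples": [sample for sample, _ in rows],
        "properties": properties,
        "values": [row for _, row in rows],
    }


blocks = {
    0: {"samples": [0, 2], "properties": [1, 2], "values": [[1.0, 2.0], [3.0, 4.0]]},
    1: {"samples": [1], "properties": [2, 3], "values": [[5.0, 6.0]]},
}

merged = merge_along_samples(blocks)
print(merged["samples"])  # [(0, 0), (1, 1), (2, 0)]
print(merged["values"])  # [[1.0, 2.0, 0.0], [0.0, 5.0, 6.0], [3.0, 4.0, 0.0]]

unsorted = merge_along_samples(blocks, sort_samples=False)
print(unsorted["samples"])  # [(0, 0), (2, 0), (1, 1)]
```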

Parameters:
  • keys_to_move (str | Sequence[str]) – description of the keys to move

  • fill_value – scalar value used to fill missing entries in the merged blocks. Defaults to 0.0.

  • sort_samples – whether to sort the merged samples or keep them in the order in which they appear in the original blocks

Returns:

a new TensorMap with merged blocks

Return type:

TensorMap

Note

The fill_value also applies to gradient blocks. If using NaN, gradient arrays for missing entries will also contain NaN.

static load(file: str | Path | BinaryIO, use_numpy=False) TensorMap[source]

Load a serialized TensorMap from a file or a buffer, calling metatensor.load().

Parameters:
  • file (str | Path | BinaryIO) – file path or file object to load from

  • use_numpy – whether to use the numpy loader instead of metatensor’s. See metatensor.load() for more information.

Return type:

TensorMap

static load_buffer(buffer: bytes | bytearray | memoryview, use_numpy=False) TensorMap[source]

Load a serialized TensorMap from a buffer, calling metatensor.io.load_buffer().

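Parameters:
  • buffer (bytes | bytearray | memoryview) – in-memory buffer containing a serialized TensorMap

  • use_numpy – whether to use the numpy loader instead of metatensor’s. See metatensor.load() for more information.

Return type: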

TensorMap

print(max_keys: int) str[source]

Print this TensorMap to a string, including at most max_keys in the output.

Parameters:

max_keys (int) – how many keys to include in the output. Use -1 to include all keys.

Return type:

str

property property_names: List[str]

names of the properties dimensions for all blocks in this TensorMap

property sample_names: List[str]

names of the samples dimensions for all blocks in this TensorMap

save(file: str | Path | BinaryIO, use_numpy=False)[source]

Save this TensorMap to a file or a buffer, calling metatensor.save().

Parameters:
  • file (str | Path | BinaryIO) – file path or file object to save to

  • use_numpy – whether to use the numpy serializer instead of metatensor’s. See metatensor.save() for more information.

save_buffer(use_numpy=False) memoryview[source]

Save this TensorMap to an in-memory buffer, calling metatensor.io.save_buffer().

Parameters:

use_numpy – whether to use the numpy serializer instead of metatensor’s. See metatensor.save() for more information.

Return type:

memoryview

set_info(key: str, value: str)[source]

Set or update the info (i.e. global metadata) value associated with key for this TensorMap.

Parameters:
  • key (str) – key of the info

  • value (str) – value of the info

to(*args, **kwargs) TensorMap[source]

Move the keys and all the blocks in this TensorMap to the given dtype, device and arrays backend.

Parameters:
  • dtype – new dtype to use for all arrays. The dtype stays the same if this is set to None.

  • device – new device to use for all arrays. The device stays the same if this is set to None.

  • arrays (Optional[str]) – new backend to use for the arrays. This can be either "numpy", "torch" or None (keeps the existing backend); and must be given as a keyword argument (arrays="numpy").

  • non_blocking (bool) – If this is True and the TensorMap contains "torch" arrays, the function tries to move the data asynchronously. See torch.Tensor.to() for more information.

Return type:

TensorMap