TensorMap¶

class metatensor.TensorMap(keys: Labels, blocks: Sequence[TensorBlock])[source]¶

TensorMap is the main user-facing class of this library. It can store any kind of data used in atomistic machine learning, and behaves similarly to a Python dict.

A tensor map contains a list of TensorBlock, each one associated with a key. Blocks can either be accessed one by one with the TensorMap.block() function, or by iterating over the tensor map itself:

for block in tensor:
    ...

The corresponding keys can be included in the loop by using the items() method of a TensorMap:

for key, block in tensor.items():
    ...

A tensor map provides functions to move some of these keys to the samples or properties labels of the blocks, moving from a sparse representation of the data to a dense one.

Parameters:

keys (Labels) – keys associated with each block
blocks (Sequence[TensorBlock]) – set of blocks containing the actual data

copy() → TensorMap[source]¶

Get a deep copy of this TensorMap, including all the data and metadata

Return type: TensorMap

__getitem__(selection) → TensorBlock[source]¶

This is equivalent to self.block(selection)

Return type: TensorBlock

static load(file: str | Path | BinaryIO, use_numpy=False) → TensorMap[source]¶

Load a serialized TensorMap from a file or a buffer, calling metatensor.load().

Parameters:

file (str | Path | BinaryIO) – file path or file object to load from
use_numpy – whether to use the numpy loader or metatensor’s own. See metatensor.load() for more information.

Return type:

TensorMap

static load_buffer(buffer: bytes | bytearray | memoryview, use_numpy=False) → TensorMap[source]¶

Load a serialized TensorMap from a buffer, calling metatensor.io.load_buffer().

Parameters:

buffer (bytes | bytearray | memoryview) – in-memory buffer containing the data
use_numpy – whether to use the numpy loader or metatensor’s own. See metatensor.load() for more information.

Return type:

TensorMap

save(file: str | Path | BinaryIO, use_numpy=False)[source]¶

Save this TensorMap to a file or a buffer, calling metatensor.save().

Parameters:

file (str | Path | BinaryIO) – file path or file object to save to
use_numpy – whether to use the numpy serializer or metatensor’s own. See metatensor.save() for more information.

save_buffer(use_numpy=False) → memoryview[source]¶

Save this TensorMap to an in-memory buffer, calling metatensor.io.save_buffer().

Parameters: use_numpy – whether to use numpy serialization or metatensor’s own. See metatensor.save() for more information.
Return type: memoryview

property keys: Labels¶

The set of keys labeling the blocks in this tensor map.

block_by_id(index: int) → TensorBlock[source]¶

Get the block at index in this TensorMap.

Parameters: index (int) – index of the block to retrieve
Return type: TensorBlock

blocks_by_id(indices: Sequence[int]) → List[TensorBlock][source]¶

Get the blocks with the given indices in this TensorMap.

Parameters: indices (Sequence[int]) – indices of the blocks to retrieve
Return type: List[TensorBlock]

blocks_matching(selection: Labels) → List[int][source]¶

Get a (possibly empty) list of block indexes matching the selection.

This function finds all keys in this TensorMap with the same values as selection for the dimensions/names contained in the selection, and returns the corresponding indexes.

The selection should contain a single entry.

Parameters: selection (Labels)
Return type: List[int]
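
The matching rule can be pictured with a small pure-Python sketch. This is not metatensor's implementation: keys are modeled as plain dicts instead of Labels, and the helper name matching_indexes is hypothetical. Only the dimensions named in the selection are compared, so a partial selection can match several keys.

```python
from typing import Dict, List


def matching_indexes(keys: List[Dict[str, int]], selection: Dict[str, int]) -> List[int]:
    """Return the positions of all keys that agree with `selection`.

    Only the dimensions named in `selection` are compared; the other
    dimensions of each key are ignored.
    """
    return [
        index
        for index, key in enumerate(keys)
        if all(key[name] == value for name, value in selection.items())
    ]


# Plain-dict stand-ins for the keys of a tensor map with two key dimensions
keys = [
    {"key_1": 0, "key_2": 0},
    {"key_1": 6, "key_2": 8},
    {"key_1": 6, "key_2": 9},
]

partial = matching_indexes(keys, {"key_1": 6})  # one dimension only
full = matching_indexes(keys, {"key_1": 6, "key_2": 8})  # all dimensions
empty = matching_indexes(keys, {"key_1": 42})  # nothing matches
```

Here partial holds two positions, full holds one, and empty is an empty list, which mirrors the "possibly empty" wording above.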

block(selection: None | int | Labels | LabelsEntry | Dict[str, int] = None, **kwargs) → TensorBlock[source]¶

Get the single block in this TensorMap matching the selection.

When selection is an int, this is equivalent to TensorMap.block_by_id().

When selection is a Labels, it should only contain a single entry, which will be used for the selection.

When selection is a Dict[str, int], it is converted into a single LabelsEntry (the dict keys becoming the names and the dict values being joined together to form the LabelsEntry values), which is then used for the selection.

When selection is a LabelsEntry, this function finds the key in this TensorMap with the same values as selection for the dimensions/names contained in the selection, and returns the corresponding block.

If selection is None, the selection can be passed as keyword arguments, which will be converted to a Dict[str, int].

Parameters: selection (None | int | Labels | LabelsEntry | Dict[str, int]) – description of the block to extract
Return type: TensorBlock

>>> import numpy as np
>>> from metatensor import TensorMap, TensorBlock, Labels
>>> keys = Labels(["key_1", "key_2"], np.array([[0, 0], [6, 8]]))
>>> block_1 = TensorBlock(
...     values=np.full((3, 5), 1.0),
...     samples=Labels.range("sample", 3),
...     components=[],
...     properties=Labels.range("property", 5),
... )
>>> block_2 = TensorBlock(
...     values=np.full((5, 3), 2.0),
...     samples=Labels.range("sample", 5),
...     components=[],
...     properties=Labels.range("property", 3),
... )
>>> tensor = TensorMap(keys, [block_1, block_2])
>>> # numeric index selection, this gives a block by its position
>>> block = tensor.block(0)
>>> block
TensorBlock
    samples (3): ['sample']
    components (): []
    properties (5): ['property']
    gradients: None
>>> # This is the first block
>>> print(block.values.mean())
1.0
>>> # use a single key entry (i.e. LabelsEntry) for the selection
>>> print(tensor.block(tensor.keys[0]).values.mean())
1.0
>>> # Labels with a single entry selection
>>> labels = Labels(names=["key_1", "key_2"], values=np.array([[6, 8]]))
>>> print(tensor.block(labels).values.mean())
2.0
>>> # keyword arguments selection
>>> print(tensor.block(key_1=0, key_2=0).values.mean())
1.0
>>> # dictionary selection
>>> print(tensor.block({"key_1": 6, "key_2": 8}).values.mean())
2.0
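
The different selection forms all reduce to "find the one key that matches". The sketch below models that reduction in plain Python. It is not metatensor's code: keys are plain dicts, the helper select_single is hypothetical, and raising ValueError when the number of matches is not exactly one is an assumption of this sketch.

```python
from typing import Dict, List, Optional, Union


def select_single(
    keys: List[Dict[str, int]],
    selection: Optional[Union[int, Dict[str, int]]] = None,
    **kwargs: int,
) -> int:
    """Resolve a selection to the position of exactly one key."""
    if selection is None:
        # keyword arguments play the role of a Dict[str, int]
        selection = dict(kwargs)
    if isinstance(selection, int):
        # integer selection: the position of the block, like block_by_id()
        return selection
    matches = [
        index
        for index, key in enumerate(keys)
        if all(key[name] == value for name, value in selection.items())
    ]
    if len(matches) != 1:
        # assumption of this sketch: anything but one match is an error
        raise ValueError(f"expected exactly one match, got {len(matches)}")
    return matches[0]


# Same key values as in the example above
keys = [{"key_1": 0, "key_2": 0}, {"key_1": 6, "key_2": 8}]

by_position = select_single(keys, 0)
by_kwargs = select_single(keys, key_1=0, key_2=0)
by_dict = select_single(keys, {"key_1": 6, "key_2": 8})
```

The keyword and dictionary forms are interchangeable in this model, which is the equivalence the example above relies on.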

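blocks(selection: None | Sequence[int] | int | Labels | LabelsEntry | Dict[str, int] = None, **kwargs) → List[TensorBlock][source]¶

Get the blocks in this TensorMap matching the selection.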

When selection is None (the default), all blocks are returned.

When selection is an int, this is equivalent to TensorMap.block_by_id(); and for a List[int] this is equivalent to TensorMap.blocks_by_id().

When selection is a Labels, it should only contain a single entry, which will be used for the selection.

When selection is a LabelsEntry, this function finds all keys in this TensorMap with the same values as selection for the dimensions/names contained in the selection, and returns the corresponding blocks.

If selection is None, the selection can be passed as keyword arguments, which will be converted to a Dict[str, int].

Parameters: selection (None | Sequence[int] | int | Labels | LabelsEntry | Dict[str, int]) – description of the blocks to extract
Return type: List[TensorBlock]

items()[source]¶

Get an iterator over (key, block) pairs in this TensorMap.

keys_to_samples(keys_to_move: str | Sequence[str], *, sort_samples=True) → TensorMap[source]¶

Merge blocks along the samples axis, adding keys_to_move to the end of the samples labels dimensions.

This function will remove keys_to_move from the keys, and find all blocks with the same remaining keys values. It will then merge these blocks along the samples direction (i.e. do a vertical concatenation), adding keys_to_move to the end of the samples labels dimensions. The values taken by keys_to_move in the new samples labels will be the values of these dimensions in the merged blocks’ keys.

If keys_to_move is a set of Labels, it must be empty (keys_to_move.values.shape[0] == 0), and only the Labels.names will be used.

The order of the samples is controlled by sort_samples. If sort_samples is True, samples are re-ordered to keep them lexicographically sorted. Otherwise they are kept in the order in which they appear in the blocks.

This function is only implemented when the blocks to merge have the same properties values.

Parameters:

keys_to_move (str | Sequence[str]) – description of the keys to move
sort_samples – whether to sort the merged samples or keep them in the order in which they appear in the original blocks

Returns:

a new TensorMap with merged blocks

Return type:

TensorMap
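
The merge described above can be modeled without the library. The following is a minimal pure-Python sketch, not metatensor's implementation: blocks are plain tuples of (sample labels, rows of values), and the names Block and keys_to_samples_sketch are hypothetical. It assumes every block has the same properties, as the function requires.

```python
from typing import Dict, List, Tuple

# Toy block: sample labels (one tuple per row) and the matching rows of values.
Block = Tuple[List[Tuple[int, ...]], List[List[float]]]


def keys_to_samples_sketch(
    key_names: List[str],
    blocks: Dict[Tuple[int, ...], Block],
    key_to_move: str,
    sort_samples: bool = True,
) -> Dict[Tuple[int, ...], Block]:
    """Merge toy blocks along the samples axis, moving one key dimension."""
    position = key_names.index(key_to_move)
    merged: Dict[Tuple[int, ...], Block] = {}
    for key, (samples, rows) in blocks.items():
        # blocks sharing the same remaining key values are merged together
        remaining = key[:position] + key[position + 1 :]
        new_samples, new_rows = merged.setdefault(remaining, ([], []))
        # the moved key value goes at the end of each sample label
        new_samples.extend(sample + (key[position],) for sample in samples)
        # vertical concatenation of the rows
        new_rows.extend(rows)
    if sort_samples:
        for remaining, (samples, rows) in list(merged.items()):
            order = sorted(range(len(samples)), key=lambda i: samples[i])
            merged[remaining] = (
                [samples[i] for i in order],
                [rows[i] for i in order],
            )
    return merged


# Two blocks with keys (a=0, b=1) and (a=0, b=2); each sample has one dimension
key_names = ["a", "b"]
blocks = {
    (0, 1): ([(0,), (2,)], [[1.0, 1.0], [3.0, 3.0]]),
    (0, 2): ([(1,)], [[2.0, 2.0]]),
}

sorted_result = keys_to_samples_sketch(key_names, blocks, "b")
unsorted_result = keys_to_samples_sketch(key_names, blocks, "b", sort_samples=False)
```

Both calls produce a single block under the remaining key (0,). They differ only in row order: sorted_result orders the samples lexicographically, while unsorted_result keeps the rows of the first block ahead of those of the second.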

components_to_properties(dimensions: str | Sequence[str]) → TensorMap[source]¶

Move the given dimensions from the component labels to the property labels for each block.

Parameters: dimensions (str | Sequence[str]) – name of the component dimensions to move to the properties
Return type: TensorMap
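
As an illustration, moving a single component dimension can be thought of as flattening the component and property axes of each block into one. The pure-Python sketch below uses nested lists and a hypothetical helper, components_to_properties_sketch. The ordering of the new property labels (component first, varying slowest) is an assumption of this sketch and is not stated in the text above.

```python
from typing import List, Tuple


def components_to_properties_sketch(
    values: List[List[List[float]]],
    components: List[int],
    properties: List[int],
) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """Flatten values[sample][component][property] into values[sample][new_property]."""
    # assumption: new labels are (component, property), component varying slowest
    new_properties = [(c, p) for c in components for p in properties]
    new_values = [
        [value for component_row in sample for value in component_row]
        for sample in values
    ]
    return new_properties, new_values


# One sample, three components (-1, 0, 1) and two properties (0, 1)
values = [[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]]
new_properties, new_values = components_to_properties_sketch(values, [-1, 0, 1], [0, 1])
```

After the call, the single sample has six properties, one per (component, property) pair, and the block no longer has a component axis.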

keys_to_properties(keys_to_move: str | Sequence[str] | Labels, *, sort_samples=True) → TensorMap[source]¶

Merge blocks along the properties direction, adding keys_to_move at the beginning of the properties labels dimensions.

This function will remove keys_to_move from the keys, and find all blocks with the same remaining keys values. Then it will merge these blocks along the properties direction (i.e. do a horizontal concatenation).

If keys_to_move is given as strings, then the new property labels will only contain entries from the existing blocks. For example, merging a block with key a=0 and properties p=1, 2 with a block with key a=2 and properties p=1, 3 will produce a block with properties a, p = (0, 1), (0, 2), (2, 1), (2, 3).

If keys_to_move is a set of Labels and it is empty (len(keys_to_move) == 0), the Labels.names will be used as if they were passed directly.

Finally, if keys_to_move is a non-empty set of Labels, the new properties labels will contain all of the entries of keys_to_move (regardless of the values taken by keys_to_move.names in the merged blocks’ keys) followed by the existing properties labels. For example, using a=2, 3 in keys_to_move, blocks with properties p=1, 2 will result in a, p = (2, 1), (2, 2), (3, 1), (3, 2). If there are no values (no block/missing sample) for a given property in the merged block, then the value will be set to zero.

When using a non-empty Labels for keys_to_move, the properties labels of all the merged blocks must take the same values.

The order of the samples in the merged blocks is controlled by sort_samples. If sort_samples is True, samples are re-ordered to keep them lexicographically sorted. Otherwise they are kept in the order in which they appear in the blocks.

Parameters:

keys_to_move (str | Sequence[str] | Labels) – description of the keys to move
sort_samples – whether to sort the merged samples or keep them in the order in which they appear in the original blocks

Returns:

a new TensorMap with merged blocks

Return type:

TensorMap
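
The string case described above (keys a=0 and a=2 with different properties) can be reproduced with a small pure-Python model. This is a sketch under stated assumptions, not the library's code: blocks are plain tuples, there is a single key dimension a, samples are sorted as with sort_samples=True, and the names Block and keys_to_properties_sketch are hypothetical.

```python
from typing import Dict, List, Tuple

# Toy block: sample labels, property labels, and values[sample][property].
Block = Tuple[List[int], List[int], List[List[float]]]


def keys_to_properties_sketch(
    blocks: Dict[int, Block],
) -> Tuple[List[int], List[Tuple[int, int]], List[List[float]]]:
    """Merge toy blocks keyed by a single dimension `a` along the properties."""
    # union of all samples, sorted (this mirrors sort_samples=True)
    samples = sorted({s for block_samples, _, _ in blocks.values() for s in block_samples})
    properties: List[Tuple[int, int]] = []
    values: List[List[float]] = [[] for _ in samples]
    for a, (block_samples, block_properties, block_values) in blocks.items():
        # the moved key value comes first in the new property labels
        properties.extend((a, p) for p in block_properties)
        for row, sample in zip(values, samples):
            if sample in block_samples:
                # horizontal concatenation of the existing values
                row.extend(block_values[block_samples.index(sample)])
            else:
                # sample missing from this block: fill with zeros
                row.extend([0.0] * len(block_properties))
    return samples, properties, values


blocks = {
    0: ([0, 1], [1, 2], [[1.0, 1.0], [1.0, 1.0]]),  # key a=0, properties p=1, 2
    2: ([1, 2], [1, 3], [[2.0, 2.0], [2.0, 2.0]]),  # key a=2, properties p=1, 3
}
samples, properties, values = keys_to_properties_sketch(blocks)
```

The resulting properties are the four (a, p) pairs from the example in the text. Samples 0 and 2 each appear in only one block, so half of their row is zero-filled.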

property sample_names: List[str]¶

Names of the samples dimensions for all blocks in this TensorMap.

property component_names: List[str]¶

Names of the components dimensions for all blocks in this TensorMap.

property property_names: List[str]¶

Names of the properties dimensions for all blocks in this TensorMap.

print(max_keys: int) → str[source]¶

Print this TensorMap to a string, including at most max_keys in the output.

Parameters: max_keys (int) – how many keys to include in the output. Use -1 to include all keys.
Return type: str

property device: str | torch.device¶

Get the device of all the arrays stored inside this TensorMap.

property dtype: numpy.dtype | torch.dtype¶

Get the dtype of all the arrays stored inside this TensorMap.

to(*args, **kwargs) → TensorMap[source]¶

Move the keys and all the blocks in this TensorMap to the given dtype, device and arrays backend.

Parameters:

dtype – new dtype to use for all arrays. The dtype stays the same if this is set to None.
device – new device to use for all arrays. The device stays the same if this is set to None.
arrays (Optional[str]) – new backend to use for the arrays. This can be either "numpy", "torch" or None (keeps the existing backend); and must be given as a keyword argument (arrays="numpy").
non_blocking (bool) – If this is True and the TensorMap contains "torch" arrays, the function tries to move the data asynchronously. See torch.Tensor.to() for more information.

Return type:

TensorMap

set_info(key: str, value: str)[source]¶

Set or update the info (i.e. global metadata) value associated with key for this TensorMap.

Parameters:

key (str) – key of the info
value (str) – value of the info

get_info(key: str) → str | None[source]¶

Get the info (i.e. global metadata) with the given key for this TensorMap.

Parameters: key (str) – key of the info to retrieve
Returns: value of the info, or None if the info does not exist
Return type: str | None

info() → Dict[str, str][source]¶

Get all the key/value info pairs stored in this TensorMap.

Return type: Dict[str, str]