Reduction over samples

These functions reduce over the sample indices of TensorMap or TensorBlock objects, generating a new TensorMap or TensorBlock in which all entries that differ only in the indices of the dimensions listed in sample_names have been combined into a single entry. The functions differ in the type of reduction operation, but otherwise operate in the same way.

The reduction loops over the samples in each block/map and combines all those that differ only in the values of the indices associated with the names listed in the sample_names argument. One way to see these operations is that the sample indices describe the non-zero entries of a sparse array, and the reduction acts much like numpy.sum(), where sample_names plays the same role as the axis argument. Whenever gradients are present, the reduction is also performed on the gradients.

See also metatensor.sum_over_samples_block() and metatensor.sum_over_samples() for a detailed discussion with examples.
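
The grouping described above can be sketched in plain Python, without metatensor. This is only an illustration of the semantics: the helper name sum_over_samples_sketch, the tuple-based sample representation, and the sorted output order are assumptions made for this sketch, not the library's implementation.

```python
from collections import defaultdict


def sum_over_samples_sketch(names, samples, values, sample_names):
    """Illustrative (hypothetical) helper: sum rows over some sample dimensions.

    names: names of the sample dimensions, e.g. ["system", "atom"]
    samples: one tuple of indices per row of `values`
    values: list of rows (lists of numbers)
    sample_names: a string or a list of strings naming the dimensions to remove
    """
    if isinstance(sample_names, str):
        sample_names = [sample_names]  # "atom" is treated like ["atom"]

    # positions of the sample dimensions that are kept
    kept = [i for i, name in enumerate(names) if name not in sample_names]

    # rows whose kept indices are identical are accumulated together
    sums = defaultdict(lambda: [0] * len(values[0]))
    for sample, row in zip(samples, values):
        key = tuple(sample[i] for i in kept)
        sums[key] = [a + b for a, b in zip(sums[key], row)]

    new_names = [names[i] for i in kept]
    new_samples = sorted(sums)  # output order is an assumption of this sketch
    new_values = [sums[key] for key in new_samples]
    return new_names, new_samples, new_values


names = ["system", "atom"]
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
values = [[1, 2, 4], [3, 5, 6], [7, 8, 9], [10, 11, 12]]

new_names, new_samples, new_values = sum_over_samples_sketch(
    names, samples, values, "atom"
)
print(new_names)  # ['system']
print(new_samples)  # [(0,), (1,)]
print(new_values)  # [[4, 7, 10], [17, 19, 21]]
```

The two rows of each system are merged, which mirrors summing a dense array of shape (system, atom, properties) over its atom axis.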

TensorMap operations

metatensor.sum_over_samples(tensor: TensorMap, sample_names: List[str] | str) → TensorMap

Sum a TensorMap, combining the samples according to sample_names.

This function creates a new TensorMap with the same keys as the input tensor. Each TensorBlock is obtained by summing the corresponding input TensorBlock over the sample_names indices, essentially calling sum_over_samples_block() on each block in tensor.

sample_names indicates the sample dimensions over which the sum is performed. It accepts either a single string or a list of strings with the names of the sample dimensions to sum over. A single string is equivalent to a list with a single element: sample_names = "atom" is the same as sample_names = ["atom"].

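Parameters:

tensor (TensorMap): the input TensorMap

sample_names (List[str] | str): name(s) of the sample dimensions to sum over

Returns: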

a TensorMap containing the reduced values and sample labels

Return type:

TensorMap

>>> import numpy as np
>>> from metatensor import Labels, TensorBlock, TensorMap, sum_over_samples
>>> block = TensorBlock(
...     values=np.array(
...         [
...             [1, 2, 4],
...             [3, 5, 6],
...             [7, 8, 9],
...             [10, 11, 12],
...         ]
...     ),
...     samples=Labels(
...         ["system", "atom"],
...         np.array(
...             [
...                 [0, 0],
...                 [0, 1],
...                 [1, 0],
...                 [1, 1],
...             ]
...         ),
...     ),
...     components=[],
...     properties=Labels.range("properties", 3),
... )
>>> keys = Labels(names=["key"], values=np.array([[0]]))
>>> tensor = TensorMap(keys, [block])
>>> tensor_sum = sum_over_samples(tensor, sample_names="atom")
>>> # only 'system' is left as a sample
>>> print(tensor_sum.block(0))
TensorBlock
    samples (2): ['system']
    components (): []
    properties (3): ['properties']
    gradients: None
>>> print(tensor_sum.block(0).samples)
Labels(
    system
      0
      1
)
>>> print(tensor_sum.block(0).values)
[[ 4  7 10]
 [17 19 21]]

metatensor.mean_over_samples(tensor: TensorMap, sample_names: str | List[str]) → TensorMap

Compute the mean of a TensorMap, combining the samples according to sample_names.

This function creates a new TensorMap with the same keys as the input tensor, and each TensorBlock is obtained by averaging the corresponding input TensorBlock over the sample_names indices.

sample_names indicates the sample dimensions over which the mean is performed. It accepts either a single string or a list of strings with the names of the sample dimensions to average over. A single string is equivalent to a list with a single element: sample_names = "atom" is the same as sample_names = ["atom"].

For a general discussion of reduction operations and a usage example, see the documentation of sum_over_samples().

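Parameters:

tensor (TensorMap): the input TensorMap

sample_names (str | List[str]): name(s) of the sample dimensions to average over

Return type: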

TensorMap
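
As a rough illustration of what averaging over samples means, the following plain-Python sketch groups rows by the sample dimensions that are kept and averages each property within a group. The helper name mean_over_samples_sketch and the dictionary return value are assumptions made for this sketch; the data is the same as in the sum_over_samples() example.

```python
from collections import defaultdict


def mean_over_samples_sketch(names, samples, values, sample_names):
    """Illustrative (hypothetical) helper: average rows over some sample dimensions."""
    if isinstance(sample_names, str):
        sample_names = [sample_names]
    kept = [i for i, name in enumerate(names) if name not in sample_names]

    # collect the rows that share the same kept indices
    groups = defaultdict(list)
    for sample, row in zip(samples, values):
        groups[tuple(sample[i] for i in kept)].append(row)

    # average each property (column) over the rows of a group
    return {
        key: [sum(column) / len(rows) for column in zip(*rows)]
        for key, rows in sorted(groups.items())
    }


samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
values = [[1, 2, 4], [3, 5, 6], [7, 8, 9], [10, 11, 12]]

means = mean_over_samples_sketch(["system", "atom"], samples, values, "atom")
print(means)  # {(0,): [2.0, 3.5, 5.0], (1,): [8.5, 9.5, 10.5]}
```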

metatensor.var_over_samples(tensor: TensorMap, sample_names: str | List[str]) → TensorMap

Compute the variance of a TensorMap, combining the samples according to sample_names.

This function creates a new TensorMap with the same keys as the input tensor, and each TensorBlock is obtained by computing the variance of the corresponding input TensorBlock over the sample_names indices.

sample_names indicates the sample dimensions over which the variance is computed. It accepts either a single string or a list of strings with the names of the sample dimensions to reduce over. A single string is equivalent to a list with a single element: sample_names = "atom" is the same as sample_names = ["atom"].

For a general discussion of reduction operations and a usage example, see the documentation of sum_over_samples().

The gradient is implemented as follows:

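\[\nabla[Var(X)] = 2(E[X \nabla X] - E[X]E[\nabla X])\]

Parameters:

tensor (TensorMap): the input TensorMap

sample_names (str | List[str]): name(s) of the sample dimensions to reduce over

Return type: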

TensorMap
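
The gradient expression given for var_over_samples() can be recovered in two steps. This short derivation assumes that E denotes the mean over the combined samples, that the variance is the population variance (normalized by the number of combined samples, which is an assumption here), and that the gradient commutes with the mean:

\[
\begin{aligned}
Var(X) &= E[X^2] - E[X]^2 \\
\nabla[Var(X)] &= E[2 X \nabla X] - 2 E[X] E[\nabla X] = 2(E[X \nabla X] - E[X]E[\nabla X])
\end{aligned}
\]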

metatensor.std_over_samples(tensor: TensorMap, sample_names: str | List[str]) → TensorMap

Compute the standard deviation of a TensorMap, combining the samples according to sample_names.

This function creates a new TensorMap with the same keys as the input tensor, and each TensorBlock is obtained by computing the standard deviation of the corresponding input TensorBlock over the sample_names indices.

sample_names indicates the sample dimensions over which the standard deviation is computed. It accepts either a single string or a list of strings with the names of the sample dimensions to reduce over. A single string is equivalent to a list with a single element: sample_names = "atom" is the same as sample_names = ["atom"].

For a general discussion of reduction operations and a usage example, see the documentation of sum_over_samples().

The gradient is implemented as follows:

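\[\nabla[Std(X)] = 0.5(\nabla[Var(X)])/Std(X) = (E[X \nabla X] - E[X]E[\nabla X])/Std(X)\]

Parameters:

tensor (TensorMap): the input TensorMap

sample_names (str | List[str]): name(s) of the sample dimensions to reduce over

Return type: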

TensorMap
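
The gradient formulas quoted for var_over_samples() and std_over_samples() can be checked numerically. The sketch below uses only the Python standard library and compares them against central finite differences. The three samples X_i(t) and the single parameter t are made up for this illustration, and the population variance (normalization by the number of samples) is assumed.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def samples_at(t):
    # hypothetical samples X_i(t) depending on a single parameter t
    return [1.0 * t * t + 0.5, 2.0 * t - 1.0, math.sin(t) + 3.0]


def sample_gradients_at(t):
    # analytical derivatives dX_i/dt of the samples above
    return [2.0 * t, 2.0, math.cos(t)]


def variance(xs):
    m = mean(xs)
    return mean([(x - m) ** 2 for x in xs])  # population variance (assumed)


def std(xs):
    return math.sqrt(variance(xs))


t, h = 0.7, 1e-6
x, dx = samples_at(t), sample_gradients_at(t)

# formulas from the text: E[X dX] - E[X] E[dX] appears in both gradients
cov = mean([a * b for a, b in zip(x, dx)]) - mean(x) * mean(dx)
grad_var = 2.0 * cov
grad_std = cov / std(x)

# central finite differences as an independent reference
fd_var = (variance(samples_at(t + h)) - variance(samples_at(t - h))) / (2 * h)
fd_std = (std(samples_at(t + h)) - std(samples_at(t - h))) / (2 * h)

print(abs(grad_var - fd_var) < 1e-6)  # True
print(abs(grad_std - fd_std) < 1e-6)  # True
```

Note that the standard deviation gradient divides by Std(X), so it is undefined when all combined samples are equal.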

TensorBlock operations

metatensor.sum_over_samples_block(block: TensorBlock, sample_names: List[str] | str) → TensorBlock

Sum a TensorBlock, combining the samples according to sample_names.

This function creates a new TensorBlock in which each sample is obtained by summing over the sample_names indices, so that the resulting TensorBlock does not have those indices.

sample_names indicates the sample dimensions over which the sum is performed. It accepts either a single string or a list of strings with the names of the sample dimensions to sum over. A single string is equivalent to a list with a single element: sample_names = "atom" is the same as sample_names = ["atom"].

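Parameters:

block (TensorBlock): the input TensorBlock

sample_names (List[str] | str): name(s) of the sample dimensions to sum over

Returns: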

a TensorBlock containing the reduced values and sample labels

Return type:

TensorBlock

>>> import numpy as np
>>> from metatensor import Labels, TensorBlock, TensorMap, sum_over_samples_block
>>> block = TensorBlock(
...     values=np.array(
...         [
...             [1, 2, 4],
...             [3, 5, 6],
...             [7, 8, 9],
...             [10, 11, 12],
...         ]
...     ),
...     samples=Labels(
...         ["system", "atom"],
...         np.array(
...             [
...                 [0, 0],
...                 [0, 1],
...                 [1, 0],
...                 [1, 1],
...             ]
...         ),
...     ),
...     components=[],
...     properties=Labels.range("properties", 3),
... )
>>> block_sum = sum_over_samples_block(block, sample_names="atom")
>>> print(block_sum.samples)
Labels(
    system
      0
      1
)
>>> print(block_sum.values)
[[ 4  7 10]
 [17 19 21]]

metatensor.mean_over_samples_block(block: TensorBlock, sample_names: List[str] | str) → TensorBlock

Average a TensorBlock, combining the samples according to sample_names.

See also sum_over_samples_block() and mean_over_samples().

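Parameters:

block (TensorBlock): the input TensorBlock

sample_names (List[str] | str): name(s) of the sample dimensions to average over

Returns: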

a TensorBlock containing the reduced values and sample labels

Return type:

TensorBlock

metatensor.var_over_samples_block(block: TensorBlock, sample_names: List[str] | str) → TensorBlock

Compute the variance of a TensorBlock, combining the samples according to sample_names.

See also sum_over_samples_block() and var_over_samples().

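Parameters:

block (TensorBlock): the input TensorBlock

sample_names (List[str] | str): name(s) of the sample dimensions to reduce over

Returns: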

a TensorBlock containing the reduced values and sample labels

Return type:

TensorBlock

metatensor.std_over_samples_block(block: TensorBlock, sample_names: List[str] | str) → TensorBlock

Compute the standard deviation of a TensorBlock, combining the samples according to sample_names.

See also sum_over_samples_block() and std_over_samples().

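Parameters:

block (TensorBlock): the input TensorBlock

sample_names (List[str] | str): name(s) of the sample dimensions to reduce over

Returns: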

a TensorBlock containing the reduced values and sample labels

Return type:

TensorBlock
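
To get a feeling for the numbers produced by the variance and standard deviation reductions, the sketch below applies them to the data of the sum_over_samples_block() example using only the Python standard library. It assumes the population variance (normalization by the number of combined samples); whether this matches the library's normalization is an assumption of this sketch, not something stated above.

```python
from collections import defaultdict
from statistics import pstdev, pvariance

samples = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (system, atom)
values = [
    [1.0, 2.0, 4.0],
    [3.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
]

# group the rows by system, i.e. reduce over the "atom" dimension
rows_per_system = defaultdict(list)
for (system, _atom), row in zip(samples, values):
    rows_per_system[system].append(row)

variances = {}
stds = {}
for system, rows in sorted(rows_per_system.items()):
    columns = list(zip(*rows))  # one tuple per property
    variances[system] = [pvariance(column) for column in columns]
    stds[system] = [pstdev(column) for column in columns]

print(variances)
print(stds)
```

For example, the first property of system 0 combines the values 1 and 3, whose mean is 2, so its population variance is 1 and its standard deviation is 1.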