.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/core/3-managing-gradients.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end ` to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_core_3-managing-gradients.py:

.. _core-tutorial-gradients:

Managing gradients
==================

Another big difference between metatensor and other, more generic data storage
formats is its ability to store and manipulate one or more gradients of the
data together with the data itself. In this tutorial, we will see how and
where gradients are stored, and what one can do with them.

.. note::

    Metatensor supports multiple ways of managing gradients: explicit forward
    gradients, presented in this tutorial; and implicit backward mode
    gradients (only when the data is stored in a :py:class:`torch.Tensor`).
    Both can also be mixed, to compute backward mode gradients of explicit
    forward mode gradients when training a model with gradient data (forces
    and virial).

    In general, the explicit forward gradients presented here are mainly
    relevant if you are working within the numpy ecosystem; the implicit
    backward gradients are more interesting if you are working in the PyTorch
    ecosystem.

.. TODO add tutorial explaining the difference in more detail and link to it here

The code used to generate :download:`spherical-expansion.npz` is in :ref:`the
first tutorial `, and the code for :download:`radial-spectrum.npz` is shown
:ref:`in the second `. Notice how in both cases the data was computed with
``gradients=["positions"]``, meaning the gradients with respect to atomic
positions are included.

.. py:currentmodule:: metatensor

.. GENERATED FROM PYTHON SOURCE LINES 38-43

.. code-block:: Python

    import metatensor
    from metatensor import TensorBlock, TensorMap

.. GENERATED FROM PYTHON SOURCE LINES 44-52

Amazing gradients and where to find them
----------------------------------------

In the first :ref:`tutorial `, we have seen how metatensor stores data and
metadata together inside :py:class:`TensorBlock`, and groups multiple blocks
to form a full :py:class:`TensorMap`. To refresh our memory, let's load some
data (the radial spectrum from the :ref:`sparsity tutorial `). It is a tensor
map with two dimensions for the keys, and 5 blocks:

.. GENERATED FROM PYTHON SOURCE LINES 53-58

.. code-block:: Python

    radial_spectrum = metatensor.load("radial-spectrum.npz")
    print(radial_spectrum)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    TensorMap with 5 blocks
    keys: center_type  neighbor_type
              6              6
              6              8
              7              7
              8              6
              8              8

.. GENERATED FROM PYTHON SOURCE LINES 60-62

If we look at one of the blocks, we can see that it contains gradients with
respect to ``"positions"``:

.. GENERATED FROM PYTHON SOURCE LINES 63-67

.. code-block:: Python

    block = radial_spectrum.block(center_type=7, neighbor_type=7)
    print(block)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    TensorBlock
        samples (2): ['system', 'atom']
        components (): []
        properties (3): ['n']
        gradients: ['positions']

.. GENERATED FROM PYTHON SOURCE LINES 68-70

Gradients are stored inside normal :py:class:`TensorBlock` objects, with their
own set of metadata in the samples, components, and properties dimensions.
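Before extracting a specific gradient, it can be useful to check which ones a
block carries. The short sketch below assumes the ``gradients_list()``,
``has_gradient()`` and ``gradients()`` methods of :py:class:`TensorBlock`; the
``gradients: ['positions']`` line in the printed block above carries the same
information:

.. code-block:: python

    # list the gradient parameters stored in this block (here: ["positions"])
    print(block.gradients_list())

    # check for a given parameter before accessing it
    if block.has_gradient("positions"):
        gradient = block.gradient("positions")

    # or iterate over all (parameter, gradient block) pairs
    for parameter, gradient in block.gradients():
        print(parameter, gradient.values.shape)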
.. GENERATED FROM PYTHON SOURCE LINES 71-75

.. code-block:: Python

    gradient = block.gradient("positions")
    print(gradient)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Gradient TensorBlock ('positions')
        samples (4): ['sample', 'system', 'atom']
        components (3): ['xyz']
        properties (3): ['n']
        gradients: None

.. GENERATED FROM PYTHON SOURCE LINES 76-85

The samples are different from those of the values block (the block this
gradient is attached to): there is a first ``"sample"`` dimension, followed by
a pair of indices ``(system, atom)``.

The ``"sample"`` dimension is always present in gradients, and indicates which
of the samples in the values block we are taking the gradient of. Here, the
first row of the gradients contains a gradient of the first sample in the
values with respect to the position of atom 4, while the last row of the
gradients contains a gradient of the second row of the values with respect to
the position of atom 5.

.. GENERATED FROM PYTHON SOURCE LINES 86-89

.. code-block:: Python

    print(gradient.samples)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Labels(
        sample  system  atom
          0       0      4
          0       0      5
          1       0      4
          1       0      5
    )

.. GENERATED FROM PYTHON SOURCE LINES 90-92

Re-using the notation from the previous tutorial, the values contain
:math:`\rho_i`, for a given atomic center :math:`i`.

.. GENERATED FROM PYTHON SOURCE LINES 93-96

.. code-block:: Python

    print(block.samples)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Labels(
        system  atom
          0      4
          0      5
    )

.. GENERATED FROM PYTHON SOURCE LINES 97-121

If we look at the samples of the values, we can express the four samples in
this gradient block as

- :math:`\nabla_4 \rho_4`: gradient of the representation of atom 4 with
  respect to the position of atom 4;
- :math:`\nabla_5 \rho_4`: gradient of the representation of atom 4 with
  respect to the position of atom 5;
- :math:`\nabla_4 \rho_5`: gradient of the representation of atom 5 with
  respect to the position of atom 4;
- :math:`\nabla_5 \rho_5`: gradient of the representation of atom 5 with
  respect to the position of atom 5.

You'll notice that some combinations of atoms are missing here: there is no
gradient of :math:`\rho_4` with respect to the positions of atom 0, 1, 2,
*etc.* This is another instance of the data sparsity that metatensor enables:
only the non-zero gradients are actually stored in memory.

.. figure:: /../static/images/TensorBlock-Gradients.*
    :width: 400px
    :align: center

    Visual illustration of the gradients, and how multiple gradient
    rows/gradient samples can correspond to the same row/sample in the values.
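To make this sparsity concrete, the sketch below (plain numpy; the ``dense``
array and its layout are purely illustrative, not a metatensor API) scatters
the stored gradient rows into a dense array, where every ``(sample, atom)``
pair that is not explicitly stored is an implicit zero:

.. code-block:: python

    import numpy as np

    # which value row each gradient row refers to, and which atom was moved
    sample_column = gradient.samples["sample"]
    atom_column = gradient.samples["atom"]

    # illustrative dense layout: (value samples, atoms, xyz, properties)
    atoms = np.unique(atom_column)
    dense = np.zeros((len(block.samples), len(atoms), 3, len(gradient.properties)))

    # scatter the explicitly stored (non-zero) rows into the dense array
    for row in range(len(gradient.samples)):
        atom_position = np.searchsorted(atoms, atom_column[row])
        dense[sample_column[row], atom_position] = gradient.values[row]

    print(dense.shape)  # here: (2, 2, 3, 3)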
.. GENERATED FROM PYTHON SOURCE LINES 124-127

The gradient block can also differ from the values block in its components:
here the values have no components, but the gradients have one, representing
the x/y/z Cartesian direction of the gradient with respect to positions.

.. GENERATED FROM PYTHON SOURCE LINES 128-131

.. code-block:: Python

    print(gradient.components)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [Labels(
        xyz
         0
         1
         2
    )]

.. GENERATED FROM PYTHON SOURCE LINES 132-134

Finally, the gradient properties are guaranteed to be the same as the values
properties.

.. GENERATED FROM PYTHON SOURCE LINES 135-138

.. code-block:: Python

    print(block.properties == gradient.properties)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    True

.. GENERATED FROM PYTHON SOURCE LINES 139-142

The gradient block also contains the data for the gradient, in the ``values``
attribute. Here the gradients are zero everywhere except in the x direction,
because in the original input the N\ :sub:`2` molecule was oriented along the
x axis.

.. GENERATED FROM PYTHON SOURCE LINES 143-146

.. code-block:: Python

    print(gradient.values)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [[[ 0.17209336  0.08442732 -0.09967984]
      [ 0.          0.          0.        ]
      [ 0.          0.          0.        ]]

     [[-0.17209336 -0.08442732  0.09967984]
      [ 0.          0.          0.        ]
      [ 0.          0.          0.        ]]

     [[ 0.17209336  0.08442732 -0.09967984]
      [ 0.          0.          0.        ]
      [ 0.          0.          0.        ]]

     [[-0.17209336 -0.08442732  0.09967984]
      [ 0.          0.          0.        ]
      [ 0.          0.          0.        ]]]

.. GENERATED FROM PYTHON SOURCE LINES 147-156

What if the values have components?
-----------------------------------

We have seen that the gradient samples are related to the values samples
through the ``sample`` dimension, and that the gradients are allowed to have
custom ``components``. You might be wondering what happens if the values
already have some components! Let's load such an example, the spherical
expansion from the :ref:`first steps tutorial `:

.. GENERATED FROM PYTHON SOURCE LINES 157-161

.. code-block:: Python

    spherical_expansion = metatensor.load("spherical-expansion.npz")
    print(spherical_expansion)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    TensorMap with 12 blocks
    keys: o3_lambda  o3_sigma  center_type  neighbor_type
              0         1          6             6
              1         1          6             6
              2         1          6             6
              0         1          6             8
              1         1          6             8
              2         1          6             8
              0         1          8             6
              1         1          8             6
              2         1          8             6
              0         1          8             8
              1         1          8             8
              2         1          8             8

.. GENERATED FROM PYTHON SOURCE LINES 162-164

In the :py:class:`TensorMap` above, the value blocks already have a set of
components, corresponding to the :math:`m` index of spherical harmonics:

.. GENERATED FROM PYTHON SOURCE LINES 165-169

.. code-block:: Python

    block = spherical_expansion.block(2)
    print(block.components)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [Labels(
        o3_mu
         -2
         -1
          0
          1
          2
    )]

.. GENERATED FROM PYTHON SOURCE LINES 170-173

If we look at the gradients with respect to positions, we see that they
contain two sets of components: the same ``xyz`` component as in the radial
spectrum example earlier, and the same ``o3_mu`` as the values.

.. GENERATED FROM PYTHON SOURCE LINES 174-178

.. code-block:: Python

    gradient = block.gradient("positions")
    print(gradient)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    Gradient TensorBlock ('positions')
        samples (1): ['sample', 'system', 'atom']
        components (3, 5): ['xyz', 'o3_mu']
        properties (5): ['n']
        gradients: None

.. GENERATED FROM PYTHON SOURCE LINES 180-185

.. code-block:: Python

    print("first set of components:", gradient.components[0])
    print("second set of components:", gradient.components[1])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    first set of components: Labels(
        xyz
         0
         1
         2
    )
    second set of components: Labels(
        o3_mu
         -2
         -1
        ...
          1
          2
    )

.. GENERATED FROM PYTHON SOURCE LINES 186-189

In general, the gradient blocks are allowed to have additional components
compared to the values, but these extra components must come first, and are
followed by the same set of components as the values.
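We can check this ordering directly on the block loaded above. A small sketch
(only using the ``components`` attributes already introduced; the comments
state what it should print given the guarantee):

.. code-block:: python

    # the extra components of a gradient (here "xyz") always come first; the
    # trailing components must be exactly the components of the values
    n_extra = len(gradient.components) - len(block.components)
    print(gradient.components[n_extra:] == block.components)  # True

    # consequently, gradient.values is shaped as
    # (gradient samples, extra components..., value components..., properties)
    print(gradient.values.shape)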
.. GENERATED FROM PYTHON SOURCE LINES 193-209

Using gradients in calculations
-------------------------------

Now that we know how gradients are stored in metatensor, let's try to compute
a new set of values together with their corresponding gradients.

Let's compute the square of the radial spectrum, :math:`h(r) = \rho^2(r)`, and
the corresponding gradients with respect to atomic positions. The chain rule
tells us that the gradient should be

.. math::

    \nabla h(r) = 2\, \rho(r) \cdot \nabla \rho(r)

Since the calculation can happen block by block, let's define a function
computing a new :math:`h(r)` block:

.. GENERATED FROM PYTHON SOURCE LINES 210-247

.. code-block:: Python

    def compute_square(block: TensorBlock) -> TensorBlock:
        # compute the new values
        new_values = block.values**2

        # store the new values in a block
        new_block = TensorBlock(
            values=new_values,
            samples=block.samples,
            components=block.components,
            properties=block.properties,
        )

        # compute the new gradient
        gradient = block.gradient("positions")

        # `block.values[gradient.samples["sample"]]` gives us an array with a
        # shape compatible with `gradient.values`; using the right row in the
        # values to compute a given row of the gradients. ``None`` creates an
        # additional dimension to match the components of the gradients.
        broadcasted_values = block.values[gradient.samples["sample"], None, :]
        new_gradient_values = 2.0 * broadcasted_values * gradient.values

        new_gradient = TensorBlock(
            values=new_gradient_values,
            samples=gradient.samples,
            components=gradient.components,
            properties=gradient.properties,
        )

        # store the gradient in the new block
        new_block.add_gradient("positions", new_gradient)

        return new_block

.. GENERATED FROM PYTHON SOURCE LINES 248-252

One issue when applying the equation above blindly is that ``block.values``
(i.e. :math:`\rho(r)`) and ``gradient.values`` (i.e. :math:`\nabla \rho(r)`)
have different shapes. Fortunately, we already know how to match them:
``gradient.samples["sample"]`` contains the indices of ``block.values``
matching each row of ``gradient.values``.

.. GENERATED FROM PYTHON SOURCE LINES 253-257

.. code-block:: Python

    gradient = radial_spectrum.block(2).gradient("positions")
    print(gradient.samples["sample"])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    [0 0 1 1]

.. GENERATED FROM PYTHON SOURCE LINES 258-260

We can now apply this function on all the blocks, and reconstruct a new
:py:class:`TensorMap`:

.. GENERATED FROM PYTHON SOURCE LINES 261-265

.. code-block:: Python

    blocks = [compute_square(block) for block in radial_spectrum.blocks()]
    squared = TensorMap(keys=radial_spectrum.keys, blocks=blocks)

.. GENERATED FROM PYTHON SOURCE LINES 266-268

``squared`` has the same shape and sparsity pattern as ``radial_spectrum``,
but contains different values:

.. GENERATED FROM PYTHON SOURCE LINES 269-272

.. code-block:: Python

    print(squared)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    TensorMap with 5 blocks
    keys: center_type  neighbor_type
              6              6
              6              8
              7              7
              8              6
              8              8

.. GENERATED FROM PYTHON SOURCE LINES 274-281

.. code-block:: Python

    rs_block = radial_spectrum.block(2)
    squared_block = squared.block(2)

    print("radial_spectrum block:", rs_block)
    print("square block:", squared_block)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    radial_spectrum block: TensorBlock
        samples (2): ['system', 'atom']
        components (): []
        properties (3): ['n']
        gradients: ['positions']

    square block: TensorBlock
        samples (2): ['system', 'atom']
        components (): []
        properties (3): ['n']
        gradients: ['positions']

.. GENERATED FROM PYTHON SOURCE LINES 283-287

.. code-block:: Python

    print("radial_spectrum values:", rs_block.values)
    print("square values:", squared_block.values)

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    radial_spectrum values: [[ 0.58644107 -0.15682146  0.03612176]
     [ 0.58644107 -0.15682146  0.03612176]]
    square values: [[0.34391313 0.02459297 0.00130478]
     [0.34391313 0.02459297 0.00130478]]

.. GENERATED FROM PYTHON SOURCE LINES 289-293

.. code-block:: Python

    print("radial_spectrum gradient:", rs_block.gradient("positions").values[:, 0])
    print("square gradient:", squared_block.gradient("positions").values[:, 0])

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    radial_spectrum gradient: [[ 0.17209336  0.08442732 -0.09967984]
     [-0.17209336 -0.08442732  0.09967984]
     [ 0.17209336  0.08442732 -0.09967984]
     [-0.17209336 -0.08442732  0.09967984]]
    square gradient: [[ 0.20184523 -0.02648003 -0.00720122]
     [-0.20184523  0.02648003  0.00720122]
     [ 0.20184523 -0.02648003 -0.00720122]
     [-0.20184523  0.02648003  0.00720122]]
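As a final sanity check, we can verify numerically that the gradients of
``squared`` follow the chain rule. This sketch reuses ``rs_block`` and
``squared_block`` from above, and only the broadcasting trick already used in
``compute_square``:

.. code-block:: python

    import numpy as np

    # the stored squared gradients should match 2 * rho * grad(rho), with
    # values rows matched to gradient rows through the "sample" column
    rs_gradient = rs_block.gradient("positions")
    broadcasted = rs_block.values[rs_gradient.samples["sample"], None, :]

    expected = 2.0 * broadcasted * rs_gradient.values
    print(np.allclose(expected, squared_block.gradient("positions").values))  # True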
.. GENERATED FROM PYTHON SOURCE LINES 294-313

.. tip::

    We provide many functions operating on :py:class:`TensorMap` and
    :py:class:`TensorBlock` as part of the :ref:`metatensor-operations `
    module (installed by default with the main ``metatensor`` package). These
    operations already handle the different sparsity levels of metatensor, and
    support explicit forward gradients. In general you will not have to write
    the kind of code shown in this tutorial yourself, and should use the
    corresponding operation instead.

    For example, ``squared`` from this tutorial can be calculated with:

    .. code-block:: python

        squared = metatensor.multiply(radial_spectrum, radial_spectrum)

        # alternatively
        squared = metatensor.pow(radial_spectrum, 2)

.. GENERATED FROM PYTHON SOURCE LINES 314-317

.. code-block:: Python

    squared_operations = metatensor.multiply(radial_spectrum, radial_spectrum)
    print(metatensor.equal(squared_operations, squared))

.. rst-class:: sphx-glr-script-out

.. code-block:: none

    True

.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.023 seconds)

.. _sphx_glr_download_examples_core_3-managing-gradients.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: 3-managing-gradients.ipynb <3-managing-gradients.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: 3-managing-gradients.py <3-managing-gradients.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: 3-managing-gradients.zip <3-managing-gradients.zip>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_