.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "examples/core/5-fill-value.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_examples_core_5-fill-value.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_examples_core_5-fill-value.py:


.. _core-tutorial-fill-value:

Controlling missing data with ``fill_value``
============================================

When merging blocks with :py:meth:`TensorMap.keys_to_properties` or
:py:meth:`TensorMap.keys_to_samples`, some entries in the merged array may not
exist in the original blocks. By default these entries are filled with zero, but
the ``fill_value`` parameter lets you choose a different value -- such as NaN --
to distinguish genuine zeros from missing data.

This tutorial produces a merged TensorMap where missing entries are NaN, then
shows how to build a mask that identifies them.

.. py:currentmodule:: metatensor

.. GENERATED FROM PYTHON SOURCE LINES 20-27

Setup
-----

We build a small TensorMap with two blocks that have partially overlapping
samples. Block 0 (species=1) has samples for atoms 0 and 2, while block 1
(species=6) has samples for atoms 1 and 2. Atom 2 appears in both blocks;
atoms 0 and 1 each appear in only one.

.. GENERATED FROM PYTHON SOURCE LINES 28-55

.. code-block:: Python


    import numpy as np

    import metatensor


    keys = metatensor.Labels(["species"], np.array([[1], [6]]))

    block_H = metatensor.TensorBlock(
        # atom 0: values [1.0, 2.0], atom 2: values [3.0, 4.0]
        values=np.array([[1.0, 2.0], [3.0, 4.0]]),
        samples=metatensor.Labels(["atom"], np.array([[0], [2]])),
        components=[],
        properties=metatensor.Labels(["n"], np.array([[0], [1]])),
    )

    block_C = metatensor.TensorBlock(
        # atom 1: values [5.0, 6.0], atom 2: values [7.0, 8.0]
        values=np.array([[5.0, 6.0], [7.0, 8.0]]),
        samples=metatensor.Labels(["atom"], np.array([[1], [2]])),
        components=[],
        properties=metatensor.Labels(["n"], np.array([[0], [1]])),
    )

    tensor = metatensor.TensorMap(keys, [block_H, block_C])
    print(tensor)





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    TensorMap with 2 blocks
    keys: species
             1
             6




.. GENERATED FROM PYTHON SOURCE LINES 56-61

Merge with default fill (zero)
------------------------------

Merging along properties with the default ``fill_value=0.0`` fills missing
entries with zero. Atom 0 has no data for species=6, so those columns are 0:

.. GENERATED FROM PYTHON SOURCE LINES 62-69

.. code-block:: Python


    merged_zero = tensor.keys_to_properties("species")
    print(merged_zero.block().values)
    # atom 0: [1.0, 2.0, 0.0, 0.0]  -- species=6 columns are zero
    # atom 1: [0.0, 0.0, 5.0, 6.0]  -- species=1 columns are zero
    # atom 2: [3.0, 4.0, 7.0, 8.0]  -- present in both blocks





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1. 2. 0. 0.]
     [0. 0. 5. 6.]
     [3. 4. 7. 8.]]




.. GENERATED FROM PYTHON SOURCE LINES 70-72

The filled-in zeros for atoms 0 and 1 look the same as genuine zero values. If
the data could legitimately be zero, there is no way to tell the difference.

.. GENERATED FROM PYTHON SOURCE LINES 75-79

Merge with NaN fill
-------------------

Setting ``fill_value=float("nan")`` marks missing entries with NaN instead:

.. GENERATED FROM PYTHON SOURCE LINES 80-87

.. code-block:: Python


    merged_nan = tensor.keys_to_properties("species", fill_value=float("nan"))
    print(merged_nan.block().values)
    # atom 0: [1.0, 2.0, nan, nan]
    # atom 1: [nan, nan, 5.0, 6.0]
    # atom 2: [3.0, 4.0, 7.0, 8.0]





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[ 1.  2. nan nan]
     [nan nan  5.  6.]
     [ 3.  4.  7.  8.]]




.. GENERATED FROM PYTHON SOURCE LINES 88-92

Build a missing-data mask
-------------------------

With NaN fill, ``np.isnan`` identifies exactly which entries were missing:

.. GENERATED FROM PYTHON SOURCE LINES 93-102

.. code-block:: Python


    values = merged_nan.block().values
    missing = np.isnan(values)
    print("Missing-data mask:")
    print(missing)
    # [[False False  True  True]
    #  [ True  True False False]
    #  [False False False False]]





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Missing-data mask:
    [[False False  True  True]
     [ True  True False False]
     [False False False False]]




.. GENERATED FROM PYTHON SOURCE LINES 103-110

This mask is useful for downstream code that needs to handle missing data
explicitly, for example by masking losses during training.

Note that ``fill_value`` also applies to gradient blocks: if blocks have
gradients, the gradient arrays for missing entries are filled with the
same value (e.g. NaN). This keeps missing-data semantics consistent
across both values and gradients.

.. GENERATED FROM PYTHON SOURCE LINES 113-116

The same ``fill_value`` parameter is available on
:py:meth:`TensorMap.keys_to_samples`. In this example, both blocks share the
same properties and every (atom, species) sample is unique, so no entries are
missing and no NaN appears:

.. GENERATED FROM PYTHON SOURCE LINES 117-124

.. code-block:: Python


    merged_samples = tensor.keys_to_samples("species", fill_value=float("nan"))
    print(merged_samples.block().values)
    # [[1. 2.]
    #  [5. 6.]
    #  [3. 4.]
    #  [7. 8.]]




.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    [[1. 2.]
     [5. 6.]
     [3. 4.]
     [7. 8.]]





.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.004 seconds)


.. _sphx_glr_download_examples_core_5-fill-value.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: 5-fill-value.ipynb <5-fill-value.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: 5-fill-value.py <5-fill-value.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: 5-fill-value.zip <5-fill-value.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
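
The tutorial mentions masking losses during training as one use of the
missing-data mask. The following is a minimal NumPy-only sketch of that idea,
not part of metatensor: ``masked_mse`` and the ``predictions`` array are
hypothetical names introduced for illustration, and ``targets`` simply copies
the NaN-filled values printed in the tutorial.

.. code-block:: Python

    import numpy as np

    # Targets as they come out of the NaN-filled merge in the tutorial
    targets = np.array(
        [
            [1.0, 2.0, np.nan, np.nan],
            [np.nan, np.nan, 5.0, 6.0],
            [3.0, 4.0, 7.0, 8.0],
        ]
    )

    # Hypothetical model output with the same shape (assumption for this sketch)
    predictions = np.ones_like(targets)


    def masked_mse(predictions, targets):
        """Mean squared error over the entries where ``targets`` is not NaN."""
        valid = ~np.isnan(targets)
        if not valid.any():
            # No valid entries at all: report zero loss rather than NaN
            return 0.0
        errors = predictions[valid] - targets[valid]
        return float(np.mean(errors**2))


    loss = masked_mse(predictions, targets)
    print(loss)

An unmasked ``np.mean((predictions - targets) ** 2)`` would return NaN here,
because any arithmetic involving a NaN entry yields NaN. Restricting the
average to the valid entries keeps the loss finite.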