Metatensor

Metatensor is a specialized data storage format for all your atomistic machine learning needs, and more. Think of a numpy ndarray or a PyTorch Tensor equipped with extra metadata for systems of atoms and other particles. The core of this library is written in Rust, and we provide APIs for C, C++, and Python.

The main class of metatensor is the metatensor.TensorMap data structure, illustrated below. This class defines a block-sparse data format, where each block is stored using coordinate sparse storage. The schematic below represents a TensorMap made of multiple TensorBlocks, and the overall data format is explained further in the getting started section of this documentation. If you are using metatensor from Python, we additionally provide a collection of mathematical, logical and other utility operations to make working with TensorMaps more convenient.

[Figure: schematic of a TensorMap composed of multiple TensorBlocks]
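To make the block-sparse idea concrete, here is a minimal conceptual sketch using plain Python containers. This is not the actual metatensor API (the real library provides `TensorMap`, `TensorBlock`, and `Labels` classes); the key and label names below are illustrative, chosen to mirror the vocabulary used above.

```python
# Conceptual sketch of a block-sparse TensorMap layout, using plain
# Python containers instead of the real metatensor classes.
# All names here (key entries, "samples", "properties") are illustrative.

# A TensorMap maps sparse keys -- here a pair of (angular_channel,
# center_species) -- to blocks of dense values.
tensor_map = {
    (0, 1): {
        # each row of "values" is described by a samples label
        # (which structure, which atom); each column by a properties
        # label (e.g. a basis function index)
        "samples": [(0, 0), (0, 1)],   # (structure, atom) pairs
        "properties": [0, 1, 2],       # property indices
        "values": [[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]],
    },
    (0, 8): {
        "samples": [(0, 2)],
        "properties": [0, 1, 2],
        "values": [[7.0, 8.0, 9.0]],
    },
}

# Blocks that would contain only zeros are simply absent: a key that
# is not stored means "no data for this combination of labels". This
# is the coordinate-sparse aspect of the format.
block = tensor_map[(0, 1)]
assert len(block["samples"]) == len(block["values"])
assert (0, 5) not in tensor_map
```

The design choice this illustrates: dense storage inside each block keeps per-block operations fast, while sparsity between blocks avoids materializing the many key combinations that carry no data.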

Why metatensor

With the creation of metatensor, we want to achieve three goals:

  1. provide an interchange format for the atomistic machine learning ecosystem, making the different tools in this ecosystem more interoperable with one another;

  2. make it easier and faster to prototype new machine learning representations, models and algorithms applied to atomistic modeling;

  3. run large scale simulations using machine learning interatomic potentials, with fully customizable potentials, directly defined by the researchers running the simulations.

For more information on these goals and how we are trying to fulfill them, please read the corresponding documentation page. Metatensor is still in the alpha phase of software development, so expect some rough corners and sharp edges.

Development team

Metatensor is developed in the COSMO laboratory at EPFL, and made available to everyone under the BSD 3-clause license. We welcome contributions from anyone, and we provide some developer documentation for newcomers.