Data arrays

struct mts_array_t

mts_array_t manages n-dimensional arrays used as data in a block or tensor map. The array itself is opaque to this library and can come from multiple sources: a Rust program, a C/C++ program, a Fortran program, or Python with numpy or torch. The data does not have to live on the CPU, or even on the same machine where this code is executed.

This struct contains a C-compatible manual implementation of a virtual table (vtable, i.e. a trait in Rust or a pure virtual class in C++), allowing manipulation of the array in an opaque way.

WARNING: all function implementations MUST be thread-safe, since they can be called from multiple threads at the same time. The mts_array_t itself might be moved from one thread to another.

Public Members

void *ptr

User-provided data should be stored here; it will be passed as the first parameter to all function pointers below.

void (*destroy)(void *array)

Remove this array and free the associated memory. This function can be set to NULL if there is no memory management to do.
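As an illustration of how ptr and destroy fit together, here is a minimal sketch assuming the user data is heap-allocated. demo_array_t is a hypothetical stand-in reduced to these two members (the real struct has more function pointers), and demo_data_t, demo_array_new and demo_array_release are names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for mts_array_t: only the two members
 * discussed above. The real struct has more function pointers. */
typedef struct {
    void *ptr;
    void (*destroy)(void *array);
} demo_array_t;

/* Hypothetical user data stored behind `ptr`. */
typedef struct {
    double *values;
    size_t count;
} demo_data_t;

/* Receives the `ptr` member as its only parameter and frees it. */
static void demo_destroy(void *array) {
    demo_data_t *data = (demo_data_t *)array;
    if (data != NULL) {
        free(data->values);
        free(data);
    }
}

/* Build an array owning `count` zero-initialized doubles.
 * Returns {NULL, NULL} if an allocation fails. */
static demo_array_t demo_array_new(size_t count) {
    demo_array_t array = {NULL, NULL};
    demo_data_t *data = (demo_data_t *)malloc(sizeof(*data));
    if (data == NULL) {
        return array;
    }
    /* Allocate at least one element so a NULL result always means failure. */
    data->values = (double *)calloc(count > 0 ? count : 1, sizeof(double));
    if (data->values == NULL) {
        free(data);
        return array;
    }
    data->count = count;
    array.ptr = data;
    array.destroy = demo_destroy;
    return array;
}

/* What a consumer does when it is done with an array.
 * `destroy` may be NULL when there is no memory management to do. */
static void demo_array_release(demo_array_t *array) {
    if (array->destroy != NULL) {
        array->destroy(array->ptr);
    }
    array->ptr = NULL;
    array->destroy = NULL;
}
```

Checking destroy against NULL before calling it covers arrays that borrow their data, such as a view over a static buffer.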

mts_status_t (*origin)(const void *array, mts_data_origin_t *origin)

This function needs to store the “data origin” for this array in origin. Users of mts_array_t should register a single data origin with mts_register_data_origin, and use it for all compatible arrays.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
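The error-reporting convention described above could look roughly like the following sketch. It does not use the library header: demo_status_t, DEMO_SUCCESS, DEMO_CALLBACK_ERROR, demo_data_origin_t and demo_set_last_error are local stand-ins (assumptions) for mts_status_t, MTS_SUCCESS, MTS_CALLBACK_ERROR, mts_data_origin_t and mts_set_last_error. The origin value is assumed to have been obtained once and stored in the user data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the library's types and constants. These are
 * assumptions for this sketch; real code would use the library header. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1
typedef uint64_t demo_data_origin_t;

/* Stand-in for mts_set_last_error: one message per thread. */
static _Thread_local const char *demo_last_error = NULL;

static void demo_set_last_error(const char *message) {
    demo_last_error = message;
}

/* Hypothetical user data: the origin is registered once elsewhere and
 * the same value is shared by all compatible arrays. */
typedef struct {
    demo_data_origin_t origin;
} demo_data_t;

/* Shape of an `origin` callback: write the result through the output
 * pointer, or record an error message and return the error status. */
static demo_status_t demo_origin(const void *array, demo_data_origin_t *origin) {
    const demo_data_t *data = (const demo_data_t *)array;
    if (data == NULL || origin == NULL) {
        demo_set_last_error("demo_origin: got a NULL pointer");
        return DEMO_CALLBACK_ERROR;
    }
    *origin = data->origin;
    return DEMO_SUCCESS;
}
```

The callback only reads immutable data, so it can be called from several threads without extra locking.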

mts_status_t (*device)(const void *array, DLDevice *device)

Query the device where this array’s data resides without exporting via DLPack.

The implementation must store the device information in *device.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.

mts_status_t (*dtype)(const void *array, DLDataType *dtype)

Query the data type of this array without a full DLPack export.

The implementation must store the data type in *dtype.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
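For an array of 64-bit floats living in CPU memory, the device and dtype callbacks might be sketched as follows. To keep the example self-contained, the DLPack types are reproduced here in reduced form (real code would include dlpack.h instead), and the demo_ status names are local stand-ins for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced local copies of DLPack definitions, so that the sketch compiles
 * on its own. Real code would include dlpack.h. */
typedef enum { kDLCPU = 1, kDLCUDA = 2 } DLDeviceType;
typedef struct { DLDeviceType device_type; int32_t device_id; } DLDevice;
typedef enum { kDLInt = 0, kDLUInt = 1, kDLFloat = 2 } DLDataTypeCode;
typedef struct { uint8_t code; uint8_t bits; uint16_t lanes; } DLDataType;

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

/* `device` callback for data that always lives in CPU memory. */
static demo_status_t demo_device(const void *array, DLDevice *device) {
    (void)array; /* nothing to look up in this sketch */
    if (device == NULL) {
        /* a real implementation would also call mts_set_last_error here */
        return DEMO_CALLBACK_ERROR;
    }
    device->device_type = kDLCPU;
    device->device_id = 0;
    return DEMO_SUCCESS;
}

/* `dtype` callback for an array of 64-bit floating point scalars. */
static demo_status_t demo_dtype(const void *array, DLDataType *dtype) {
    (void)array;
    if (dtype == NULL) {
        /* a real implementation would also call mts_set_last_error here */
        return DEMO_CALLBACK_ERROR;
    }
    dtype->code = (uint8_t)kDLFloat;
    dtype->bits = 64;
    dtype->lanes = 1;
    return DEMO_SUCCESS;
}
```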

mts_status_t (*as_dlpack)(void *array, DLManagedTensorVersioned **dl_managed_tensor, DLDevice device, const int64_t *stream, DLPackVersion max_version)

Get a DLPack representation of the underlying data.

This function exports the array as a DLManagedTensorVersioned struct into *dl_managed_tensor, following the DLPack data interchange standard.

The device parameter specifies the desired DLPack device type. If this differs from the array’s current device, the implementation should attempt to make the data accessible on the requested device (e.g., by copying).

The stream parameter is a pointer to an integer representing a device-specific stream or queue. If this is NULL, the implementation should use the default stream for the specified device. If this is -1, no synchronization should be performed. Some devices have specific stream values:

  • For CUDA devices, 1 represents the legacy default stream, 2 the per-thread default stream. Any value above 2 indicates the stream number. 0 is not allowed as it could mean the same as NULL, 1 or 2.

  • For ROCm devices, 0 represents the default stream, and any value above 2 indicates the stream number. 1 and 2 are not allowed.

See also the documentation of __dlpack__ for more information about streams: https://data-apis.org/array-api/latest/API_specification/generated/array_api.array.__dlpack__.html
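The stream rules above can be summarized as a small classification function. This is only a sketch of the rules as stated in this section: the demo_stream_kind_t enum and both function names are hypothetical, and values not mentioned above (such as negative values other than -1) are treated as invalid by assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of the `stream` argument. */
typedef enum {
    DEMO_STREAM_DEVICE_DEFAULT,          /* stream is NULL, or 0 on ROCm */
    DEMO_STREAM_NO_SYNC,                 /* *stream == -1 */
    DEMO_STREAM_CUDA_LEGACY_DEFAULT,     /* *stream == 1 on CUDA */
    DEMO_STREAM_CUDA_PER_THREAD_DEFAULT, /* *stream == 2 on CUDA */
    DEMO_STREAM_EXPLICIT,                /* *stream > 2: a stream number */
    DEMO_STREAM_INVALID                  /* anything else */
} demo_stream_kind_t;

/* Interpret `stream` for a CUDA device. */
static demo_stream_kind_t demo_classify_cuda_stream(const int64_t *stream) {
    if (stream == NULL) { return DEMO_STREAM_DEVICE_DEFAULT; }
    if (*stream == -1) { return DEMO_STREAM_NO_SYNC; }
    if (*stream == 1) { return DEMO_STREAM_CUDA_LEGACY_DEFAULT; }
    if (*stream == 2) { return DEMO_STREAM_CUDA_PER_THREAD_DEFAULT; }
    if (*stream > 2) { return DEMO_STREAM_EXPLICIT; }
    return DEMO_STREAM_INVALID; /* 0 is ambiguous; below -1 is undefined here */
}

/* Interpret `stream` for a ROCm device. */
static demo_stream_kind_t demo_classify_rocm_stream(const int64_t *stream) {
    if (stream == NULL) { return DEMO_STREAM_DEVICE_DEFAULT; }
    if (*stream == -1) { return DEMO_STREAM_NO_SYNC; }
    if (*stream == 0) { return DEMO_STREAM_DEVICE_DEFAULT; }
    if (*stream > 2) { return DEMO_STREAM_EXPLICIT; }
    return DEMO_STREAM_INVALID; /* 1 and 2 are not allowed on ROCm */
}
```

An as_dlpack implementation could run such a check first and report invalid values through the usual error path before touching any device state.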

max_version specifies the maximum DLPack API version the caller supports. The implementation should try to return a tensor compatible with this version, but this is not guaranteed, and the caller should check the returned tensor’s version.

The returned DLManagedTensorVersioned is owned by the caller, who is responsible for calling its deleter function when the tensor is no longer needed. The lifetime of the DLManagedTensorVersioned must not exceed the lifetime of the mts_array_t it was created from.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
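On the caller side, the two obligations described above (checking the returned version, then calling the deleter) might be handled as in the sketch below. demo_managed_tensor_t is a reduced stand-in for DLManagedTensorVersioned that omits the dl_tensor member, and accepting a tensor only when its major version matches the caller's is an assumed policy for this example, not something this documentation prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-ins mirroring the layout of DLPackVersion and the first
 * members of DLManagedTensorVersioned (the dl_tensor member is omitted).
 * Real code would include dlpack.h. */
typedef struct { uint32_t major; uint32_t minor; } demo_version_t;

typedef struct demo_managed_tensor_t {
    demo_version_t version;
    void *manager_ctx;
    void (*deleter)(struct demo_managed_tensor_t *self);
    uint64_t flags;
} demo_managed_tensor_t;

/* Consume an exported tensor, then release it.
 * Returns 1 if the tensor was usable, 0 otherwise. */
static int demo_consume(demo_managed_tensor_t *tensor, demo_version_t max_version) {
    /* Assumed policy: only a matching major version is usable. */
    int usable = (tensor->version.major == max_version.major);
    if (usable) {
        /* ... read the tensor data here ... */
    }
    /* The caller owns the tensor and must release it in both cases. */
    if (tensor->deleter != NULL) {
        tensor->deleter(tensor);
    }
    return usable;
}

/* Example deleter: flags the release through manager_ctx. */
static void demo_deleter(demo_managed_tensor_t *self) {
    *(int *)self->manager_ctx = 1;
}
```

The deleter has to run while the originating mts_array_t is still alive, which keeps the exported tensor's lifetime within that of the array.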

mts_status_t (*shape)(const void *array, const uintptr_t **shape, uintptr_t *shape_count)

Get the shape of the array managed by this mts_array_t in the *shape pointer, and the number of dimensions (size of the *shape array) in *shape_count. If the array is a single scalar, shape_count should be set to 0, and the shape pointer to NULL.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
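A shape callback following the rules above, including the scalar case, could be sketched like this. demo_data_t with its fixed-capacity shape storage is a hypothetical user data layout, and the demo_ status names are local stand-ins for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

#define DEMO_MAX_DIMS 4

/* Hypothetical user data: the shape is stored inline, so the pointer
 * handed out below stays valid as long as the array itself. */
typedef struct {
    uintptr_t shape[DEMO_MAX_DIMS];
    uintptr_t ndim; /* 0 for a scalar */
} demo_data_t;

static demo_status_t demo_shape(
    const void *array, const uintptr_t **shape, uintptr_t *shape_count
) {
    const demo_data_t *data = (const demo_data_t *)array;
    if (data == NULL || shape == NULL || shape_count == NULL) {
        /* a real implementation would also call mts_set_last_error here */
        return DEMO_CALLBACK_ERROR;
    }
    if (data->ndim == 0) {
        /* scalar: no dimensions, and a NULL shape pointer */
        *shape = NULL;
        *shape_count = 0;
    } else {
        *shape = data->shape;
        *shape_count = data->ndim;
    }
    return DEMO_SUCCESS;
}
```

The callback hands out a pointer into the user data rather than a copy, which is why the shape storage has to outlive the call.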

mts_status_t (*reshape)(void *array, const uintptr_t *shape, uintptr_t shape_count)

Change the shape of the array managed by this mts_array_t to the given shape. shape_count must contain the number of elements in the shape array.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
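A reshape callback might be sketched as follows. The sketch assumes that a reshape must preserve the total number of elements (as in NumPy), which this section does not state explicitly. demo_data_t is a hypothetical layout holding only shape metadata, and the demo_ status names are local stand-ins for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

#define DEMO_MAX_DIMS 4

/* Hypothetical user data; the element buffer is omitted because a
 * row-major contiguous reshape only touches the metadata. */
typedef struct {
    uintptr_t shape[DEMO_MAX_DIMS];
    uintptr_t ndim;
} demo_data_t;

/* Product of all dimensions; 1 for an empty shape (scalar). */
static uintptr_t demo_element_count(const uintptr_t *shape, uintptr_t shape_count) {
    uintptr_t count = 1;
    for (uintptr_t i = 0; i < shape_count; i++) {
        count *= shape[i];
    }
    return count;
}

static demo_status_t demo_reshape(
    void *array, const uintptr_t *shape, uintptr_t shape_count
) {
    demo_data_t *data = (demo_data_t *)array;
    if (data == NULL || shape_count > DEMO_MAX_DIMS) {
        return DEMO_CALLBACK_ERROR;
    }
    /* Assumption: the new shape must describe the same number of elements. */
    if (demo_element_count(shape, shape_count)
        != demo_element_count(data->shape, data->ndim)) {
        /* a real implementation would also call mts_set_last_error here */
        return DEMO_CALLBACK_ERROR;
    }
    for (uintptr_t i = 0; i < shape_count; i++) {
        data->shape[i] = shape[i];
    }
    data->ndim = shape_count;
    return DEMO_SUCCESS;
}
```

On failure the stored shape is left untouched, so the array stays usable after a rejected reshape.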

mts_status_t (*swap_axes)(void *array, uintptr_t axis_1, uintptr_t axis_2)

Swap the axes axis_1 and axis_2 in this array.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
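One possible way to implement swap_axes, sketched below, is to keep the data in place and exchange the shape and stride entries of the two axes, producing a strided view. This is an illustrative choice, since an implementation may equally well rearrange the data. demo_data_t is a hypothetical layout, and the demo_ status names are local stand-ins for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

#define DEMO_MAX_DIMS 4

/* Hypothetical user data describing a strided view (strides in elements). */
typedef struct {
    uintptr_t shape[DEMO_MAX_DIMS];
    uintptr_t strides[DEMO_MAX_DIMS];
    uintptr_t ndim;
} demo_data_t;

static demo_status_t demo_swap_axes(void *array, uintptr_t axis_1, uintptr_t axis_2) {
    demo_data_t *data = (demo_data_t *)array;
    if (data == NULL || axis_1 >= data->ndim || axis_2 >= data->ndim) {
        /* a real implementation would also call mts_set_last_error here */
        return DEMO_CALLBACK_ERROR;
    }
    /* Exchange the metadata of both axes; the element buffer is untouched. */
    uintptr_t tmp = data->shape[axis_1];
    data->shape[axis_1] = data->shape[axis_2];
    data->shape[axis_2] = tmp;

    tmp = data->strides[axis_1];
    data->strides[axis_1] = data->strides[axis_2];
    data->strides[axis_2] = tmp;
    return DEMO_SUCCESS;
}
```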

mts_status_t (*create)(const void *array, const uintptr_t *shape, uintptr_t shape_count, struct mts_array_t fill_value, struct mts_array_t *new_array)

Create a new array with the same options as the current one (data type, data location, etc.) and the requested shape, and store it in new_array. The number of elements in the shape array should be given in shape_count.

The new array should be filled with the scalar value from fill_value, which must be an mts_array_t containing a single scalar (empty shape) with the same dtype as this array. This function should call fill_value.destroy if the function pointer is not NULL when fill_value is no longer needed.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
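The ownership rule for fill_value can be illustrated with the following sketch of a create callback for arrays of doubles. demo_array_t is a hypothetical stand-in reduced to ptr and destroy, demo_data_t is an invented user data layout, and the demo_ status names stand in for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR. Releasing fill_value on the error path as well is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

/* Hypothetical stand-in for mts_array_t, reduced to two members. */
typedef struct {
    void *ptr;
    void (*destroy)(void *array);
} demo_array_t;

/* Hypothetical user data: flat storage of doubles. */
typedef struct {
    double *values;
    uintptr_t count;
} demo_data_t;

static void demo_destroy(void *array) {
    demo_data_t *data = (demo_data_t *)array;
    if (data != NULL) {
        free(data->values);
        free(data);
    }
}

static demo_status_t demo_create(
    const void *array, const uintptr_t *shape, uintptr_t shape_count,
    demo_array_t fill_value, demo_array_t *new_array
) {
    (void)array; /* dtype, device, etc. would be read from here */
    demo_status_t status = DEMO_CALLBACK_ERROR;
    const demo_data_t *fill = (const demo_data_t *)fill_value.ptr;

    uintptr_t count = 1;
    for (uintptr_t i = 0; i < shape_count; i++) {
        count *= shape[i];
    }

    demo_data_t *data = (demo_data_t *)malloc(sizeof(*data));
    if (data != NULL && fill != NULL && new_array != NULL) {
        data->values = (double *)malloc((count > 0 ? count : 1) * sizeof(double));
        if (data->values != NULL) {
            for (uintptr_t i = 0; i < count; i++) {
                data->values[i] = fill->values[0]; /* the single scalar */
            }
            data->count = count;
            new_array->ptr = data;
            new_array->destroy = demo_destroy;
            data = NULL; /* ownership moved to new_array */
            status = DEMO_SUCCESS;
        }
    }
    free(data); /* no-op on success */

    /* fill_value is no longer needed: release it if it has a destructor. */
    if (fill_value.destroy != NULL) {
        fill_value.destroy(fill_value.ptr);
    }
    return status;
}
```

The scalar is read before fill_value.destroy runs, since the fill value's storage is gone afterwards.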

mts_status_t (*copy)(const void *array, struct mts_array_t *new_array)

Make a copy of this array and return the new array in new_array.

The new array is expected to have the same data origin and parameters (data type, data location, etc.).

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.

mts_status_t (*move_data)(void *output, const void *input, const struct mts_data_movement_t *movements, uintptr_t movements_count)

Set entries in the output array (the current array) taking data from the input array. The output array is guaranteed to be created by calling mts_array_t::create with one of the arrays in the same block or tensor map as the input.

The movements array of size movements_count indicates where the data should be moved from input to output.

This function should copy data from input[movements[i].sample_in, ..., movements[i].properties_start_in + x] to output[movements[i].sample_out, ..., movements[i].properties_start_out + x] for i from 0 up to (but not including) movements_count and x from 0 up to (but not including) movements[i].properties_length. All indexes are 0-based.

This function should return MTS_SUCCESS on success, or MTS_CALLBACK_ERROR on failure. In case of failure, the implementation should call mts_set_last_error with an appropriate error message before returning.
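For row-major contiguous arrays of doubles with a single intermediate (component) dimension, the copy rule above could be implemented as in this sketch. demo_data_movement_t is a local stand-in whose fields are named after the ones used in the description, demo_data_t is a hypothetical user data layout, and the demo_ status names stand in for mts_status_t, MTS_SUCCESS and MTS_CALLBACK_ERROR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins (assumptions) for the library's status type and codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_CALLBACK_ERROR 1

/* Stand-in for mts_data_movement_t, using the field names from the text. */
typedef struct {
    uintptr_t sample_in;
    uintptr_t sample_out;
    uintptr_t properties_start_in;
    uintptr_t properties_start_out;
    uintptr_t properties_length;
} demo_data_movement_t;

/* Hypothetical user data: (samples, components, properties), row-major. */
typedef struct {
    double *values;
    uintptr_t n_samples;
    uintptr_t n_components;
    uintptr_t n_properties;
} demo_data_t;

static demo_status_t demo_move_data(
    void *output, const void *input,
    const demo_data_movement_t *movements, uintptr_t movements_count
) {
    demo_data_t *out = (demo_data_t *)output;
    const demo_data_t *in = (const demo_data_t *)input;
    if (out == NULL || in == NULL || out->n_components != in->n_components) {
        return DEMO_CALLBACK_ERROR;
    }
    for (uintptr_t i = 0; i < movements_count; i++) {
        const demo_data_movement_t *m = &movements[i];
        /* Reject movements reaching outside either array. */
        if (m->sample_in >= in->n_samples || m->sample_out >= out->n_samples
            || m->properties_start_in + m->properties_length > in->n_properties
            || m->properties_start_out + m->properties_length > out->n_properties) {
            /* a real implementation would also call mts_set_last_error here */
            return DEMO_CALLBACK_ERROR;
        }
        /* The "..." in the description covers every intermediate index. */
        for (uintptr_t c = 0; c < in->n_components; c++) {
            const double *src = in->values
                + (m->sample_in * in->n_components + c) * in->n_properties
                + m->properties_start_in;
            double *dst = out->values
                + (m->sample_out * out->n_components + c) * out->n_properties
                + m->properties_start_out;
            memcpy(dst, src, m->properties_length * sizeof(double));
        }
    }
    return DEMO_SUCCESS;
}
```

Copying a contiguous run of properties with memcpy works here because properties are the last, fastest-varying dimension in this layout.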


mts_status_t mts_register_data_origin(const char *name, mts_data_origin_t *origin)

Register a new data origin with the given name. Calling this function multiple times with the same name will give the same mts_data_origin_t.

Parameters:
  • name – name of the data origin as a UTF-8 encoded NULL-terminated string

  • origin – pointer to an mts_data_origin_t where the origin will be stored

Returns:

The status code of this operation. If the status is not MTS_SUCCESS, you can use mts_last_error() to get the full error message.

mts_status_t mts_get_data_origin(mts_data_origin_t origin, char *buffer, uintptr_t buffer_size)

Get the name used to register a given data origin in the given buffer.

Parameters:
  • origin – pre-registered data origin

  • buffer – buffer to be filled with the data origin name. The origin name will be written as a UTF-8 encoded, NULL-terminated string

  • buffer_size – size of the buffer

Returns:

The status code of this operation. If the status is not MTS_SUCCESS, you can use mts_last_error() to get the full error message.
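To make the semantics of these two functions concrete, here is a toy model of a name registry. It is not the library's implementation: every demo_ name is hypothetical, the representation of origins as integer indexes is an assumption, failing when the buffer is too small is an assumed behavior, and unlike the real functions this toy is not thread-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; not the library's types or status codes. */
typedef int32_t demo_status_t;
#define DEMO_SUCCESS 0
#define DEMO_ERROR 1
typedef uint64_t demo_data_origin_t;

#define DEMO_MAX_ORIGINS 8
#define DEMO_MAX_NAME 32

/* Toy registry: an origin is the index of its name in this table. */
static char demo_names[DEMO_MAX_ORIGINS][DEMO_MAX_NAME];
static uint64_t demo_names_count = 0;

/* Registering the same name twice yields the same origin. */
static demo_status_t demo_register_data_origin(
    const char *name, demo_data_origin_t *origin
) {
    if (name == NULL || origin == NULL || strlen(name) >= DEMO_MAX_NAME) {
        return DEMO_ERROR;
    }
    for (uint64_t i = 0; i < demo_names_count; i++) {
        if (strcmp(demo_names[i], name) == 0) {
            *origin = i;
            return DEMO_SUCCESS;
        }
    }
    if (demo_names_count == DEMO_MAX_ORIGINS) {
        return DEMO_ERROR;
    }
    strcpy(demo_names[demo_names_count], name);
    *origin = demo_names_count;
    demo_names_count += 1;
    return DEMO_SUCCESS;
}

/* Write the NULL-terminated name of `origin` into `buffer`. */
static demo_status_t demo_get_data_origin(
    demo_data_origin_t origin, char *buffer, uintptr_t buffer_size
) {
    if (buffer == NULL || origin >= demo_names_count) {
        return DEMO_ERROR;
    }
    size_t needed = strlen(demo_names[origin]) + 1; /* with terminator */
    if (buffer_size < needed) {
        return DEMO_ERROR; /* assumed behavior when the buffer is too small */
    }
    memcpy(buffer, demo_names[origin], needed);
    return DEMO_SUCCESS;
}
```

With this model, comparing two origins for equality is enough to decide whether two arrays come from the same kind of source, without comparing names.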