Memory

A memory object is a container that describes and stores data. Memory objects can contain data of various data types and formats. There are two levels of abstraction:

  1. Memory descriptor – engine-agnostic logical description of data (number of dimensions, dimension sizes, and data type), and, optionally, the information about the physical format of data in memory. If this information is not known yet, a memory descriptor can be created with dnnl::memory::format_tag::any. This allows compute-intensive primitives to choose the most appropriate format for the computations. The user is then responsible for reordering their data into the new format if the formats do not match.

    A memory descriptor can be initialized by specifying the dimensions and either a memory format tag or strides for each dimension.

    Users can query the amount of memory required by a memory descriptor using the dnnl::memory::desc::get_size() function. In general, the size of the data cannot be computed as the product of the dimensions multiplied by the size of the data type, for example because the physical format may include a padding area. Users should therefore always use this function, which also keeps the code portable.

    Two memory descriptors can be compared using the equality and inequality operators. The comparison is especially useful when checking whether it is necessary to reorder data from the user’s data format to a primitive’s format.

  2. Memory object – an engine-specific object that handles the data and its description (a memory descriptor). With USM, the data handle is simply a pointer to void. The data handle can be queried using dnnl::memory::get_data_handle() and set using dnnl::memory::set_data_handle(). The underlying SYCL buffer, when used, can be queried using dnnl::memory::get_sycl_buffer() and set using dnnl::memory::set_sycl_buffer(). A memory object can also be queried for the underlying memory descriptor and for its engine using dnnl::memory::get_desc() and dnnl::memory::get_engine().
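To illustrate why the size of a buffer cannot be taken as the plain product of the dimensions, consider a physical format that pads one dimension up to a multiple of a block size. The sketch below is a standalone model, not library code: the helper padded_size_bytes, the block size and the choice of the padded dimension are assumptions made for illustration only. In real code the value should come from dnnl::memory::desc::get_size().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the library): number of bytes needed when
// dimension `blocked_dim` is padded up to the next multiple of `block`.
std::size_t padded_size_bytes(const std::vector<std::int64_t> &dims,
        std::size_t blocked_dim, std::int64_t block, std::size_t elem_size) {
    assert(blocked_dim < dims.size() && block > 0);
    std::size_t bytes = elem_size;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        std::int64_t d = dims[i];
        // Round the blocked dimension up to a multiple of the block size.
        if (i == blocked_dim) d = (d + block - 1) / block * block;
        bytes *= static_cast<std::size_t>(d);
    }
    return bytes;
}
```

Under these assumptions, a 1x3x4x4 tensor of 4-byte elements whose second dimension is padded to a multiple of 8 needs 512 bytes, while the naive product of dimensions gives 192 bytes.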

Along with ordinary memory descriptors with all dimensions being positive, the library supports zero-volume memory descriptors with one or more dimensions set to zero. This is used to support the NumPy* convention. If a zero-volume memory is passed to a primitive, the primitive typically does not perform any computations with this memory. For example:

  • A concatenation primitive would ignore all memory objects with zeros in the concat dimension / axis.

  • A forward convolution with a source memory object with zero in the minibatch dimension would always produce a destination memory object with a zero in the minibatch dimension and perform no computations.

  • However, a forward convolution with a zero in one of the weights dimensions is ill-defined and is considered to be an error by the library because there is no clear definition of what the output values should be.

The data handle of a zero-volume memory object is never accessed.
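The zero-volume rules above can be modeled without the library. The following is a minimal sketch; the names dims_t, is_zero_volume and concat_axis_size are hypothetical and only mirror the behavior described in the text: a tensor is zero-volume if any dimension is zero, and such inputs contribute nothing along the concat axis.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using dims_t = std::vector<std::int64_t>; // stand-in for dnnl::memory::dims

// True if at least one dimension is zero, i.e. the tensor holds no elements.
bool is_zero_volume(const dims_t &dims) {
    for (std::int64_t d : dims)
        if (d == 0) return true;
    return false;
}

// Size of a concatenation result along `axis`. Inputs that are zero along
// the axis add nothing, which matches "ignoring" them.
std::int64_t concat_axis_size(
        const std::vector<dims_t> &inputs, std::size_t axis) {
    std::int64_t total = 0;
    for (const dims_t &d : inputs) {
        assert(axis < d.size());
        total += d[axis];
    }
    return total;
}
```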

struct dnnl::memory

Memory object.

A memory object encapsulates a handle to a memory buffer allocated on a specific engine, tensor dimensions, data type, and memory format, which is the way tensor indices map to offsets in linear memory space. Memory objects are passed to primitives during execution.

Public Types

enum data_type

Data type specification.

Values:

enumerator undef

Undefined data type (used for empty memory descriptors).

enumerator f16

16-bit/half-precision floating point.

enumerator bf16

Non-standard 16-bit floating point (bfloat16) with a 7-bit mantissa.

enumerator f32

32-bit/single-precision floating point.

enumerator s32

32-bit signed integer.

enumerator s8

8-bit signed integer.

enumerator u8

8-bit unsigned integer.
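For readers unfamiliar with bf16: the format has the same sign and exponent bits as f32 and keeps only the top 7 mantissa bits, so a bf16 value corresponds to the upper 16 bits of an f32 value. The sketch below shows one possible conversion. It is an illustration, not the library's implementation: the function names are hypothetical, and truncation is used for brevity where real implementations may round to nearest.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Convert f32 to bf16 by keeping the upper 16 bits (truncation).
std::uint16_t f32_to_bf16_truncate(float x) {
    // Truncating a NaN can drop its payload and yield infinity, so this
    // simple sketch does not accept NaN inputs.
    assert(!std::isnan(x));
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof(bits));
    return static_cast<std::uint16_t>(bits >> 16);
}

// Convert bf16 back to f32 by placing the 16 bits in the upper half.
float bf16_to_f32(std::uint16_t h) {
    std::uint32_t bits = static_cast<std::uint32_t>(h) << 16;
    float x;
    std::memcpy(&x, &bits, sizeof(x));
    return x;
}
```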

enum format_tag

Memory format tag specification.

Memory format tags can be further divided into two categories:

  • Domain-agnostic names, i.e. names that do not depend on the tensor usage in the specific primitive. These names use letters from a to l to denote logical dimensions and form the order in which the dimensions are laid in memory. For example, dnnl::memory::format_tag::ab is used to denote a 2D tensor where the second logical dimension (denoted as b) is the innermost, i.e. has stride = 1, and the first logical dimension (a) is laid out in memory with stride equal to the size of the second dimension. On the other hand, dnnl::memory::format_tag::ba is the transposed version of the same tensor: the outermost dimension (a) becomes the innermost one.

  • Domain-specific names, i.e. names that make sense only in the context of a certain domain, such as CNN. These names are aliases to the corresponding domain-agnostic tags and are used mostly for convenience. For example, dnnl::memory::format_tag::nc is used to denote 2D CNN activations tensor memory format, where the channels dimension is the innermost one and the batch dimension is the outermost one. Moreover, dnnl::memory::format_tag::nc is an alias for dnnl::memory::format_tag::ab, because for CNN primitives the logical dimensions of activations tensors come in order: batch, channels, spatial. In other words, batch corresponds to the first logical dimension (a), and channels correspond to the second one (b).

The following domain-specific notation applies to memory format tags:

  • 'n' denotes the mini-batch dimension

  • 'c' denotes a channels dimension

  • When there are multiple channel dimensions (for example, in convolution weights tensor), 'i' and 'o' denote dimensions of input and output channels

  • 'g' denotes a groups dimension for convolution weights

  • 'd', 'h', and 'w' denote spatial depth, height, and width respectively
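The relation between a domain-agnostic tag and the resulting strides can be sketched as follows. This is a standalone illustration for plain and permuted tags only; strides_from_tag is a hypothetical helper, and the tag is passed as a string such as "ab" or "acdb" rather than as a dnnl::memory::format_tag value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Compute dense strides (in elements) for a tag that lists the logical
// dimensions from the outermost to the innermost one.
std::vector<std::int64_t> strides_from_tag(
        const std::vector<std::int64_t> &dims, const std::string &tag) {
    assert(tag.size() == dims.size());
    std::vector<std::int64_t> strides(dims.size(), 0);
    std::int64_t stride = 1;
    // Walk the tag from the innermost (last) letter to the outermost (first).
    for (std::size_t k = tag.size(); k-- > 0;) {
        std::size_t d = static_cast<std::size_t>(tag[k] - 'a'); // logical index
        assert(d < dims.size());
        strides[d] = stride;
        stride *= dims[d];
    }
    return strides;
}
```

For a 2x3 tensor, "ab" gives strides {3, 1} and "ba" gives {1, 2}. For logical dimensions (n, c, h, w) = (1, 3, 4, 5), the tag "acdb" (the tag that nhwc aliases) gives strides {60, 1, 15, 3}, so the channels dimension is the innermost one.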

Values:

enumerator undef

Undefined memory format tag.

enumerator any

Placeholder memory format tag. Used to instruct the primitive to select a format automatically.

enumerator a

plain 1D tensor

enumerator ab

plain 2D tensor

enumerator ba

permuted 2D tensor

enumerator abc

plain 3D tensor

enumerator acb

permuted 3D tensor

enumerator bac

permuted 3D tensor

enumerator bca

permuted 3D tensor

enumerator cba

permuted 3D tensor

enumerator abcd

plain 4D tensor

enumerator abdc

permuted 4D tensor

enumerator acdb

permuted 4D tensor

enumerator bacd

permuted 4D tensor

enumerator bcda

permuted 4D tensor

enumerator cdba

permuted 4D tensor

enumerator dcab

permuted 4D tensor

enumerator abcde

plain 5D tensor

enumerator abdec

permuted 5D tensor

enumerator acbde

permuted 5D tensor

enumerator acdeb

permuted 5D tensor

enumerator bcdea

permuted 5D tensor

enumerator cdeba

permuted 5D tensor

enumerator decab

permuted 5D tensor

enumerator abcdef

plain 6D tensor

enumerator acbdef

permuted 6D tensor

enumerator defcab

permuted 6D tensor

enumerator x = a

1D tensor; an alias for dnnl::memory::format_tag::a

enumerator nc = ab

2D CNN activations tensor; an alias for dnnl::memory::format_tag::ab

enumerator cn = ba

2D CNN activations tensor; an alias for dnnl::memory::format_tag::ba

enumerator tn = ab

2D RNN statistics tensor; an alias for dnnl::memory::format_tag::ab

enumerator nt = ba

2D RNN statistics tensor; an alias for dnnl::memory::format_tag::ba

enumerator ncw = abc

3D CNN activations tensor; an alias for dnnl::memory::format_tag::abc

enumerator nwc = acb

3D CNN activations tensor; an alias for dnnl::memory::format_tag::acb

enumerator nchw = abcd

4D CNN activations tensor; an alias for dnnl::memory::format_tag::abcd

enumerator nhwc = acdb

4D CNN activations tensor; an alias for dnnl::memory::format_tag::acdb

enumerator chwn = bcda

4D CNN activations tensor; an alias for dnnl::memory::format_tag::bcda

enumerator ncdhw = abcde

5D CNN activations tensor; an alias for dnnl::memory::format_tag::abcde

enumerator ndhwc = acdeb

5D CNN activations tensor; an alias for dnnl::memory::format_tag::acdeb

enumerator oi = ab

2D CNN weights tensor; an alias for dnnl::memory::format_tag::ab

enumerator io = ba

2D CNN weights tensor; an alias for dnnl::memory::format_tag::ba

enumerator oiw = abc

3D CNN weights tensor; an alias for dnnl::memory::format_tag::abc

enumerator owi = acb

3D CNN weights tensor; an alias for dnnl::memory::format_tag::acb

enumerator wio = cba

3D CNN weights tensor; an alias for dnnl::memory::format_tag::cba

enumerator iwo = bca

3D CNN weights tensor; an alias for dnnl::memory::format_tag::bca

enumerator oihw = abcd

4D CNN weights tensor; an alias for dnnl::memory::format_tag::abcd

enumerator hwio = cdba

4D CNN weights tensor; an alias for dnnl::memory::format_tag::cdba

enumerator ohwi = acdb

4D CNN weights tensor; an alias for dnnl::memory::format_tag::acdb

enumerator ihwo = bcda

4D CNN weights tensor; an alias for dnnl::memory::format_tag::bcda

enumerator iohw = bacd

4D CNN weights tensor; an alias for dnnl::memory::format_tag::bacd

enumerator oidhw = abcde

5D CNN weights tensor; an alias for dnnl::memory::format_tag::abcde

enumerator dhwio = cdeba

5D CNN weights tensor; an alias for dnnl::memory::format_tag::cdeba

enumerator odhwi = acdeb

5D CNN weights tensor; an alias for dnnl::memory::format_tag::acdeb

enumerator idhwo = bcdea

5D CNN weights tensor; an alias for dnnl::memory::format_tag::bcdea

enumerator goiw = abcd

4D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::abcd

enumerator wigo = dcab

4D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::dcab

enumerator goihw = abcde

5D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::abcde

enumerator hwigo = decab

5D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::decab

enumerator giohw = acbde

5D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::acbde

enumerator goidhw = abcdef

6D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::abcdef

enumerator giodhw = acbdef

6D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::acbdef

enumerator dhwigo = defcab

6D CNN weights tensor with groups; an alias for dnnl::memory::format_tag::defcab

enumerator tnc = abc

3D RNN data tensor in the format (seq_length, batch, input channels).

enumerator ntc = bac

3D RNN data tensor in the format (batch, seq_length, input channels).

enumerator ldnc = abcd

4D RNN states tensor in the format (num_layers, num_directions, batch, state channels).

enumerator ldigo = abcde

5D RNN weights tensor in the format (num_layers, num_directions, input_channels, num_gates, output_channels).

  • For LSTM cells, the gates order is input, forget, candidate and output gate.

  • For GRU cells, the gates order is update, reset and output gate.

enumerator ldgoi = abdec

5D RNN weights tensor in the format (num_layers, num_directions, num_gates, output_channels, input_channels).

  • For LSTM cells, the gates order is input, forget, candidate and output gate.

  • For GRU cells, the gates order is update, reset and output gate.

enumerator ldio = abcd

4D LSTM projection tensor in the format (num_layers, num_directions, num_channels_in_hidden_state, num_channels_in_recurrent_projection).

enumerator ldoi = abdc

4D LSTM projection tensor in the format (num_layers, num_directions, num_channels_in_recurrent_projection, num_channels_in_hidden_state).

enumerator ldgo = abcd

4D RNN bias tensor in the format (num_layers, num_directions, num_gates, output_channels).

  • For LSTM cells, the gates order is input, forget, candidate and output gate.

  • For GRU cells, the gates order is update, reset and output gate.

using dim = int64_t

Integer type for representing dimension sizes and indices.

using dims = std::vector<dim>

Vector of dimensions. Implementations are free to impose a limit on the vector’s length.

Public Functions

memory()

Default constructor.

Constructs an empty memory object, which can be used to indicate absence of a parameter.

memory(const desc &md, const engine &engine, void *handle)

Constructs a memory object.

Unless handle is equal to DNNL_MEMORY_NONE, the constructed memory object will have the underlying buffer set. In this case, the buffer will be initialized as if dnnl::memory::set_data_handle() had been called.

See

memory::set_data_handle()

Parameters
  • md: Memory descriptor.

  • engine: Engine to store the data on.

  • handle: Handle of the memory buffer to use as the underlying storage. It can be one of the following:

    • A pointer to the user-allocated buffer. In this case the library doesn’t own the buffer.

    • The DNNL_MEMORY_ALLOCATE special value. Instructs the library to allocate the buffer for the memory object. In this case the library owns the buffer.

    • The DNNL_MEMORY_NONE special value. Instructs the library to create the memory object without an underlying buffer.

template<typename T, int ndims = 1>
memory(const desc &md, const engine &engine, cl::sycl::buffer<T, ndims> &buf)

Constructs a memory object from a SYCL buffer.

Parameters
  • md: Memory descriptor.

  • engine: Engine to store the data on.

  • buf: A SYCL buffer.

memory(const desc &md, const engine &engine)

Constructs a memory object.

The underlying storage for the memory will be allocated by the library.

Parameters
  • md: Memory descriptor.

  • engine: Engine to store the data on.

desc get_desc() const

Returns the associated memory descriptor.

engine get_engine() const

Returns the associated engine.

void *get_data_handle() const

Returns the underlying memory buffer.

On the CPU engine, or when using USM, this is a pointer to the allocated memory.

void set_data_handle(void *handle, const stream &stream) const

Sets the underlying memory buffer.

This function may write zero values to the memory specified by the handle if the memory object has a zero padding area. This may be time-consuming and happens each time this function is called. The operation is always blocking and the stream parameter is a hint.

Note

Even when the memory object is used to hold values that stay constant during the execution of the program (pre-packed weights during inference, for example), the function will still write zeroes to the padding area if it exists. Hence, the handle parameter cannot and does not have a const qualifier.

Parameters
  • handle: Memory buffer to use as the underlying storage. On the CPU engine or when USM is used, the data handle is a pointer to the actual data. For OpenCL it is a cl_mem. It must have at least get_desc().get_size() bytes allocated.

  • stream: Stream to use to execute padding in.
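The zero-padding behavior explains why the handle is not const-qualified. The sketch below models it for a one-dimensional buffer. It is a simplified illustration under stated assumptions, not the library's code: zero_pad_tail is a hypothetical function, and the buffer is assumed to hold `padded` float elements of which only the first `logical` carry data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Zero the padding area [logical, padded) of a user buffer. The pointer must
// be non-const because the buffer is written to, even if the data itself is
// treated as constant afterwards.
void zero_pad_tail(float *handle, std::size_t logical, std::size_t padded) {
    assert(handle != nullptr && logical <= padded);
    for (std::size_t i = logical; i < padded; ++i)
        handle[i] = 0.0f;
}
```

In this model the cost is proportional to the size of the padding area and is paid on every call, which is consistent with the note above about repeated calls being potentially time-consuming.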

void set_data_handle(void *handle) const

Sets data handle.

See documentation for dnnl::memory::set_data_handle(void *, const stream &) const for more information.

Parameters
  • handle: Memory buffer to use as the underlying storage. For the CPU engine, the data handle is a pointer to the actual data. For OpenCL it is a cl_mem. It must have at least get_desc().get_size() bytes allocated.

template<typename T, int ndims = 1>
cl::sycl::buffer<T, ndims> get_sycl_buffer(size_t *offset = nullptr) const

Returns the underlying SYCL buffer object.

Template Parameters
  • T: Type of the requested buffer.

  • ndims: Number of dimensions of the requested buffer.

Parameters
  • offset: Offset within the returned buffer at which the memory object’s data starts. Only meaningful for 1D buffers.

template<typename T, int ndims>
void set_sycl_buffer(cl::sycl::buffer<T, ndims> &buf)

Sets the underlying buffer to the given SYCL buffer.

Template Parameters
  • T: Type of the buffer.

  • ndims: Number of dimensions of the buffer.

Parameters
  • buf: SYCL buffer.

struct desc

A memory descriptor.

Public Functions

desc()

Constructs a zero (empty) memory descriptor. Such a memory descriptor can be used to indicate absence of an argument.

desc(const memory::dims &dims, data_type data_type, format_tag format_tag, bool allow_empty = false)

Constructs a memory descriptor.

Note

The logical order of dimensions corresponds to the abc... format tag, and the physical meaning of the dimensions depends both on the primitive that would operate on this memory and the operation context.

Parameters
  • dims: Tensor dimensions.

  • data_type: Data precision/type.

  • format_tag: Memory format tag.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case a zero memory descriptor will be constructed. This flag is optional and defaults to false.

desc(const memory::dims &dims, data_type data_type, const memory::dims &strides, bool allow_empty = false)

Constructs a memory descriptor by strides.

Note

The logical order of dimensions corresponds to the abc... format tag, and the physical meaning of the dimensions depends both on the primitive that would operate on this memory and the operation context.

Parameters
  • dims: Tensor dimensions.

  • data_type: Data precision/type.

  • strides: Strides for each dimension.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case a zero memory descriptor will be constructed. This flag is optional and defaults to false.

desc submemory_desc(const memory::dims &dims, const memory::dims &offsets, bool allow_empty = false) const

Constructs a memory descriptor for a region inside an area described by this memory descriptor.

Return

A memory descriptor for the region.

Parameters
  • dims: Sizes of the region.

  • offsets: Offsets to the region from the encompassing memory object in each dimension.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case a zero memory descriptor will be returned. This flag is optional and defaults to false.
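For plain (non-blocked) layouts, a region inside a larger tensor can be described by reusing the parent's strides and recording the offset of the region's first element. The following sketch models that idea; plain_region and submemory are hypothetical names, and the model deliberately ignores blocked formats and padding, which the library has to handle as well.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using dims_t = std::vector<std::int64_t>; // stand-in for dnnl::memory::dims

struct plain_region {
    dims_t dims;          // sizes of the region
    dims_t strides;       // strides in elements, inherited from the parent
    std::int64_t offset0; // element offset of the region's first element
};

plain_region submemory(const dims_t &parent_dims, const dims_t &parent_strides,
        const dims_t &dims, const dims_t &offsets) {
    assert(parent_strides.size() == parent_dims.size());
    assert(dims.size() == parent_dims.size());
    assert(offsets.size() == parent_dims.size());
    std::int64_t offset0 = 0;
    for (std::size_t i = 0; i < dims.size(); ++i) {
        // The region must lie entirely inside the parent tensor.
        assert(offsets[i] >= 0 && offsets[i] + dims[i] <= parent_dims[i]);
        offset0 += offsets[i] * parent_strides[i];
    }
    return plain_region {dims, parent_strides, offset0};
}
```

For a 4x6 parent with strides {6, 1}, a 2x3 region at offsets {1, 2} starts at element 1 * 6 + 2 = 8 and keeps the strides {6, 1}.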

desc reshape(const memory::dims &dims, bool allow_empty = false) const

Constructs a memory descriptor by reshaping an existing one. The new memory descriptor inherits the data type.

The operation ensures that the transformation of the physical memory format corresponds to the transformation of the logical dimensions. If such transformation is impossible, the function either throws an exception (default) or returns a zero memory descriptor depending on the allow_empty flag.

The reshape operation can be described as a combination of the following basic operations:

  1. Add a dimension of size 1. This is always possible.

  2. Remove a dimension of size 1. This is possible only if the dimension has no padding (i.e. padded_dims[dim] == dims[dim] && dims[dim] == 1).

  3. Split a dimension into multiple ones. This is possible only if the size of the dimension is exactly equal to the product of the split ones and the dimension does not have padding (i.e. padded_dims[dim] == dims[dim]).

  4. Join multiple consecutive dimensions into a single one. As in the cases above, this requires that the dimensions do not have padding and that the memory format is such that in physical memory these dimensions are dense and have the same order as their logical counterparts. This also assumes that these dimensions are not blocked.

    • Here, dense means: stride for dim[i] == (stride for dim[i + 1]) * dim[i + 1];

    • And same order means: i < j if and only if stride for dim[j] <= stride for dim[i].
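The density condition for joining consecutive dimensions can be checked directly from dimensions and strides. The sketch below is a minimal standalone model for non-blocked layouts; can_join is a hypothetical helper and checks only the density condition quoted above for the dimensions first through last.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// True if logical dimensions first..last (inclusive) are dense with respect
// to each other: stride[i] == stride[i + 1] * dims[i + 1] for each pair.
bool can_join(const std::vector<std::int64_t> &dims,
        const std::vector<std::int64_t> &strides, std::size_t first,
        std::size_t last) {
    assert(dims.size() == strides.size());
    assert(first <= last && last < dims.size());
    for (std::size_t i = first; i < last; ++i)
        if (strides[i] != strides[i + 1] * dims[i + 1]) return false;
    return true;
}
```

For a 2x3x4 tensor with plain strides {12, 4, 1}, all three dimensions can be joined. With permuted strides {12, 1, 3}, the last two dimensions cannot, and a 2x3 tensor whose rows are padded to 4 elements (strides {4, 1}) cannot be flattened either.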

Warning

Some combinations of physical memory layout and/or offsets or dimensions may result in a failure to make a reshape.

Return

A new memory descriptor with new dimensions.

Parameters
  • dims: New dimensions. The product of dimensions must remain constant.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case a zero memory descriptor will be returned. This flag is optional and defaults to false.

desc permute_axes(const std::vector<int> &permutation, bool allow_empty = false) const

Constructs a memory descriptor by permuting axes in an existing one.

The physical memory layout representation is adjusted accordingly to maintain the consistency between the logical and physical parts of the memory descriptor. The new memory descriptor inherits the data type.

The logical axes will be permuted in the following manner:

for (int i = 0; i < ndims(); i++)
    new_desc.dims()[permutation[i]] = dims()[i];

Example:

std::vector<int> permutation = {1, 0}; // swap the first and
                                       // the second axes
dnnl::memory::desc in_md(
        {2, 3}, dnnl::memory::data_type::f32, dnnl::memory::format_tag::ab);
dnnl::memory::desc expect_out_md(
        {3, 2}, dnnl::memory::data_type::f32, dnnl::memory::format_tag::ba);

assert(in_md.permute_axes(permutation) == expect_out_md);
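The effect of the permutation on both the logical dimensions and the physical strides can be reproduced with a small standalone model. This is a sketch for plain layouts only; plain_desc and permute_plain are hypothetical names and do not use the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct plain_desc {
    std::vector<std::int64_t> dims;    // logical dimension sizes
    std::vector<std::int64_t> strides; // strides in elements
};

// Move logical axis i to position permutation[i]. Sizes and strides move
// together, so the data in memory stays where it is.
plain_desc permute_plain(
        const plain_desc &in, const std::vector<int> &permutation) {
    const std::size_t n = in.dims.size();
    assert(in.strides.size() == n && permutation.size() == n);
    plain_desc out {std::vector<std::int64_t>(n), std::vector<std::int64_t>(n)};
    std::vector<bool> used(n, false);
    for (std::size_t i = 0; i < n; ++i) {
        assert(permutation[i] >= 0);
        const std::size_t p = static_cast<std::size_t>(permutation[i]);
        assert(p < n && !used[p]); // each target axis is used exactly once
        used[p] = true;
        out.dims[p] = in.dims[i];
        out.strides[p] = in.strides[i];
    }
    return out;
}
```

Applied to a 2x3 tensor in the ab layout (strides {3, 1}) with the permutation {1, 0}, the model yields dimensions {3, 2} with strides {1, 3}, which is the ba layout of a 3x2 tensor, in line with the example above.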

Return

A new memory descriptor with new dimensions.

Parameters
  • permutation: Axes permutation.

  • allow_empty: A flag signifying whether construction is allowed to fail without throwing an exception. In this case a zero memory descriptor will be returned. This flag is optional and defaults to false.

memory::dims dims() const

Returns dimensions of the memory descriptor.

Potentially expensive due to the data copy involved.

Return

A copy of the dimensions vector.

memory::data_type data_type() const

Returns the data type of the memory descriptor.

Return

The data type.

size_t get_size() const

Returns size of the memory descriptor in bytes.

Return

The number of bytes required to allocate a memory buffer for the memory object described by this memory descriptor including the padding area.

bool is_zero() const

Checks whether the memory descriptor is zero (empty).

Return

true if the memory descriptor describes an empty memory and false otherwise.

bool operator==(const desc &other) const

An equality operator.

Return

Whether this and the other memory descriptors have the same format tag, dimensions, strides, blocking, etc.

Parameters
  • other: Another memory descriptor.

bool operator!=(const desc &other) const

An inequality operator.

Return

Whether this and the other memory descriptors describe different memory.

Parameters
  • other: Another memory descriptor.