# Conventions

The oneDNN specification relies on a set of standard naming conventions for variables. This section describes these conventions.

## Variable (Tensor) Names

Neural network models consist of operations of the following form:

$$\dst = f(\src, \weights),$$

where $$\dst$$ and $$\src$$ are activation tensors, and $$\weights$$ are learnable tensors.

The backward propagation therefore consists of computing the gradients with respect to $$\src$$ and $$\weights$$ respectively:

$$\diffsrc = \mathrm{d} f_{\src}(\diffdst, \src, \weights, \dst),$$

and

$$\diffweights = \mathrm{d} f_{\weights}(\diffdst, \src, \weights, \dst).$$

While oneDNN uses src, dst, and weights as generic names for the activation and learnable tensors, a specific operation may have commonly used, widely known names for some of these tensors. For instance, the convolution operation has a learnable tensor called bias. For usability reasons, oneDNN primitives use such names in initialization and other functions.

oneDNN uses the following commonly used notations for tensors:

| Name | Meaning |
|------|---------|
| src | Source tensor |
| dst | Destination tensor |
| weights | Weights tensor |
| bias | Bias tensor (used in convolution, inner product, and other primitives) |
| scale_shift | Scale and shift tensors (used in batch normalization and layer normalization primitives) |
| workspace | Workspace tensor that carries additional information from the forward propagation to the backward propagation |
| scratchpad | Temporary tensor that is required to store the intermediate results |
| diff_src | Gradient tensor with respect to the source |
| diff_dst | Gradient tensor with respect to the destination |
| diff_weights | Gradient tensor with respect to the weights |
| diff_bias | Gradient tensor with respect to the bias |
| diff_scale_shift | Gradient tensor with respect to the scale and shift |
| *_layer | RNN layer data or weights tensors |
| *_iter | RNN recurrent data or weights tensors |

## RNN-Specific Notation

The following notations are used when describing RNN primitives.

| Name | Semantics |
|------|-----------|
| $$\cdot$$ | matrix multiply operator |
| $$*$$ | elementwise multiplication operator |
| W | input weights |
| U | recurrent weights |
| $$\Box^T$$ | transposition |
| B | bias |
| h | hidden state |
| a | intermediate value |
| x | input |
| $$\Box_t$$ | timestamp index |
| $$\Box_l$$ | layer index |
| activation | tanh, relu, logistic |
| c | cell state |
| $$\tilde{c}$$ | candidate state |
| i | input gate |
| f | forget gate |
| o | output gate |
| u | update gate |
| r | reset gate |
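As an illustration of how this notation combines, the textbook LSTM cell can be written with the symbols above; the gate subscripts on $$W$$, $$U$$, and $$B$$ are illustrative groupings of the weights per gate, not oneDNN identifiers:

$$
\begin{aligned}
i_t &= \mathrm{logistic}(W_i \cdot x_t + U_i \cdot h_{t-1} + B_i), \\
f_t &= \mathrm{logistic}(W_f \cdot x_t + U_f \cdot h_{t-1} + B_f), \\
o_t &= \mathrm{logistic}(W_o \cdot x_t + U_o \cdot h_{t-1} + B_o), \\
\tilde{c}_t &= \tanh(W_{\tilde{c}} \cdot x_t + U_{\tilde{c}} \cdot h_{t-1} + B_{\tilde{c}}), \\
c_t &= f_t * c_{t-1} + i_t * \tilde{c}_t, \\
h_t &= o_t * \tanh(c_t).
\end{aligned}
$$

Here $$\cdot$$ denotes the matrix multiplies of the input $$x_t$$ with the input weights $$W$$ and of the previous hidden state $$h_{t-1}$$ with the recurrent weights $$U$$, while $$*$$ denotes the elementwise products that combine the gates with the cell state.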