.. SPDX-FileCopyrightText: 2020-2021 Intel Corporation
..
.. SPDX-License-Identifier: CC-BY-4.0

-----------
Convolution
-----------

**Versioned name**: *Convolution-1*

**Category**: Convolution

**Short description**: `Reference `__

**Detailed description**: `Reference `__

In this description, :math:`r` denotes the spatial rank. For a single spatial axis, :math:`X_I` and :math:`X_K` are the input and kernel sizes, :math:`s` is the stride, :math:`d` is the dilation, and :math:`g` is the number of groups. We describe the convolution for each sample in a batch of :math:`N` inputs; the results are combined into an output batch of size :math:`N`.

The convolution is implemented as if each sample input first has :math:`p_b` zeros inserted before and :math:`p_e` zeros inserted after the data on each spatial axis, giving a padded input size of :math:`p_b+p_e+X_I`. The kernel is stretched by a factor of :math:`d` on each of its spatial dimensions. The last index of the stretched kernel is then :math:`d(X_K-1)`, so its size is :math:`d(X_K-1)+1`.

The padded input and the dilated kernel are then split along the channel axis into :math:`g` equal-sized input and kernel segments; padded input segment :math:`i` and dilated kernel segment :math:`i` are convolved. The convolution is only performed where there is complete spatial overlap between the shifted kernel and the padded input, so there will be :math:`p_b+p_e+X_I-d(X_K-1)` outputs along each spatial axis. The output segments are then concatenated along the output channel axis.

Finally, only the results at positions that are a multiple of :math:`s` along each spatial axis are kept, so the output will have size:

.. math::

   \left\lfloor \frac{p_b+p_e+X_I-d(X_K-1)-1}{s} \right\rfloor + 1

**Attributes**

* *strides*

  * **Description**: *strides* is the factor by which the convolution result is down-sampled along each spatial axis to produce the output.
  * **Range of values**: positive s64 values.
  * **Type**: s64[]
  * **Required**: *yes*

* *pads_begin*

  * **Description**: *pads_begin* is the number of zeros to add to the beginning of each spatial axis.
  * **Range of values**: non-negative s64 values.
  * **Type**: s64[]
  * **Required**: *yes*
  * **Note**: this attribute is ignored when the *auto_pad* attribute is specified.


* *pads_end*

  * **Description**: *pads_end* is the number of zeros to add to the end of each spatial axis.
  * **Range of values**: non-negative s64 values.
  * **Type**: s64[]
  * **Required**: *yes*
  * **Note**: this attribute is ignored when the *auto_pad* attribute is specified.

* *dilations*

  * **Description**: *dilations* denotes the factor by which the kernel is stretched along each spatial axis before convolving.
  * **Range of values**: positive s64 values.
  * **Type**: s64[]
  * **Required**: *yes*

* *auto_pad*

  * **Description**: *auto_pad* specifies how the padding is calculated. Possible values:

    * *none (not specified)*: use explicit padding values.
    * *same_upper (same_lower)*: the input is padded so that the pre-stride output size matches the input size. If the total padding is odd, the extra padding is added at the end (at the beginning).
    * *valid*: no padding (:math:`p_b=p_e=0`).

  * **Type**: string
  * **Default value**: *none*
  * **Required**: *no*
  * **Note**: *pads_begin* and *pads_end* attributes are ignored when *auto_pad* is specified. With *same_upper* and *same_lower* the padding is chosen to make the pre-stride output spatial shape the same as the input shape. When possible, :math:`p_b=p_e`. If the total padding needed is odd, *same_upper* makes :math:`p_e=p_b+1` and *same_lower* makes :math:`p_b=p_e+1`. In either case, setting the pre-stride output size :math:`p_b+p_e+X_I-d(X_K-1)` equal to :math:`X_I` gives

    .. math::

       p_b+p_e=d(X_K-1).

* *groups*

  * **Description**: *groups* denotes the number of groups that the input channels and output channels are divided into. ``in_channels`` and ``out_channels`` must both be divisible by *groups*.
  * **Range of values**: a positive s64 value.
  * **Type**: s64
  * **Default value**: 1
  * **Required**: *no*

* *data_format*

  * **Description**: *data_format* denotes the format of the input and output data.
  * **Range of values**: *NXC* or *NCX* (X means HW for 2D convolution, DHW for 3D convolution)
  * **Type**: string
  * **Default value**: *NXC*
  * **Required**: *no*

* *filter_format*

  * **Description**: *filter_format* denotes the format of the filter.
  * **Range of values**: *XIO* or *OIX* (X means HW for 2D convolution, DHW for 3D convolution)
  * **Type**: string
  * **Default value**: *XIO*
  * **Required**: *no*

**Inputs**:

* **1**: ``input`` - the input tensor. The format is specified by the *data_format* attribute. **Required.**

  * **Type**: T

* **2**: ``filter`` - the convolution filter tensor. The format is specified by the *filter_format* attribute. The shape of the filter is ``(out_channels, in_channels / groups, spatial_shape)`` for the OIX format or ``(spatial_shape, in_channels / groups, out_channels)`` for the XIO format. ``in_channels`` and ``out_channels`` must both be divisible by the *groups* attribute. **Required.**

  * **Type**: T

* **3**: ``bias`` - a 1-D tensor that is added along the channel dimension of the convolution result. Broadcasting is supported. **Optional.**

  * **Type**: T

**Outputs**:

* **1**: ``output`` - the output tensor. The format is specified by the *data_format* attribute.

  * **Type**: T

**Types**:

* **T**: f32, f16, bf16.
* **Note**: Inputs and outputs have the same data type denoted by *T*. For example, if the input is an f32 tensor, then all other tensors have the f32 data type.
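
The output-size formula and the *same_upper* / *same_lower* padding rule from the detailed description can be checked with a small sketch for a single spatial axis. This is an illustration only, not part of the specification: the function names ``conv_output_size``, ``same_padding`` and ``conv1d_reference`` are hypothetical, the code handles one channel and one group, and it assumes the usual deep-learning convention of not flipping the kernel. The total *same* padding is derived by setting the pre-stride output count :math:`p_b+p_e+X_I-d(X_K-1)` equal to :math:`X_I`.

```python
def conv_output_size(x_i, x_k, s=1, d=1, p_b=0, p_e=0):
    """Output length along one spatial axis, following the formula in the text."""
    padded = p_b + p_e + x_i          # padded input size
    dilated_k = d * (x_k - 1) + 1     # size of the stretched kernel
    if padded < dilated_k:
        raise ValueError("dilated kernel does not fit in the padded input")
    return (padded - dilated_k) // s + 1


def same_padding(x_k, d=1, mode="same_upper"):
    """Return (p_b, p_e) so that the pre-stride output size equals the input size."""
    total = d * (x_k - 1)             # p_b + p_e required for a 'same' output
    small, large = total // 2, total - total // 2
    if mode == "same_upper":          # extra zero goes at the end
        return small, large
    if mode == "same_lower":          # extra zero goes at the beginning
        return large, small
    raise ValueError(f"unsupported mode: {mode}")


def conv1d_reference(x, k, s=1, d=1, p_b=0, p_e=0):
    """Naive single-channel 1-D convolution (no kernel flip), used to count outputs."""
    padded = [0.0] * p_b + list(x) + [0.0] * p_e
    span = d * (len(k) - 1)           # last index of the stretched kernel
    out = []
    # Only positions with complete overlap are visited, in steps of the stride.
    for start in range(0, len(padded) - span, s):
        out.append(sum(k[j] * padded[start + d * j] for j in range(len(k))))
    return out


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    k = [1.0, 0.0, -1.0]
    y = conv1d_reference(x, k, s=2, d=2, p_b=1, p_e=1)
    print(len(y), conv_output_size(len(x), len(k), s=2, d=2, p_b=1, p_e=1))
    print(same_padding(4, d=3, mode="same_upper"))
```

With *same* padding and a stride of 1, ``conv_output_size`` returns the input size, which is the property the *auto_pad* note describes.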