LayerNorm
LayerNorm#
Versioned name: LayerNorm-1
Category: Normalization
Detailed description: Reference.
Attributes:
keep_stats
Description: keep_stats is used to indicate whether to output mean and variance which can be later passed to backward op.
Range of values: False or True
Type: bool
Default value: True
Required: no
begin_norm_axis
Description: begin_norm_axis is used to indicate which axis to start layer normalization. The normalization is from begin_norm_axis to last dimension. Negative values means indexing from right to left. This op normalizes over the last dimension by default, e.g. C in TNC for 3D and LDNC for 4D.
Range of values: [-r, r-1] where r = rank(input)
Type: s64
Default value: -1
Required: no
use_affine
Description: when set to True, this module has learnable per-element affine parameters.
Range of values: False or True
Type: bool
Default value: True
Required: no
epsilon
Description: epsilon is a constant to improve numerical stability
Range of values: arbitrary positive f32 value
Type: f32
Default value: 1e-5
Required: no
Inputs
1:
input
- input tensor of normalization. Required.Type: T1
2:
gamma
- gamma scaling for normalized value. A 1D tensor with the same span as input’s channel axis. Required by attributeuse_affine
. Optional.Type: T2
3:
beta
- bias added to the scaled normalized value. A 1D tensor with the same span as input’s channel axis.Required by attributeuse_affine
. Optional.Type: T2
Outputs
1:
output
- The result of normalization. A tensor of the same shape with input tensor. Required.Type: T1
2:
mean
- the calculated mean along the given axis. Optional.Type: T2
3:
variance
the calculated variance along the given axis. Optional.Type: T2
Types
T1: f32, f16, bf16.
T2: f32, bf16.
Constraints: T2 can be bf16 only when T1 is bf16.