Mathematical Notations
Notation 
Definition 

\(n\) or \(m\) 
The number of observations in a dataset. Typically \(n\) is used, but sometimes \(m\) is required to distinguish two datasets, e.g., the training set and the inference set. 
\(p\) or \(r\) 
The number of features in a dataset. Typically \(p\) is used, but sometimes \(r\) is required to distinguish two datasets. 
\(a \times b\) 
The dimensionality of a matrix (dataset) that has \(a\) rows (observations) and \(b\) columns (features). 
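As a minimal illustration of the \(a \times b\) convention, the sketch below stores a dataset as a list of rows in plain Python and recovers its dimensionality. The helper name `shape` and the sample values are assumptions made for this example only.

```python
def shape(dataset):
    """Return (a, b): the number of rows (observations) and columns (features)."""
    a = len(dataset)
    b = len(dataset[0]) if a > 0 else 0
    # Every observation is expected to carry the same number of features.
    if any(len(row) != b for row in dataset):
        raise ValueError("all rows must have the same number of features")
    return a, b


# Hypothetical dataset with n = 3 observations and p = 2 features.
X = [
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]
n, p = shape(X)
print(n, p)
```

Here `X` is a \(3 \times 2\) dataset, so `n` is 3 and `p` is 2.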
\(|A|\) 
Depending on the context, may be interpreted as follows:

\(\|x\|\) 
The \(L_2\) norm of a vector \(x \in \mathbb{R}^d\),
\[\|x\| = \sqrt{ x_1^2 + x_2^2 + \dots + x_d^2 }.\]
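The \(L_2\) norm defined above can be sketched in a few lines of plain Python. The function name `l2_norm` and the sample vector are assumptions for illustration, not part of any library API.

```python
import math


def l2_norm(x):
    """L2 norm of a vector x given as a sequence of d real numbers."""
    # Sum the squared components, then take the square root.
    return math.sqrt(sum(x_j * x_j for x_j in x))


# Hypothetical vector in R^2: sqrt(3^2 + 4^2) = 5.
x = [3.0, 4.0]
print(l2_norm(x))  # 5.0
```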

\(\mathrm{sgn}(x)\) 
Sign function for \(x \in \mathbb{R}\),
\[\begin{split}\mathrm{sgn}(x)=\begin{cases}
-1, & x < 0,\\
0, & x = 0,\\
1, & x > 0.
\end{cases}\end{split}\]
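A minimal Python sketch of the sign function for real inputs follows. The name `sgn` mirrors the notation; special values such as NaN are not handled separately here, which is a simplification of this example.

```python
def sgn(x):
    """Sign of a real number x: -1 if negative, 0 if zero, 1 if positive."""
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0


print(sgn(-2.5), sgn(0.0), sgn(7))  # -1 0 1
```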

\(x_i\) 
In the description of an algorithm, this typically denotes the \(i\)th feature vector in the training set. 
\(x'_i\) 
In the description of an algorithm, this typically denotes the \(i\)th feature vector in the inference set. 
\(y_i\) 
In the description of an algorithm, this typically denotes the \(i\)th response in the training set. 
\(y'_i\) 
In the description of an algorithm, this typically denotes the \(i\)th response that needs to be predicted by the inference algorithm given the feature vector \(x'_i\) from the inference set. 
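To show how these symbols map onto data, the sketch below builds a toy training set and inference set in plain Python. All values and the `predict_mean_baseline` helper are hypothetical. The helper is a deliberately trivial stand-in for an inference algorithm, not a method described in this document. The example also assumes the subscript \(i\) counts from 1, while Python lists count from 0.

```python
# Hypothetical training set: n = 3 observations, p = 2 features.
X_train = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # rows are x_1, ..., x_n
y_train = [10.0, 20.0, 30.0]                    # responses y_1, ..., y_n

# Hypothetical inference set: m = 2 observations, same p = 2 features.
X_infer = [[2.0, 3.0], [4.0, 5.0]]              # rows are x'_1, ..., x'_m

n, m = len(X_train), len(X_infer)


def predict_mean_baseline(y, num_queries):
    """Placeholder predictor: returns the mean training response for every query."""
    mean = sum(y) / len(y)
    return [mean] * num_queries


i = 2                       # 1-based index, as assumed for the notation
x_i = X_train[i - 1]        # i-th training feature vector
y_i = y_train[i - 1]        # i-th training response
x_prime_i = X_infer[i - 1]  # i-th inference feature vector

# y'_1, ..., y'_m: the responses to be predicted for the inference set.
y_pred = predict_mean_baseline(y_train, m)
print(x_i, y_i, x_prime_i, y_pred)
```

Any real inference algorithm would replace the baseline while keeping the same roles for \(x_i\), \(y_i\), \(x'_i\), and \(y'_i\).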