Affine Transform

The affine transform is, quite simply, a piece of art. Usually, a linear transform in 2 dimensions is represented by a 2x2 matrix. The affine transform adds another dimension to the domain, which enables an extra degree of freedom.

For example, a transform in 2 dimensions would be something like:

$$
\begin{pmatrix}u\\v\end{pmatrix} = \begin{bmatrix}t_{11}&t_{12}\\t_{21}&t_{22}\end{bmatrix} \cdot \begin{pmatrix}x\\y\end{pmatrix} \tag{1}
$$

or

$$
\begin{pmatrix}u\\v\end{pmatrix} = T \cdot \begin{pmatrix}x\\y\end{pmatrix}
$$

When applied to a 2D image, where $(x, y)$ and $(u, v)$ represent the row and column of a pixel in the source and destination images respectively, the above transformation maps pixel positions from source to destination. This allows us to create effects like the following:

| Transformation name | Affine matrix |
| --- | --- |
| Identity (transform to original image) | $\begin{bmatrix}1&0\\0&1\end{bmatrix}$ |
| Reflection | $\begin{bmatrix}-1&0\\0&1\end{bmatrix}$ |
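As a minimal sketch of what the two matrices in the table do, the snippet below applies each of them to a single point. The helper name `apply_2x2` and the sample point `(3, 5)` are assumptions made for illustration; they do not come from any particular library.

```python
def apply_2x2(t, point):
    """Multiply a 2x2 matrix (a list of two rows) by a 2D point (x, y)."""
    x, y = point
    u = t[0][0] * x + t[0][1] * y
    v = t[1][0] * x + t[1][1] * y
    return (u, v)

IDENTITY = [[1, 0], [0, 1]]
REFLECTION = [[-1, 0], [0, 1]]

point = (3, 5)  # arbitrary sample point
identity_result = apply_2x2(IDENTITY, point)      # point is left unchanged
reflection_result = apply_2x2(REFLECTION, point)  # first coordinate is negated
print(identity_result, reflection_result)
```

The identity leaves the point where it was, while the reflection flips the sign of the first coordinate and keeps the second.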

These can cover all transforms of the kind

$$
\begin{aligned}
u &= ax + by\\
v &= cx + dy
\end{aligned}
$$

However, this cannot cover translation, which has the form

$$
u = x + m
$$

or the more general form

$$
u = ax + by + m \tag{2}
$$
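One way to see why a 2x2 matrix cannot express a translation: whatever its entries are, it sends the origin to the origin, whereas a translation with $m \neq 0$ has to move the origin. The sketch below checks this for a few matrices; the helper `apply_2x2` and the third sample matrix are arbitrary choices made for illustration.

```python
def apply_2x2(t, point):
    """Multiply a 2x2 matrix (a list of two rows) by a 2D point (x, y)."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y,
            t[1][0] * x + t[1][1] * y)

origin = (0, 0)
samples = [
    [[1, 0], [0, 1]],    # identity
    [[-1, 0], [0, 1]],   # reflection
    [[2, 3], [4, 5]],    # arbitrary entries, chosen only for this sketch
]

# Every product is a sum of terms multiplied by 0, so the origin never moves.
images = [apply_2x2(t, origin) for t in samples]
print(images)
```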

Enter the affine transform. It extends the above transformation matrix such that:

  1. Equation (2), as well as other transformations, can be supported by a single transformation matrix.
  2. Multiple operations can be combined by multiplying the corresponding matrices. That is, if $T_1$ represents a reflection and $T_2$ represents a scaling, then $T_2 \cdot T_1$ represents the combination (reflection first, then scaling). Affine transform matrices are also closed under multiplication: the product of two affine matrices is again an affine matrix, so transforms can be composed freely.
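Reflection and scaling are both linear, so the composition idea can already be sketched with plain 2x2 matrices. In the snippet below, the scaling factor of 2, the sample point, and the helper names `matmul2` and `apply_2x2` are all assumptions made for illustration.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_2x2(t, point):
    """Multiply a 2x2 matrix by a 2D point (x, y)."""
    x, y = point
    return (t[0][0] * x + t[0][1] * y,
            t[1][0] * x + t[1][1] * y)

T1 = [[-1, 0], [0, 1]]  # reflection
T2 = [[2, 0], [0, 2]]   # uniform scaling by 2 (factor assumed for this sketch)

combined = matmul2(T2, T1)  # T2 . T1: reflect first, then scale

point = (3, 5)
step_by_step = apply_2x2(T2, apply_2x2(T1, point))  # two separate steps
in_one_go = apply_2x2(combined, point)              # single combined matrix
print(combined, step_by_step, in_one_go)
```

Applying the combined matrix once gives the same point as applying the two matrices one after the other, which is what lets a whole chain of operations collapse into a single matrix.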

It is this second property that makes the affine transform so amazing.
The affine transform extends (1) as follows:

$$
\begin{pmatrix}u\\v\\1\end{pmatrix} = \begin{bmatrix}t_{11}&t_{12}&t_{13}\\t_{21}&t_{22}&t_{23}\\0&0&1\end{bmatrix} \cdot \begin{pmatrix}x\\y\\1\end{pmatrix}
$$
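Multiplying this out row by row may make the link to equation (2) easier to see. Each row of the matrix produces one equation:

$$
\begin{aligned}
u &= t_{11}x + t_{12}y + t_{13}\\
v &= t_{21}x + t_{22}y + t_{23}\\
1 &= 0 \cdot x + 0 \cdot y + 1
\end{aligned}
$$

With $a = t_{11}$, $b = t_{12}$ and $m = t_{13}$, the first row is exactly equation (2). The third row only carries the constant 1 along, so that the result can be fed into another affine matrix.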

Let’s take an example where we are translating every point by 4 pixels to the right. Then we translate it by 3 pixels downwards. The individual transforms are
$$
\begin{bmatrix}1&0&4\\0&1&0\\0&0&1\end{bmatrix} \quad\text{and}\quad \begin{bmatrix}1&0&0\\0&1&3\\0&0&1\end{bmatrix}
$$

The combination is
$$
\begin{aligned}
T_c &= \begin{bmatrix}1&0&4\\0&1&0\\0&0&1\end{bmatrix} \cdot \begin{bmatrix}1&0&0\\0&1&3\\0&0&1\end{bmatrix} \\
&= \begin{bmatrix}1&0&4\\0&1&3\\0&0&1\end{bmatrix}
\end{aligned}
$$
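which is what we would expect for the combined transform: a shift of 4 pixels in one direction and 3 pixels in the other. (Translations commute, so the order of the two factors does not matter here.) The same approach works for any combination of operations.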
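The product above can be checked with a few lines of plain Python. This is only a sketch: the helper names `matmul3` and `apply_affine`, the constant names, and the sample point `(10, 20)` are assumptions made for illustration.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply_affine(t, point):
    """Apply a 3x3 affine matrix to (x, y) using the vector (x, y, 1)."""
    x, y = point
    vec = (x, y, 1)
    u, v, _ = [sum(t[i][k] * vec[k] for k in range(3)) for i in range(3)]
    return (u, v)

SHIFT_4 = [[1, 0, 4], [0, 1, 0], [0, 0, 1]]  # translate first coordinate by 4
SHIFT_3 = [[1, 0, 0], [0, 1, 3], [0, 0, 1]]  # translate second coordinate by 3

combined = matmul3(SHIFT_4, SHIFT_3)
moved = apply_affine(combined, (10, 20))  # arbitrary sample point
print(combined, moved)
```

The last row of `combined` is still `[0, 0, 1]`, so the product is itself an affine matrix and can be multiplied with further transforms in the same way.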

The Wikipedia article on affine transformations gives examples of several transformations and their affine matrices.
