13-03 LDA(linear discriminant analysis)
=========================================================
$$$(\widetilde{\mu}_{1} - \widetilde{\mu}_{2})^{2} = (w^{T}{\mu}_{1}-w^{T}{\mu}_{2})^{2}$$$
$$${\mu}_{1}$$$ : average of class 1 data before projection
$$$\widetilde{\mu}_{1}$$$ : average of class 1 data after projection
$$$w^{T}$$$ : tranpose of transform matrix.
You can write this :
$$$(\widetilde{\mu}_{1} - \widetilde{\mu}_{2})^{2} = w^{T}({\mu}_{1}-{\mu_{2}})({\mu_{1}}-{\mu_{2}})^{T}w$$$
$$$(\widetilde{\mu_{1}} - \widetilde{\mu_{2}})^{2} = w^{T}S_{between-class\;scatter}w$$$
You can finally find following relation :
$$$\widetilde{S}_{between-class\;scatter} = (\widetilde{\mu}_{1} - \widetilde{\mu}_{2})^{2} = w^{T}S_{between-class\;scatter}w$$$
Therefore, you can replace target function J(w) notated with values after projection with target function J(w) notated with values before projection
$$$J(w) = \frac{|\widetilde{\mu}_{1}-\widetilde{\mu}_{2}|^{2}}{\widetilde{S}_{1}^{2}-\widetilde{S}_{2}^{2}} \Rrightarrow J(w) = \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}$$$
You can achieve goal of optimization to maximize target function J(W) in respect to W by multiplying W
=========================================================
Fisher's linear discriminant equation :
$$$J(w) = \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}$$$
You will go with above Fisher's linear discriminant equation
You goal is to find transform matrix W which maximizes above target function J(w) notated by $$$S_{between-class\;scatter}$$$ and $$$S_{within-class\;scatter}$$$
Shape of J(W) is proved as up-convex function
Therefore, at point which has 0 differentiation, you can say J(W) is maximum
2018-06-08 08-53-31.png
Your job is to find W which maximizes J(W)
In course of this job, you can find special characteristic
Using this special characteristic is called LDA
You should find W satisfying following formular
$$$\frac{d}{dw} \begin{vmatrix} J(w) \end{vmatrix} = \frac{d}{dw} \begin{vmatrix} \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w} \end{vmatrix} = 0$$$
You can use following differentiation rule :
$$$\begin{pmatrix} \frac{f}{g} \end{pmatrix}' = \frac{f'g-g'f}{g^{2}}$$$
$$$\frac{d[w^{T}S_{between-class\;scatter}w]}{dw} [w^{T}S_{within-class\;scatter}w] - [w^{T}S_{between-class\;scatter}w] \frac{d[w^{T}S_{within-class\;scatter}w]}{dw} = 0$$$
You can simplify:
$$$2S_{between-class\;scatter}w[w^{T}S_{within-class\;scatter}w] - [w^{T}S_{between-class\;scatter}w] 2S_{within-class\;scatter}w = 0$$$
You divide both side by $$$w^{T}S_{within-class\;scatter}w$$$ then remove 2
$$$S_{between-class\;scatter}w [\frac{w^{T}S_{within-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}] - [\frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}] S_{w}w = 0$$$
You input :
$$$J(w) = \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}$$$
then simplify :
$$$S_{between-class\;scatter}w-J(w)S_{within-class\;scatter}w = 0$$$
$$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter}w = 0$$$
$$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter}w = J(w)w$$$
This form is equal to $$$Au = \lambda u$$$
You this question becomes job of finding eigenvalue and eigenvector,
which means you will obtain similar result with PCA
$$$A = S_{within-class\;scatter}^{-1}S_{between-class\;scatter}$$$
You can say :
eigenvector $$$u = w$$$
eigenvalue $$$\lambda = J(w)$$$
In PCA, you first find covariance matrix, then, find $$$u\Lambda u^{T}$$$
In LDA, you first find $$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter}$$$, then, find $$$u\Lambda u^{T}$$$
=========================================================
There is other LDA solution which doesn't perform eigenvalue analysis
Solution of generalized eigenvalue question $$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter}w = J(w)w$$$ is as following :
You know :
$$$J(w) = \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w}$$$
Then, you input J(w) :
$$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter}w =\begin{bmatrix} \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}w} \end{bmatrix}w$$$
You can simplify :
$$$[w^{T}S_{between-class\;scatter}w] S_{within-class\;scatter}w = w^{T}S_{within-class\;scatter}w S_{between-class\;scatter}w$$$
$$$[w^{T}S_{between-class\;scatter}w] S_{within-class\;scatter}w = w^{T}S_{within-class\;scatter}w \alpha_{1}(\mu_{1}-\mu_{2}) $$$
where,
$$$S_{between-class\;scatter}w = (\mu_{1}-\mu_{2})(\mu_{1}-\mu_{2})^{T}w$$$ is vector which has same direction with $$$(\mu_{1}-\mu_{2})$$$
$$$S_{within-class\;scatter}w = \frac{w^{T}S_{within-class\;scatter}w}{w^{T}S_{between-class\;scatter}w} \alpha_{1} (\mu_{1}-\mu_{2})$$$
$$$S_{within-class\;scatter}w = \alpha_{2}\alpha_{1}(\mu_{1}-\mu_{2})$$$
$$$w = \alpha_{2}\alpha_{1} S_{within-class\;scatter}^{-1} (\mu_{1}-\mu_{2})$$$
Since size of vector W is not matter, you can ignore it :
$$$w^{*} = arg_{w}max \begin{bmatrix} \frac{w^{T}S_{between-class\;scatter}w}{w^{T}S_{within-class\;scatter}} \end{bmatrix} = S_{within-class\;scatter}^{-1} (\mu_{1}-\mu_{2})$$$
=========================================================
Example question:
Find linear discriminant projection of following 2-D data
$$$\omega_{1} \; class : X_{1} = (x_{1}, x_{2}) = \{(4,1), (2,4), (2,3), (3,6), (4,4)\}$$$
$$$\omega_{2} \; class : X_{2} = (x_{1}, x_{2}) = \{(9,10), (6,8), (9,5), (8,7), (10,8)\}$$$
You perform dimensionality reduction by projecting data onto linear transform matrix w
After projection, you find transform matrix W which makes class of data seperated optimally
2018-06-08 09-00-11.png
You find variance and average within each class
$$$\mu_{1} = [3.00 3.60]$$$
$$$\mu_{2} = [8.40 7.60]$$$
average points are x marked
2018-06-08 09-01-30.png
$$$S_{1} = \begin{bmatrix} 0.80&-0.04 \\ -0.04&2.60 \end{bmatrix}$$$
$$$S_{2} = \begin{bmatrix} 1.84&-0.04 \\ -0.04&2.64 \end{bmatrix}$$$
$$$S_{1}$$$ : covariance matrix (scatter) of data of $$$\omega_{1}$$$ class
between-class scatter(variance) :
$$$S_{between-class\;scatter} = (\mu_{1} - \mu_{2})(\mu_{1} - \mu_{2})^{T} = \begin{bmatrix} 29.16&21.60 \\ 21.60&16.00 \end{bmatrix}$$$
within-class scatter(variance) :
$$$S_{within-class\;scatter} = S_{1} + S_{2} = \begin{bmatrix} 2.64&-0.44 \\ -0.44&5.28 \end{bmatrix}$$$
You found all matrices you need
Now, you need to perform eigenvalue analysis
In case of $$$S_{within-class\;scatter}$$$, you can use program to find its inverse matrix
You can find LDA projection as solution of generalized eigenvalue question
$$$S_{within-class\;scatter}^{-1}S_{between-class\;scatter} v = \lambda v \Rightarrow$$$
$$$|S_{within-class\;scatter}^{-1}S_{between-class\;scatter} - \lambda I | = 0 \Rightarrow$$$
$$$\begin{vmatrix} 11.89-\lambda&8.81 \\ 5.08&3.76-\lambda \end{vmatrix} = 0 \Rightarrow$$$
for eigenvalue $$$\lambda = 15.65$$$
You can input 15.65 :
$$$\begin{bmatrix} 11.89&8.81 \\ 5.08&3.76 \end{bmatrix} \begin{bmatrix} v_{1}\\v_{2} \end{bmatrix} = 15.65 \begin{bmatrix} v_{1}\\v_{2} \end{bmatrix} \Rightarrow$$$
for eigenvector $$$\begin{bmatrix} v_{1}\\v_{2} \end{bmatrix} = \begin{bmatrix} 0.91\\0.39 \end{bmatrix}$$$
Let's try to draw this vector
This is vector what you want to find