Key points: - Statistical methods - Likelihood ratio test - Maximum likelihood estimation - Likelihood
================================================================================
pattern=statistical_decision_theory(data)
================================================================================
decision_function,decision_region,decision_boundary=likelihood_ratio_test()
================================================================================
estimated_prob_density_function=maximum_likelihood_estimation(data)
================================================================================
Likelihood: how similar is the data to a class? LRT: test that "similarity" by using a "ratio" of likelihoods
================================================================================
* LRT:
* You suppose you know the probability density function of the data
* Now, you want to classify the data based on that probability density function
================================================================================
* Classification example
* You have data composed of weight values and height values
* By using that data, you create a classifier model,
* and classify a given feature vector x composed of height and weight
================================================================================
* Method 1
* Code
posterior_prob_val_of_class1=posterior_prob_of_class1(feature_vec1)
posterior_prob_val_of_class2=posterior_prob_of_class2(feature_vec1)
posterior_prob_list=[posterior_prob_val_of_class1,posterior_prob_val_of_class2]
idx_of_max=posterior_prob_list.index(max(posterior_prob_list))
class_list=["omega_1","omega_2"]
selected_class=class_list[idx_of_max]
================================================================================
* Method 1
* Math
Posterior prob $$$\propto$$$ prior prob $$$\times$$$ likelihood
$$$P(\omega_i|x) \propto P(\omega_i) \times P(x|\omega_i)$$$
(the evidence $$$P(x)$$$ is the same for every class, so it can be dropped when comparing posteriors)
* $$$P(\omega_i) \approx \frac{N_i}{N}$$$
$$$N$$$: total number of samples
$$$N_i$$$: number of samples belonging to class $$$\omega_i$$$
As $$$N$$$ gets larger, $$$\frac{N_i}{N}$$$ gets closer to $$$P(\omega_i)$$$
* If $$$P(\omega_1|x) > P(\omega_2|x)$$$, choose class $$$\omega_1$$$; else, choose class $$$\omega_2$$$
* Math notation for the above sentence:
$$$\Lambda(x) = \dfrac{P(x|\omega_1)}{P(x|\omega_2)} \gtrless \dfrac{P(\omega_2)}{P(\omega_1)}$$$
(choose $$$\omega_1$$$ if the left side is greater, $$$\omega_2$$$ otherwise)
* $$$\Lambda(x)$$$ is the likelihood ratio
* You can classify the class of a feature vector based on the likelihood ratio and the prior probability ratio
================================================================================
* Example
* 2 classes: $$$\omega_1$$$, $$$\omega_2$$$
* probability density functions
1. Probability distribution of $$$\omega_1$$$ occurring
$$$P(x|\omega_1) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-4)^2}$$$
$$$N(\mu=4,\sigma=1)$$$
2. Probability distribution of $$$\omega_2$$$ occurring
$$$P(x|\omega_2) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-10)^2}$$$
$$$N(\mu=10,\sigma=1)$$$
* Supposed condition: equal prior probabilities $$$P(\omega_1)=P(\omega_2)$$$
================================================================================
* When you're given a data point $$$x=6$$$
* it should be classified into class $$$\omega_1$$$
================================================================================
* Decision boundary: the center vertical line, $$$x=7$$$
* This is "finding a classification boundary via LRT"
================================================================================
* Since you supposed $$$P(\omega_1)=P(\omega_2)$$$, the prior ratio is 1, so you compare the likelihood ratio against 1:
$$$\dfrac{P(x|\omega_1)}{P(x|\omega_2)} \gtrless 1$$$
================================================================================
* The above equation becomes
$$$\dfrac{\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-4)^2}}{\frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}(x-10)^2}} \gtrless 1$$$
================================================================================
* The left fraction term can be simplified to
$$$e^{-\frac{1}{2}(x-4)^2 + \frac{1}{2}(x-10)^2} \gtrless 1$$$
================================================================================
* You apply log to both sides
$$$-\frac{1}{2}(x-4)^2 + \frac{1}{2}(x-10)^2 \gtrless 0$$$
================================================================================
* Finally, you get
$$$(x-10)^2 - (x-4)^2 = -12x + 84 \gtrless 0$$$
* Choose $$$\omega_1$$$ if $$$x < 7$$$, choose $$$\omega_2$$$ if $$$x > 7$$$
* So the decision boundary is $$$x=7$$$, and $$$x=6 < 7$$$ falls into class $$$\omega_1$$$
================================================================================
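The LRT classification for this two-Gaussian example can be sketched in Python (the function names are mine, not from the note; it assumes the two unit-variance Gaussians with means 4 and 10 and equal priors used in the example):

```python
import math

def gaussian_pdf(x, mu, sigma=1.0):
    """Likelihood P(x|omega) under a normal distribution N(mu, sigma)."""
    coeff = 1.0 / (math.sqrt(2 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def classify_by_lrt(x, prior1=0.5, prior2=0.5):
    """Choose omega_1 if the likelihood ratio exceeds the prior ratio, else omega_2."""
    likelihood_ratio = gaussian_pdf(x, mu=4) / gaussian_pdf(x, mu=10)
    threshold = prior2 / prior1  # equals 1 when the priors are equal
    return "omega_1" if likelihood_ratio > threshold else "omega_2"

print(classify_by_lrt(6))  # omega_1 (x=6 lies left of the boundary x=7)
print(classify_by_lrt(8))  # omega_2
```

Points just below 7 go to $$$\omega_1$$$ and points just above 7 go to $$$\omega_2$$$, matching the decision boundary derived above.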