This is a note I wrote while taking the following lecture: http://www.kocw.net/home/search/kemView.do?kemId=1189957

Use_multi_classes_on_MAP_Bayes_criterion_considering_cost
================================================================================
$$$P[\text{correct}] \\
= \sum\limits_{i=1}^{C} P(\omega_i) \int_{R_i} P(x|\omega_i) dx \\
= \sum\limits_{i=1}^{C} \int_{R_i} P(x|\omega_i) P(\omega_i) dx \\
= \sum\limits_{i=1}^{C} \int_{R_i} P(\omega_i|x) P(x) dx$$$
- $$$P[\text{correct}]$$$: probability that the classifier classifies the multiple classes correctly
- $$$P(x|\omega_i)$$$: likelihood (a PDF)
- $$$\int_{R_i} P(x|\omega_i) dx$$$: probability that a sample from class $$$\omega_i$$$ falls in region $$$R_i$$$, i.e., is classified correctly
- $$$P(\omega_i)$$$: prior probability; for example, the prior probability of height 170 coming from the male group and the prior probability of 170 coming from the female group are different
- $$$\sum\limits_{i=1}^{C}$$$: (probability that class 1 is classified correctly) + (probability that class 2 is classified correctly) + ... + (probability that class C is classified correctly)
================================================================================
- $$$\sum\limits_{i=1}^{C} \int_{R_i} P(x|\omega_i) P(\omega_i) dx$$$: the constant term $$$P(\omega_i)$$$ is moved inside the integral
================================================================================
- Use Bayes rule: (likelihood) $$$\times$$$ (prior probability) = (posterior probability) $$$\times$$$ (evidence term $$$P(x)$$$, constant in $$$i$$$)
- For reference, conditional probability:
$$$P[A|B] = \frac{P[A\cap B]}{P[B]}$$$
$$$P[B]P[A|B]=P[A\cap B]=P[A]P[B|A]$$$
================================================================================
To maximize $$$P[\text{correct}]$$$, each integral in $$$\sum\limits_{i=1}^{C} \int_{R_i} P(\omega_i|x) P(x) dx$$$ should be maximized.
But since $$$P(x)$$$ is a constant term, it suffices to assign each $$$x$$$ to the class with the largest posterior $$$P(\omega_i|x)$$$.
================================================================================
/home/young/Pictures/2019_04_12_12:54:58.png
- $$$x$$$: given feature vector
- $$$P(\omega_1|x)$$$: purple PDF; when x is given, the probability of $$$\omega_1$$$ occurring
- In the far-left region, the posterior for $$$\omega_2$$$ (red line) is the largest, so integrating it over that region gives the maximum area; therefore, feature vectors in that region are classified into $$$\omega_2$$$
================================================================================
- Suppose $$$\alpha_1$$$ is the decision which classifies the feature vector into class $$$\omega_1$$$
- $$$\alpha(x) \rightarrow \{\alpha_1, \alpha_2, \cdots, \alpha_C\}$$$: when feature vector x is given, the selectable decisions are $$$\{\alpha_1, \alpha_2, \cdots, \alpha_C\}$$$
================================================================================
$$$R(\alpha(x)\rightarrow \alpha_i) \\
= R(\alpha_i|x) \\
= \sum\limits_{j=1}^{C} C_{ij} P(\omega_j|x)$$$
- $$$R(\alpha_i|x)$$$: Bayes risk when x is given and the decision is $$$\alpha_i$$$
- $$$P(\omega_j|x)$$$: posterior probability
- $$$C_{ij}$$$: cost incurred when decision $$$\alpha_i$$$ is made but the true class is $$$\omega_j$$$
================================================================================
Apply the above equation over all selectable decisions $$$\{\alpha_1, \alpha_2, \cdots, \alpha_C\}$$$; the overall risk is
$$$R(\alpha(x)) = \int R(\alpha(x)|x) P(x) dx$$$
================================================================================
/home/young/Pictures/2019_04_12_13:15:44.png
- Green line: risk function when the classifier classifies feature vector x into class $$$\omega_3$$$ (decision $$$\alpha_3$$$)
================================================================================
Since the y value here is "risk", the integral should be minimized: at each x, choose the decision $$$\alpha_i$$$ with the smallest conditional risk $$$R(\alpha_i|x)$$$
================================================================================
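The MAP rule from the first part of the note (maximize $$$P(x|\omega_i)P(\omega_i)$$$, which is equivalent to maximizing the posterior since $$$P(x)$$$ is constant over classes) can be sketched in a few lines. All the numbers below (Gaussian likelihoods for a 1-D height feature, equal priors) are hypothetical, made up only for illustration:

```python
import math

def gaussian_pdf(x, mean, std):
    """Likelihood P(x|w_i), modeled here as a 1-D Gaussian PDF."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

# Hypothetical two-class height example: w_1 = female group, w_2 = male group.
means  = [160.0, 175.0]   # class-conditional means
stds   = [6.0, 7.0]       # class-conditional standard deviations
priors = [0.5, 0.5]       # prior probabilities P(w_1), P(w_2)

def map_decide(x):
    """MAP rule: pick the class i maximizing P(x|w_i) * P(w_i).

    P(w_i|x) = P(x|w_i) P(w_i) / P(x), and P(x) does not depend on i,
    so maximizing likelihood * prior maximizes the posterior.
    """
    scores = [gaussian_pdf(x, m, s) * p for m, s, p in zip(means, stds, priors)]
    return scores.index(max(scores))  # 0 -> w_1, 1 -> w_2

print(map_decide(158.0))  # -> 0 (classified into w_1)
print(map_decide(180.0))  # -> 1 (classified into w_2)
```

The decision regions $$$R_1, R_2$$$ in the note are exactly the sets of $$$x$$$ for which `map_decide` returns 0 or 1.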
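The minimum-risk rule $$$R(\alpha_i|x) = \sum_j C_{ij} P(\omega_j|x)$$$ can likewise be sketched directly. The posterior values and the cost matrix below are hypothetical, chosen only to make the arithmetic concrete:

```python
# Hypothetical posteriors P(w_j|x) for some fixed feature vector x.
posteriors = [0.2, 0.7, 0.1]

# Hypothetical cost matrix: cost[i][j] = C_ij, the cost of making decision
# alpha_i when the true class is omega_j. Diagonal is 0 (correct decisions).
cost = [
    [0.0, 1.0, 4.0],   # decision alpha_1
    [2.0, 0.0, 1.0],   # decision alpha_2
    [1.0, 3.0, 0.0],   # decision alpha_3
]

def conditional_risk(i):
    """R(alpha_i|x) = sum_j C_ij * P(w_j|x)."""
    return sum(cost[i][j] * posteriors[j] for j in range(len(posteriors)))

def bayes_decide():
    """Minimum-risk rule: pick the decision with the smallest R(alpha_i|x)."""
    risks = [conditional_risk(i) for i in range(len(cost))]
    return risks.index(min(risks)), risks

best, risks = bayes_decide()
print(risks)  # [1.1, 0.5, 2.3]
print(best)   # 1 -> decision alpha_2
```

Note that with the zero-one cost $$$C_{ij} = 1 - \delta_{ij}$$$, minimizing $$$R(\alpha_i|x)$$$ reduces to maximizing the posterior $$$P(\omega_i|x)$$$, which recovers the MAP rule from the first part of the note.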