https://www.youtube.com/watch?v=Jt5BS71uVfI
================================================================================
Cross Entropy:
- an entropy-like value which is calculated from information that can be incorrect (an estimated distribution)
- the expected amount of information when that potentially incorrect information is used
  (entropy, calculated from the true probabilities, is the optimal/minimum value)
================================================================================
Example of "information which can be incorrect": the prediction from an ML model
================================================================================
Q: estimated probability distribution
P: true probability distribution
================================================================================
Compare entropy and cross entropy

Entropy
$$$H(X) = \sum\limits_{i=1}^{n} \log_2{\dfrac{1}{p_i}} * p_i$$$

Cross Entropy
$$$H(P,Q) = \sum\limits_{i=1}^{n} \log_2{\dfrac{1}{q_i}} * p_i$$$

$$$p_i$$$: probability of the true label
$$$\log_2{\dfrac{1}{q_i}}$$$: amount of information under the estimated probability

(a term with $$$p_i = 0$$$ contributes 0 by convention, even if $$$q_i = 0$$$)
================================================================================
Case: the prediction assigns the wrong probability (0) to the true class, so the cross entropy becomes infinite
(true distribution P = (1, 0, 0), prediction Q = (0.0, 1.0, 0.0))

$$$\log_2{\dfrac{1}{0.0}} * 1 + \log_2{\dfrac{1}{1.0}} * 0 + \log_2{\dfrac{1}{0.0}} * 0 = \infty$$$
================================================================================
Case: a trained model produces a somewhat precise prediction
(true distribution P = (1, 0, 0), prediction Q = (0.8, 0.1, 0.1))

$$$\log_2{\dfrac{1}{0.8}} * 1 + \log_2{\dfrac{1}{0.1}} * 0 + \log_2{\dfrac{1}{0.1}} * 0 \approx 0.32$$$
================================================================================
Case: a trained model produces a precise prediction
(true distribution P = (1, 0, 0), prediction Q = (1.0, 0.0, 0.0))

$$$\log_2{\dfrac{1}{1.0}} * 1 + \log_2{\dfrac{1}{0.0}} * 0 + \log_2{\dfrac{1}{0.0}} * 0 = 0.0$$$

In this case, "Entropy value = Cross Entropy value".
================================================================================
Cross Entropy $$$\ge$$$ Entropy

Entropy: amount of information derived from the "true label" (true distribution) only
Cross Entropy: amount of information derived from the "true label" and the "potentially incorrect information" (estimated distribution)

Amount of information: Cross Entropy $$$\ge$$$ Entropy
================================================================================
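A minimal Python sketch (not from the video; the function and variable names are my own) that computes entropy and cross entropy under the conventions above: a term with $$$p_i = 0$$$ contributes 0, and $$$\log_2{\dfrac{1}{0}}$$$ for a true class is treated as infinity.

import math

def entropy(p):
    # H(X) = sum_i log2(1 / p_i) * p_i, skipping terms with p_i = 0
    return sum(math.log2(1.0 / p_i) * p_i for p_i in p if p_i > 0)

def cross_entropy(p, q):
    # H(P, Q) = sum_i log2(1 / q_i) * p_i
    # A true class (p_i > 0) predicted with q_i = 0 makes the sum infinite.
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            continue                      # convention: 0 * log2(1/q_i) = 0
        if q_i == 0:
            return math.inf               # log2(1/0) -> infinity
        total += math.log2(1.0 / q_i) * p_i
    return total

P = [1.0, 0.0, 0.0]                       # true (one-hot) distribution

print(cross_entropy(P, [0.0, 1.0, 0.0]))  # inf    : wrong probability for the true class
print(cross_entropy(P, [0.8, 0.1, 0.1]))  # 0.3219 : somewhat precise prediction
print(cross_entropy(P, [1.0, 0.0, 0.0]))  # 0.0    : precise prediction
print(entropy(P))                         # 0.0    : entropy of the one-hot true distribution

Every printed cross entropy is $$$\ge$$$ entropy(P), matching "Cross Entropy $$$\ge$$$ Entropy".
================================================================================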