This is a personal study note
Copyright and original reference:
https://www.youtube.com/watch?v=k6cj3yBXm1k&list=PLsri7w6p16vvQCo9pmuRNY_SYoOGB6bWM&index=6
================================================================================
Is there a difference in proportions between 2 populations?
================================================================================
Suppose population A has proportion $$$P_A$$$
Suppose population B has proportion $$$P_B$$$

You want to know $$$P_A-P_B$$$

To do so, can you estimate $$$P_A-P_B$$$ by using the sample proportions $$$\hat{P}_A-\hat{P}_B$$$?
================================================================================
Hypothesis

Two-sided test
$$$H_0:P_A-P_B=0$$$
$$$H_1:P_A-P_B\ne 0$$$

Left-sided test
$$$H_0:P_A-P_B=0$$$
$$$H_1:P_A-P_B\lt 0$$$

Right-sided test
$$$H_0:P_A-P_B=0$$$
$$$H_1:P_A-P_B\gt 0$$$
================================================================================
Example

* Attendance rate at university A
* Unit: number of people
* class a (attendance is checked by smartphone), class b (attendance is checked by raising a hand)
================================================================================
$$$H_0: p_a - p_b = 0$$$
$$$H_1: p_a - p_b \ne 0$$$
================================================================================
Sample sizes
$$$n_a, n_b$$$
================================================================================
Expected value
$$$E(\hat{p}_a - \hat{p}_b) = p_a-p_b$$$
================================================================================
Variance (the two samples are independent, so the variances add)
$$$\sigma^2(\hat{p}_a-\hat{p}_b) \\ = \sigma^2(\hat{p}_a) + \sigma^2(\hat{p}_b) \\ = \dfrac{p_a(1-p_a)}{n_a} + \dfrac{p_b(1-p_b)}{n_b}$$$

Since $$$p_a$$$ and $$$p_b$$$ are unknown, they are replaced by $$$\hat{p}_a$$$ and $$$\hat{p}_b$$$ in practice.
================================================================================
Test statistic z
$$$z = \dfrac{ (\hat{p}_a - \hat{p}_b) - (p_a - p_b) }{\sqrt{\dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b} }}$$$
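
As a minimal sketch, the test statistic can be written as a small Python function. It follows the note's unpooled standard error, where the two variance terms are added as in the variance formula. The function name `two_proportion_z`, the parameter `null_diff`, and the numbers in the usage lines are illustrative assumptions, not values from the lecture.

```python
import math


def two_proportion_z(p_hat_a, n_a, p_hat_b, n_b, null_diff=0.0):
    """z statistic for a difference of two sample proportions.

    Uses the unpooled standard error from the note:
    sqrt(p_a(1-p_a)/n_a + p_b(1-p_b)/n_b), with sample proportions plugged in.
    null_diff is the value of p_a - p_b under H0 (hypothetical parameter name).
    """
    variance = (p_hat_a * (1 - p_hat_a) / n_a
                + p_hat_b * (1 - p_hat_b) / n_b)
    standard_error = math.sqrt(variance)
    return ((p_hat_a - p_hat_b) - null_diff) / standard_error


# Made-up illustration values (not from the note):
z_equal = two_proportion_z(0.5, 100, 0.5, 100)  # identical proportions
z_diff = two_proportion_z(0.6, 100, 0.5, 100)   # 0.1 / sqrt(0.0049)
print(z_equal, z_diff)
```

Identical sample proportions give a z of zero, and the statistic grows as the observed difference becomes large relative to the standard error.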
================================================================================
Estimated variance and standard error (sample proportions plugged in)
$$$s^2(\hat{p}_a - \hat{p}_b) = \dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b}$$$

$$$s = \sqrt{\dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b}}$$$
================================================================================
Confidence interval
$$$(\hat{p}_a - \hat{p}_b) - z_{\frac{\alpha}{2}} \times \sqrt{\dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b}} \le p_a - p_b \le (\hat{p}_a - \hat{p}_b) + z_{\frac{\alpha}{2}} \times \sqrt{\dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b}}$$$
================================================================================
Calculation

class a: 30 people $$$\times$$$ 15 weeks = 450 people
class b: 35 people $$$\times$$$ 15 weeks = 525 people

Actual attendance:
class a: 424 people
class b: 476 people

$$$\hat{p}_a = \dfrac{424}{450} = 0.942$$$
$$$\hat{p}_b = \dfrac{476}{525} = 0.907$$$

Standard error
$$$s(\hat{p}_a - \hat{p}_b) \\ = \sqrt{\dfrac{\hat{p}_a(1-\hat{p}_a)}{n_a} + \dfrac{\hat{p}_b(1-\hat{p}_b)}{n_b}}\\ = \sqrt{\dfrac{0.942(1-0.942)}{450} + \dfrac{0.907(1-0.907)}{525}} \\ = 0.017$$$

For a 95% confidence level, $$$z_{\frac{\alpha}{2}} = 1.96$$$
$$$\Rightarrow$$$ margin of error: $$$\pm 1.96 \times 0.017 = \pm 0.033$$$

$$$(0.942 - 0.907) - 0.033 \le p_a-p_b \le (0.942 - 0.907) + 0.033$$$
$$$0.002 \le p_a-p_b \le 0.068$$$
================================================================================
Test hypothesis

Rejection region (two-sided, $$$\alpha = 0.05$$$)
$$$|z| \gt z_{\frac{\alpha}{2}} = 1.96$$$

$$$z \\ = \dfrac{(0.942-0.907) - 0}{\sqrt{\frac{0.942 \times (1-0.942)}{450} + \frac{0.907 \times (1-0.907)}{525}}} \\ = \frac{0.035}{0.017} \\ = 2.059$$$

Since $$$|z| = 2.059 \gt 1.96$$$, $$$H_0$$$ is rejected. This agrees with the confidence interval, which does not contain 0.
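
The calculation can be reproduced with a short Python sketch that uses only the standard library. The counts (424 of 450, 476 of 525) come from the note. The variable names and the choice of `alpha = 0.05` (which corresponds to the 1.96 critical value) are assumptions made for illustration. Because the script does not round intermediate values, its z statistic comes out at roughly 2.1 rather than the hand-rounded 2.059; the conclusion is the same.

```python
import math
from statistics import NormalDist

# Counts from the note: attended sessions / total person-sessions.
x_a, n_a = 424, 450   # class a (attendance checked by smartphone)
x_b, n_b = 476, 525   # class b (attendance checked by raising a hand)
alpha = 0.05          # assumed significance level (gives z of about 1.96)

# Sample proportions and their difference.
p_hat_a = x_a / n_a
p_hat_b = x_b / n_b
diff = p_hat_a - p_hat_b

# Unpooled standard error of the difference, as in the note.
se = math.sqrt(p_hat_a * (1 - p_hat_a) / n_a
               + p_hat_b * (1 - p_hat_b) / n_b)

# Two-sided critical value and confidence interval for p_a - p_b.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
margin = z_crit * se
ci_lower, ci_upper = diff - margin, diff + margin

# Test statistic under H0: p_a - p_b = 0, with a two-sided p-value.
z = (diff - 0.0) / se
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = abs(z) > z_crit

print(f"p_hat_a = {p_hat_a:.3f}, p_hat_b = {p_hat_b:.3f}")
print(f"standard error = {se:.3f}")
print(f"CI = [{ci_lower:.3f}, {ci_upper:.3f}]")
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Checking `abs(z) > z_crit` and checking whether the confidence interval excludes 0 are equivalent here, because both use the same unpooled standard error.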