These are notes I wrote based on the following lecture:
http://www.kocw.net/home/search/kemView.do?kemId=1189957

- Population_Sample_Sampling_distribution_Mean_Variance_Covariance_Correlation_coefficient

================================================================================
- Population: the entire group of interest, such as all people in a country
- Sample: a subset drawn from the population, such as 100 people
- Sampling distribution: the distribution formed by a statistic computed over repeated samples

Sample 100 people -> statistics such as mean height and mean weight
Sample 100 people -> statistics such as mean height and mean weight
...

The mean height and mean weight can differ from sample to sample, but together those values form a distribution.

================================================================================
Statistical parameter: a value that summarizes a characteristic of the population. The formulas below compute the corresponding values from a sample.

- Mean:
$$$\bar{x}=\sum\limits_{i=1}^{n} \frac{x_i}{n}$$$

Sample $$$x_1=\{3,4,5,4,3,4,5\}$$$
$$$\bar{x_1}=\frac{28}{7}=4$$$

Sample $$$x_2=\{1,2,3,4,5,6,7\}$$$
$$$\bar{x_2}=\frac{28}{7}=4$$$

Sample $$$x_3=\{4,4,4,4,4,4,4\}$$$
$$$\bar{x_3}=\frac{28}{7}=4$$$

All three samples have the same mean, so the mean alone cannot tell them apart.

- Variance: how scattered the data is, that is, how far the data lies from the mean
$$$s^2=\sum\limits_{i=1}^{n} \frac{(x_i-\bar{x})^2}{(n-1)}$$$

$$$s_1^2=\frac{(1+0+1+0+1+0+1)}{6}=\frac{4}{6}=0.67$$$
$$$s_2^2=\frac{(9+4+1+0+1+4+9)}{6}=\frac{28}{6}=4.67$$$
$$$s_3^2=\frac{(0+0+0+0+0+0+0)}{6}=\frac{0}{6}=0$$$

================================================================================
Standard deviation: variance is in squared units because each deviation is squared.
Taking the square root of the variance brings the value back to the original unit.

$$$s=\sqrt{s^2}=\sqrt{\sum\limits_{i=1}^{n} \frac{(x_i-\bar{x})^2}{(n-1)}}$$$

================================================================================
Covariance: when you have two variates, covariance shows how they change together.
- If the sample consists of bivariate data $$$(x_i,y_i)$$$, such as height and weight, covariance is calculated as follows
- Calculate the mean values
$$$\bar{x}=\sum\limits_{i=1}^{n} \frac{x_i}{n}$$$
$$$\bar{y}=\sum\limits_{i=1}^{n} \frac{y_i}{n}$$$
- Calculate the covariance
$$$C(x,y)=\frac{1}{n-1} \sum\limits_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y})$$$

================================================================================
Example with two samples:

$$$x_1=\{3,4,5,4,3,4,5\}$$$
$$$\bar{x_1}=4$$$

$$$x_2=\{1,2,3,4,5,6,7\}$$$
$$$\bar{x_2}=4$$$

$$$C(x_1,x_2)=\frac{(-1)(-3)+(0)(-2)+(1)(-1)+(0)(0)+(-1)(1)+(0)(2)+(1)(3)}{6}=\frac{4}{6}=0.67$$$

================================================================================
The size of the covariance depends on the units of the variates, so you normalize it. The normalized value is the correlation coefficient.

Correlation coefficient: a value that represents the correlation between 2 variates x and y

$$$0 \le |\rho_{xy}| \le 1 \Leftrightarrow -1 \le \rho_{xy} \le 1 $$$

$$$\rho_{xy} = \frac{C(x,y)}{S_xS_y} = \frac{C(x,y)}{\sqrt{S_x^2}\sqrt{S_y^2}}$$$

S: standard deviation

================================================================================
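The formulas in these notes can be checked with a short Python sketch. It is not from the lecture. The function names (`mean`, `sample_variance`, `sample_std`, `sample_covariance`, `correlation`) are made up for illustration, and it assumes the n-1 denominator used above.

```python
import math

def mean(xs):
    # Arithmetic mean: sum of the values divided by their count.
    return sum(xs) / len(xs)

def sample_variance(xs):
    # Sum of squared deviations from the mean, divided by n-1.
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    # Square root of the variance, back in the original unit.
    return math.sqrt(sample_variance(xs))

def sample_covariance(xs, ys):
    # Sum of products of paired deviations, divided by n-1.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    # Covariance normalized by both standard deviations.
    # Undefined (division by zero) if either sample has zero variance.
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]
x3 = [4, 4, 4, 4, 4, 4, 4]

for name, xs in (("x1", x1), ("x2", x2), ("x3", x3)):
    print(name, mean(xs), round(sample_variance(xs), 2), round(sample_std(xs), 2))

print("C(x1, x2) =", round(sample_covariance(x1, x2), 2))
print("rho(x1, x2) =", round(correlation(x1, x2), 3))
```

For the samples above this gives variances of about 0.67, 4.67 and 0, a covariance of about 0.67, and a correlation coefficient of about 0.378 between $$$x_1$$$ and $$$x_2$$$. The correlation with $$$x_3$$$ is undefined because its standard deviation is 0.

================================================================================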