These are notes I wrote based on the following lecture:
http://www.kocw.net/home/search/kemView.do?kemId=1189957
- Population_Sample_Sampling_distribution_Mean_Variance_Covariance_Correlation_coefficient
================================================================================
- Population: the entire group of interest, such as all people in a country
- Sample: a subset drawn from the population, such as 100 people
- Sampling distribution: the distribution formed by a statistic computed over many samples
Sample 100 people -> statistical values like mean of height and weight
Sample 100 people -> statistical values like mean of height and weight
...
The mean height and weight can differ from sample to sample, but across many samples these values form a distribution.
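The repeated sampling described above can be simulated. This is only a sketch: the population of heights is made up (drawn from a normal distribution with mean 170 and standard deviation 10), and the helper name `sample_means` is hypothetical, not something from the lecture.

```python
import random
import statistics

def sample_means(population, sample_size, num_samples, seed=0):
    """Draw repeated samples and return the mean of each one."""
    rng = random.Random(seed)
    means = []
    for _ in range(num_samples):
        # Sample without replacement, e.g. 100 people from the population.
        sample = rng.sample(population, sample_size)
        means.append(statistics.mean(sample))
    return means

# Made-up population of 10,000 heights in cm (an assumption for illustration).
pop_rng = random.Random(42)
population = [pop_rng.gauss(170, 10) for _ in range(10_000)]

means = sample_means(population, sample_size=100, num_samples=500)

print(min(means), max(means))    # the mean differs from sample to sample
print(statistics.mean(means))    # center of the sampling distribution
print(statistics.stdev(means))   # spread of the sampling distribution
```

The list `means` is an empirical version of the sampling distribution of the mean: its center should sit near the population mean, and its spread should be much narrower than the spread of the population itself.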
================================================================================
Statistical parameters: values that summarize a characteristic of a population
- Mean: $$$\bar{x}=\sum\limits_{i=1}^{n} \frac{x_i}{n}$$$
Sample $$$x_1=\{3,4,5,4,3,4,5\}$$$
$$$\bar{x_1}=\frac{28}{7}=4$$$
Sample $$$x_2=\{1,2,3,4,5,6,7\}$$$
$$$\bar{x_2}=\frac{28}{7}=4$$$
Sample $$$x_3=\{4,4,4,4,4,4,4\}$$$
$$$\bar{x_3}=\frac{28}{7}=4$$$
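As a quick check of the three means above, here is a minimal sketch in Python. The function name `mean` is just an illustrative choice; the standard library's `statistics.mean` would do the same job.

```python
def mean(xs):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(xs) / len(xs)

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]
x3 = [4, 4, 4, 4, 4, 4, 4]

print(mean(x1), mean(x2), mean(x3))  # 4.0 4.0 4.0
```

All three samples share the same mean even though their values are spread very differently, which is why a second summary such as the variance is needed.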
- Variance: how widely the data is scattered, i.e. how far the data lies from the mean
$$$s^2=\sum\limits_{i=1}^{n} \frac{(x_i-\bar{x})^2}{(n-1)}$$$
$$$s_1^2=\frac{(1+0+1+0+1+0+1)}{6}=\frac{4}{6}\approx 0.67$$$
$$$s_2^2=\frac{(9+4+1+0+1+4+9)}{6}=\frac{28}{6}\approx 4.67$$$
$$$s_3^2=\frac{(0+0+0+0+0+0+0)}{6}=\frac{0}{6}=0$$$
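The sample variance (sum of squared deviations from the mean, divided by n-1) can be sketched as follows. The name `sample_variance` is hypothetical; `statistics.variance` from the standard library uses the same n-1 divisor and is used here only as a cross-check.

```python
import statistics

def sample_variance(xs):
    """Sample variance with the n-1 divisor."""
    n = len(xs)
    m = sum(xs) / n
    return sum((x - m) ** 2 for x in xs) / (n - 1)

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]
x3 = [4, 4, 4, 4, 4, 4, 4]

print(round(sample_variance(x1), 2),
      round(sample_variance(x2), 2),
      round(sample_variance(x3), 2))  # 0.67 4.67 0.0

# Cross-check against the standard library implementation.
print(statistics.variance(x2))
```

The samples have identical means but variances of about 0.67, 4.67 and 0, which matches how tightly each one clusters around 4.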
================================================================================
Standard deviation: because the deviations are squared, the variance is expressed in squared units rather than the unit of the data.
So you take the square root of the variance to restore the original unit.
$$$s=\sqrt{s^2}=\sqrt{\sum\limits_{i=1}^{n} \frac{(x_i-\bar{x})^2}{(n-1)}}$$$
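A minimal sketch of the standard deviation as the square root of the sample variance, applied to the three samples from the variance example. The name `sample_std` is an illustrative choice.

```python
import math

def sample_std(xs):
    """Sample standard deviation: square root of the n-1 variance."""
    n = len(xs)
    m = sum(xs) / n
    variance = sum((x - m) ** 2 for x in xs) / (n - 1)
    return math.sqrt(variance)

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]
x3 = [4, 4, 4, 4, 4, 4, 4]

print(round(sample_std(x1), 2),
      round(sample_std(x2), 2),
      round(sample_std(x3), 2))  # 0.82 2.16 0.0
```

Unlike the variance, these values are in the same unit as the data, so they can be compared directly with the mean.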
================================================================================
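Covariance: when you have two or more variables,
covariance shows how two of those variables change together (positive when they tend to move in the same direction, negative when they move in opposite directions).
- If the sample consists of bivariate data $$$(x_i,y_i)$$$, such as height and weight,
covariance is calculated as follows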
- Calculate mean values
$$$\bar{x}=\sum\limits_{i=1}^{n} \frac{x_i}{n}$$$
$$$\bar{y}=\sum\limits_{i=1}^{n} \frac{y_i}{n}$$$
- Calculate covariance
$$$C(x,y)=\frac{1}{n-1} \sum\limits_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y})$$$
================================================================================
$$$x_1=\{3,4,5,4,3,4,5\}$$$
$$$\bar{x_1}=4$$$
$$$x_2=\{1,2,3,4,5,6,7\}$$$
$$$\bar{x_2}=4$$$
$$$C(x_1,x_2)=\frac{(-1)(-3)+(0)(-2)+(1)(-1)+(0)(0)+(-1)(1)+(0)(2)+(1)(3)}{6}=\frac{4}{6}\approx 0.67$$$
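The covariance of the two samples can be checked with a short sketch that follows the two steps above: compute both means, then average the products of the deviations with an n-1 divisor. The name `sample_covariance` is hypothetical.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired data, using the n-1 divisor."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]

print(round(sample_covariance(x1, x2), 2))  # 0.67
```

Note that the covariance of a variable with itself is its variance, so `sample_covariance(x2, x2)` gives 28/6.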
================================================================================
Normalizing the covariance gives the correlation coefficient.
The correlation coefficient is a value that represents the strength of the linear relationship between two variables x and y
$$$0 \le |\rho_{xy}| \le 1 \Leftrightarrow -1 \le \rho_{xy} \le 1 $$$
$$$\rho_{xy} = \frac{C(x,y)}{S_xS_y} = \frac{C(x,y)}{\sqrt{S_x^2}\sqrt{S_y^2}}$$$
S: standard deviation
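To see the normalization in practice, here is a sketch that divides the sample covariance by the product of the two sample standard deviations, using the samples $$$x_1$$$ and $$$x_2$$$ from earlier. The function names are illustrative, and raising an error for a zero-variance sample is a design choice of this sketch, since the ratio is undefined in that case.

```python
import math

def sample_covariance(xs, ys):
    """Sample covariance of paired data, using the n-1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Correlation coefficient: covariance divided by both standard deviations."""
    s_x = math.sqrt(sample_covariance(xs, xs))  # std of x
    s_y = math.sqrt(sample_covariance(ys, ys))  # std of y
    if s_x == 0 or s_y == 0:
        # A constant sample has no spread, so the ratio is undefined.
        raise ValueError("correlation is undefined for a zero-variance sample")
    return sample_covariance(xs, ys) / (s_x * s_y)

x1 = [3, 4, 5, 4, 3, 4, 5]
x2 = [1, 2, 3, 4, 5, 6, 7]

print(round(correlation(x1, x2), 3))  # 0.378
```

The result always lies between -1 and 1 regardless of the units of x and y, which is what makes it easier to interpret than the raw covariance.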
================================================================================