This is a personal study note
Copyright and original reference:
https://www.youtube.com/watch?v=SYdqCHfMycM&list=PLsri7w6p16vu3mMWzijxOmhrlvN23W04_&index=3
================================================================================
How to "see the correlation" between "variables"
- Use a scatter plot
================================================================================
- Covariance:
- Is the "scattering of 2 random variables" in the "positive direction" or the "negative direction"?
- When "random variable X" changes, how does "random variable Y" change?
- "Random variable A" has its own "variance"
- "Random variable B" has its own "variance"
- A and B can also have "common variance" (they vary together), which is what covariance measures
================================================================================
How to calculate covariance
- $$$Cov(X,Y) = \dfrac{\sum\limits_{i=1}^{N} (X_i-\bar{X}) (Y_i-\bar{Y}) }{N}\\$$$
- You can see the deviation of X from its mean $$$(X_i-\bar{X})$$$ and the deviation of Y from its mean $$$(Y_i-\bar{Y})\\$$$
- $$$Covariance = \dfrac{\text{sum[(each_X_data-mean_of_X)*(each_Y_data-mean_of_Y)]}}{\text{num_data_pairs}}\\$$$
- $$$Covariance = \dfrac{\text{sum[deviation_of_X_from_mean*deviation_of_Y_from_mean]}}{\text{num_data_pairs}}$$$
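The formula above can be sketched in Python as follows. This is only a minimal sketch and is not from the original lecture: the function name population_covariance and the sample values are hypothetical, chosen for illustration. It divides by N, as the formula above does (a sample covariance would divide by N-1 instead).

```python
def population_covariance(xs, ys):
    """Population covariance: average product of the deviations from the means."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of (X_i - mean_X) * (Y_i - mean_Y), divided by N
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical data (NOT the ad_price/profit data from the lecture example)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
cov = population_covariance(xs, ys)
print(cov)  # 4.0
```

Here the deviations of X are -2, -1, 0, 1, 2 and those of Y are -4, -2, 0, 2, 4, so the products sum to 20 and the covariance is 20 / 5 = 4.0.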
================================================================================
- Correlation coefficient
================================================================================
* Example
$$$\bar{x}_{\text{ad_price}} = \dfrac{13+8+\cdots+21+25}{15} = 16.467$$$
$$$\bar{x}_{\text{profit}} = \dfrac{94+70+\cdots+105+121}{15} = 98.933$$$
================================================================================
* Deviation values
================================================================================
$$$Cov = \dfrac{17.103+244.976+\cdots+27.502+188.298}{15} = \dfrac{703.471}{15} = 46.898$$$
================================================================================
What does 46.898 mean?
It's a "positive value", so there is a "positive correlation"
But covariance depends on the units of the data, so you can't see the strength of the correlation from it
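A small sketch of why the size of a covariance is hard to interpret: the same relationship, measured in different units, gives a different covariance. The function name population_covariance and the data are hypothetical, used only for illustration.

```python
def population_covariance(xs, ys):
    """Population covariance (divides by N)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical data: the same Y values expressed in two different units
xs = [1, 2, 3, 4, 5]
ys_units = [2, 4, 6, 8, 10]
ys_rescaled = [y * 100 for y in ys_units]  # e.g. a unit that is 100 times smaller

cov_a = population_covariance(xs, ys_units)
cov_b = population_covariance(xs, ys_rescaled)
print(cov_a, cov_b)  # 4.0 400.0
```

The relationship between X and Y is identical in both cases, yet the covariance is 100 times larger in the second one, so the raw number alone does not say how strong the relationship is.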
================================================================================
To overcome the above limitation of covariance, you can use the "correlation coefficient"
correlation_coefficient=normalize(covariance)
================================================================================
$$$Cov(X,Y) = \dfrac{\sum\limits_{i=1}^{N} (X_i-\bar{X}) (Y_i-\bar{Y}) }{N}\\$$$
$$$Corr(X,Y) = \dfrac{Cov(X,Y)}{ \sqrt{ \dfrac{\sum (X_i-\bar{X})^2}{N} \cdot \dfrac{\sum (Y_i-\bar{Y})^2}{N} } } $$$
$$$Corr(X,Y) = \dfrac{Cov(X,Y)}{ \sigma_X \cdot \sigma_Y } $$$
$$$Corr(X,Y) = \dfrac{\text{Cov of X and Y}}{ \text{(std_of_X)} \times \text{(std_of_Y)} } $$$
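The normalization above (covariance divided by the standard deviation of X times the standard deviation of Y) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the lecture's own code: the function name pearson_correlation and the data are hypothetical, and population (divide-by-N) formulas are used throughout to match the notes.

```python
import math


def pearson_correlation(xs, ys):
    """Covariance divided by the product of the population standard deviations."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / (std_x * std_y)


# Hypothetical data (NOT the ad_price/profit data from the lecture example)
xs = [1, 2, 3, 4, 5]
r_up = pearson_correlation(xs, [2, 4, 6, 8, 10])                # perfectly increasing line
r_down = pearson_correlation(xs, [10, 8, 6, 4, 2])              # perfectly decreasing line
r_scaled = pearson_correlation(xs, [200, 400, 600, 800, 1000])  # same line, other units
print(r_up, r_down, r_scaled)  # approximately 1.0, -1.0, 1.0
```

Unlike covariance, the result does not change when Y is rescaled, which is why the correlation coefficient can be read as the strength of the linear relationship.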
================================================================================
standard deviation of ad_price and profit
================================================================================
Calculate the correlation coefficient
-1 <= correlation coefficient <= 1