http://www.kocw.net/home/search/kemView.do?kemId=1189957
================================================================================
estimated_PDF=probability_density_estimation_using_k_NNR(sample_data,bayes_classifier_func)
================================================================================
* Bayes classifier
$$$P(\omega_i|x) \\
= \frac{P(x|\omega_i) P(\omega_i)}{P(x)}$$$
* $$$P(x|\omega_i)$$$: likelihood
* $$$P(\omega_i)$$$: prior probability
$$$= \dfrac{\frac{k_i}{N_iV}\frac{N_i}{N}}{\frac{k}{NV}}$$$
* The likelihood $$$P(x|\omega_i)$$$ is a PDF
* You can estimate the likelihood $$$P(x|\omega_i)$$$ by using k-NNR
* Estimated via k-NNR, the likelihood is $$$P(x|\omega_i)=\frac{k_i}{N_iV}$$$
* The unconditional density $$$P(x)$$$ is estimated as $$$P(x)=\frac{k}{NV}$$$ via k-NNR
* The prior probability $$$P(\omega_i)$$$ is approximated as $$$P(\omega_i)=\frac{N_i}{N}$$$, the relative class frequency
$$$= \frac{k_i}{k}$$$
* $$$k_i$$$: number of samples from class $$$\omega_i$$$ inside the region (of volume $$$V$$$)
* $$$k$$$: total number of samples inside the region
* $$$N_i$$$, $$$N$$$: number of training samples in class $$$\omega_i$$$ and in total
* Meaning: by using relative frequencies (non-parametric density estimation),
you can predict the class $$$\omega_i$$$ of a given feature vector $$$x$$$ as the most frequent class among its $$$k$$$ nearest neighbors (see the sketch below)
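As a concrete illustration of the $$$P(\omega_i|x) \approx \frac{k_i}{k}$$$ rule above, here is a minimal sketch (Python is assumed; the function name knn_posterior and the toy data are illustrative, not from the lecture): it finds the k nearest training samples to a query point and returns the relative class frequencies.

```python
# Sketch of the k-NNR posterior estimate P(w_i|x) ~ k_i/k (illustrative, not from the lecture).
import numpy as np

def knn_posterior(x, samples, labels, k=5):
    """Estimate P(w_i|x) = k_i/k from the k nearest neighbours of x."""
    dists = np.linalg.norm(samples - x, axis=1)   # distance from x to every training sample
    nearest = labels[np.argsort(dists)[:k]]       # labels of the k nearest samples
    return {c: np.sum(nearest == c) / k for c in np.unique(labels)}

# toy usage: two 2-D Gaussian classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(knn_posterior(np.array([2.0, 2.0]), X, y, k=7))  # {class: k_i/k} for the query point
```

The predicted class is simply the one with the largest $$$\frac{k_i}{k}$$$ value.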
================================================================================
* Non-parametric density estimation
* Pros
- Easy to analyze
- Easy to implement
- Given an infinite number of samples, it converges to the true density (asymptotically optimal)
- Considers the samples surrounding each query point, so the estimate adapts to the local data
- Easy to parallelize
* Cons
- All training data must be kept in memory
- High computation time at query
- Vulnerable to the curse of dimensionality
Curse of dimensionality: in high dimensions most of the space is empty, so the estimated PDF can be unreliable in those regions (see the sketch below)
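A small numerical sketch of that last point (Python assumed; the uniform toy data and sample sizes are illustrative): with a fixed number of samples, the radius needed to capture the k nearest neighbours, and hence the volume $$$V$$$, grows quickly with the dimension, so the local estimate $$$\frac{k}{NV}$$$ is supported by an ever larger and emptier region.

```python
# Illustrative sketch of the curse of dimensionality for k-NN density estimation:
# the radius needed to enclose the k nearest neighbours grows with the dimension d.
import numpy as np

rng = np.random.default_rng(0)
N, k = 1000, 10
for d in (1, 2, 5, 10, 20):
    X = rng.uniform(0, 1, (N, d))             # uniform samples in the unit hypercube
    dists = np.linalg.norm(X - 0.5, axis=1)   # distances from the cube centre
    print(f"d={d:2d}  radius to {k}-th neighbour: {np.sort(dists)[k - 1]:.3f}")
```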
================================================================================
* k: the number of samples that must fall inside the given volume (see the sketch after this list)
* $$$k=1$$$, 1-NNR
* $$$k>1$$$, k-NNR
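To make the role of k concrete, here is a minimal 1-D sketch of the k-NNR density estimate $$$P(x)=\frac{k}{NV}$$$ (Python assumed; knn_density_1d and the Gaussian toy data are illustrative): the volume $$$V$$$ is taken as the length of the smallest interval around $$$x$$$ that contains the k nearest samples.

```python
# Sketch of the k-NNR density estimate p(x) = k / (N * V) in 1-D (illustrative).
import numpy as np

def knn_density_1d(x, samples, k=10):
    dists = np.sort(np.abs(samples - x))   # distances from x to all N samples
    radius = dists[k - 1]                  # distance to the k-th nearest sample
    volume = 2 * radius                    # 1-D "volume" = interval length
    return k / (len(samples) * volume)

rng = np.random.default_rng(0)
data = rng.normal(0.0, 1.0, 1000)          # samples from N(0, 1)
print(knn_density_1d(0.0, data, k=10))     # roughly 0.4, the N(0,1) density at 0, up to noise
```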
================================================================================
* Large k
* Pros
- Smoother decision boundary
(By contrast, KDE gives a non-smooth decision boundary, which can be resolved by using a smooth kernel)
- Supplies probabilistic information
With large k, the estimate approaches the true PDF
* Cons
- Since many samples are pooled into one "volume", local information is averaged away
(the result is over-generalized; see the sketch below)
- More computation
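The trade-off can be seen numerically in the sketch below (Python assumed; the 1-D toy data, posterior_class1, and the choice k=1 vs. k=51 are illustrative): with a small k the estimated posterior $$$\frac{k_i}{k}$$$ jumps between 0 and 1 from point to point, while a larger k gives smoother, more probabilistic values at the cost of washing out local detail near the class boundary.

```python
# Illustrative sketch: how k changes the smoothness of the estimated posterior k_i/k.
import numpy as np

rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(0, 1, 200), rng.normal(2, 1, 200)])  # 1-D, two classes
y = np.array([0] * 200 + [1] * 200)

def posterior_class1(x, k):
    nearest = y[np.argsort(np.abs(X - x))[:k]]   # labels of the k nearest samples
    return np.mean(nearest == 1)                 # k_1 / k

for x in np.linspace(0.0, 2.0, 5):
    print(f"x={x:.1f}  k=1: {posterior_class1(x, 1):.2f}  k=51: {posterior_class1(x, 51):.2f}")
```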