011-lec-001. Basics of CNN
@
Suppose you have an image (32*32*3)
width*height*depth (depth = RGB color channels)
One element represents one value
Suppose you have a filter (5*5*3)
The filter creates one number from the 5*5*3 region of the image it covers
Color information will be processed in final step
@
Imagine you have 5 input values $$$x_{1}, x_{2}, x_{3}, x_{4}, x_{5}$$$
And you use the Wx+b formula to create 'one number'
$$$\hat{y}=w_{1}x_{1}+w_{2}x_{2}+w_{3}x_{3}+w_{4}x_{4}+w_{5}x_{5}+b$$$
Here, in a CNN, all the weights in the formula above are elements of the filter
@
You can also use the ReLU function as the activation function
ReLU(Wx+b)
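The 'one number' step above can be sketched in NumPy as follows. This is a minimal sketch, not a definitive implementation: the names `patch`, `filt`, `b`, `relu`, and `filter_response` are hypothetical, and the random values only stand in for real pixels and learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, arbitrary choice
patch = rng.standard_normal((5, 5, 3))   # one 5*5*3 region of the image (x)
filt = rng.standard_normal((5, 5, 3))    # filter weights (W), same shape as the region
b = 0.1                                  # bias, arbitrary value for illustration

def relu(z):
    # ReLU keeps positive values and replaces negative values with 0
    return np.maximum(z, 0.0)

def filter_response(patch, filt, b):
    # Wx+b: multiply element by element, sum everything, add the bias,
    # then apply ReLU. The result is a single number.
    return relu(np.sum(patch * filt) + b)

one_number = filter_response(patch, filt, b)
```

Note that the sum runs over all 75 elements (5*5*3), so the three color channels are combined into the same single number.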
@
You slide the filter across all the other positions of the image
@
Then, think about how many numbers you get from sliding the filter
Figuring out this count is important
when you define the shape of the weights
and when you build the model
@
Suppose 7*7 image, 3*3 filter, stride 1
Shape of output is 5*5
Suppose 7*7 image, 3*3 filter, stride 2
Shape of output is 3*3
You can generalize the steps above
N: image size (7)
F: filter size (3)
stride: step size of the slide (2)
1: fixed constant
output size = ((N-F)/stride)+1
((7-3)/2)+1=3, therefore 3*3
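The output-size arithmetic above can be written as a small helper. This is a sketch under assumptions: the function name `output_size` and its parameter names are hypothetical, and raising an error when the filter does not fit evenly is a choice made here for illustration (libraries often round down instead).

```python
def output_size(n, f, stride=1):
    """Output width (or height) for an n*n image and an f*f filter, no padding."""
    span = n - f
    if span < 0 or span % stride != 0:
        # Assumption for this sketch: reject combinations that do not fit evenly
        raise ValueError("filter does not fit the image evenly with this stride")
    return span // stride + 1

size_stride_1 = output_size(7, 3, stride=1)   # ((7-3)/1)+1
size_stride_2 = output_size(7, 3, stride=2)   # ((7-3)/2)+1
```

With the values from the notes, `size_stride_1` is 5 and `size_stride_2` is 3.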
@
As you keep sliding filters over the image,
the output becomes smaller and smaller,
which means you lose image data
To resolve this, you can use 'padding' in a CNN
Padding means placing zeros around the outer border of the image
There are 2 benefits of padding
1. Padding prevents the image from shrinking rapidly
2. Padding lets you know where the outer border of the image is
@
7*7 image, 3*3 filter, stride 1, 1 pixel of padding
With padding, the 7*7 image becomes a 9*9 image
((9-3)/1)+1=7, therefore 7*7
This means your raw image is 7*7,
and the processed output is also 7*7
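The padding example above can be checked with `np.pad`. This is a minimal sketch: the variable names are hypothetical and the pixel values are placeholders.

```python
import numpy as np

image = np.arange(49, dtype=float).reshape(7, 7)   # placeholder 7*7 image

# Put 1 pixel of zeros around the outer border: 7*7 becomes 9*9
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

filter_size = 3
stride = 1
# Same formula as before, applied to the padded size
out_size = (padded.shape[0] - filter_size) // stride + 1
```

Here `padded.shape` is `(9, 9)` and `out_size` is 7, so the output keeps the size of the raw image.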
@
Sliding filter1 creates output1
Sliding filter2 creates output2
...
Sliding filter6 creates output6
If you stack all the outputs above,
for the 32*32*3 image and 5*5*3 filters (stride 1, no padding),
shape of stacked output will be (28,28,6)
28,28: using the formula ((32-5)/1)+1=28
6: number of filters, number of outputs
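The (28,28,6) shape can be reproduced with a naive sliding loop. This is a sketch, assuming the 32*32*3 image and 5*5*3 filter from the start of these notes, stride 1, and no padding. The function `conv_single` and the random values are hypothetical, and real libraries use much faster implementations.

```python
import numpy as np

def conv_single(image, filt, stride=1):
    """Slide one filter over the image (no padding) and return a 2D output."""
    h, w, _ = image.shape
    f = filt.shape[0]
    out_h = (h - f) // stride + 1
    out_w = (w - f) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            patch = image[r:r + f, c:c + f, :]   # region covered by the filter
            out[i, j] = np.sum(patch * filt)     # one number per position
    return out

rng = np.random.default_rng(0)                   # fixed seed, arbitrary choice
image = rng.standard_normal((32, 32, 3))         # placeholder image
filters = [rng.standard_normal((5, 5, 3)) for _ in range(6)]

# filter1 -> output1, ..., filter6 -> output6, then stack along a new last axis
outputs = [conv_single(image, filt) for filt in filters]
stacked = np.stack(outputs, axis=-1)
```

Each element of `outputs` has shape `(28, 28)`, and `stacked.shape` is `(28, 28, 6)`. The outputs are stacked, not summed, so each filter keeps its own channel.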