How to choose the window size of CNN in deep learning

Question

In Convolutional Neural Network (CNN), a filter is select for weights sharing. For example, in the following pictures, a 3x3 window with the stride (distance between adjacent neurons) 1 is chosen.

So my question is: How to choose the window size? If I use 4x4 with the stride being 2, how much difference will it cause? Thanks a lot in advance

Dev · Answer 1 · Mar 30, 2022

There is no definitive solution to this question: filter size is one of the hyperparameters that must be tuned in most cases. There are, however, some important remarks that may be of assistance to you. Smaller filters with a greater number of them are frequently desired.
Four 5x5 filters, for example, have 100 parameters (ignoring bias), while ten 3x3 filters have 90. You may still capture the variety of features in the image with the larger of filters, but with fewer parameters.
Modern CNNs take this concept a step further by using 3x1 and 1x3 convolutional layers in succession. This further reduces the number of parameters, but has no effect on performance.
The stride chosen is similarly significant, but it has an impact on the tensor shape after convolution, and hence the entire network. The basic norm is to use stride=1 in regular convolutions and padding to preserve the spatial size, and stride=2 when downsampling the image.

To learn more, visit our deep learning course