I was looking at the Convolutional Neural Network notes from CS231n: Convolutional Neural Networks for Visual Recognition. In a Convolutional Neural Network, the neurons are arranged in 3 dimensions (height, width, depth). I am having trouble with the depth of the CNN. I can't visualize what it is.

In the link they say: "The CONV layer's parameters consist of a set of learnable filters. Every filter is small spatially (along width and height), but extends through the full depth of the input volume."

For example, look at this picture. Sorry if the image is too crappy. I can grasp the idea that we take a small area of the image and then compare it with the "filters". So will the filters be a collection of small images? They also say: "We will connect each neuron to only a local region of the input volume. The spatial extent of this connectivity is a hyperparameter called the receptive field of the neuron." So does the receptive field have the same dimensions as the filters? Also, what will the depth be here? And what does the depth of a CNN signify?

So my main question is: if I take an image with dimensions [32*32*3] (let's say I have 50000 of these images, making the dataset [50000*32*32*3]), what should I choose as its depth, and what does the depth mean? Also, what will the dimensions of the filters be?

It would also be very helpful if anyone could provide a link that gives some intuition on this.

EDIT: In one part of the tutorial (the "Real-world example" part), it says: "The Krizhevsky et al. architecture that won the ImageNet challenge in 2012 accepted images of size [227x227x3]. On the first Convolutional Layer, it used neurons with receptive field size F=11, stride S=4 and no zero padding P=0. Since (227 - 11)/4 + 1 = 55, and since the Conv layer had a depth of K=96, the Conv layer output volume had size [55x55x96]."

Here we see the depth is 96. So is depth something that I choose arbitrarily, or something I compute? In the example above (Krizhevsky et al.) the layer had a depth of 96. What does a depth of 96 mean? The tutorial also states: "Every filter is small spatially (along width and height), but extends through the full depth of the input volume."

So does that mean the depth will look like this? If so, can I assume Depth = Number of Filters?

Mar 23, 2022

## 1 answer to this question.

In general, the depth of a deep neural network refers to how many layers it has. In this case, however, depth refers to the third dimension of a volume: the input image or the output of a CONV layer.

In your case, each image has dimensions 32x32x3 (width, height, depth). Here the depth corresponds to the channels of the training images (typically red, green, and blue for color images), so the input volume has a depth of 3. This is a property of your data, not something you choose, and the network learns from all of these channels.

The depth of a CONV layer's output is determined by the number of filters it uses, which is a hyperparameter you choose. The depth of each filter, in contrast, is fixed: it equals the depth of the input volume it is processing.
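To make the arithmetic concrete, here is a minimal sketch that applies the output-size formula quoted in the question, (W - F + 2P)/S + 1, to the Krizhevsky et al. numbers. The function name `conv_output_shape` and its parameter names are made up for illustration and do not come from any library; the weight count excludes biases.

```python
def conv_output_shape(in_w, in_h, in_d, f, k, stride=1, pad=0):
    """Return (filter_shape, output_shape, n_weights) for one CONV layer.

    in_w, in_h, in_d: width, height and depth of the input volume
    f: spatial size of each (square) filter
    k: number of filters, chosen by the designer
    """
    span_w = in_w - f + 2 * pad
    span_h = in_h - f + 2 * pad
    if span_w < 0 or span_h < 0:
        raise ValueError("filter is larger than the padded input")
    if span_w % stride or span_h % stride:
        raise ValueError("filter does not tile the input evenly")
    out_w = span_w // stride + 1
    out_h = span_h // stride + 1
    filter_shape = (f, f, in_d)        # filter depth is forced by the input
    output_shape = (out_w, out_h, k)   # output depth is the number of filters
    n_weights = f * f * in_d * k       # weights only, biases not counted
    return filter_shape, output_shape, n_weights


shapes = conv_output_shape(227, 227, 3, f=11, k=96, stride=4, pad=0)
print(shapes[0])  # (11, 11, 3)
print(shapes[1])  # (55, 55, 96)
print(shapes[2])  # 34848
```

Note that `in_d` only appears in the filter shape and `k` only appears in the output shape, which is the distinction between the two kinds of depth.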

As an example, suppose you're using a 227x227x3 image and a filter with a spatial size of 11x11. This 11x11 window is slid across the entire image, and each filter yields a single two-dimensional array (an activation map). To do so, the filter must cover every channel inside the 11x11 window, so the depth of the filter equals the depth of the image, making each filter 11x11x3. Now assume we have 96 such filters, each of which produces its own two-dimensional array. Stacking these 96 arrays gives the layer's output volume, and 96 is the Convolutional layer's depth. So yes, the depth simply refers to the number of filters used.
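The sliding-window description above can be sketched as a naive forward pass in plain Python. This is only an illustration under stated assumptions: `conv_forward` is a hypothetical helper, it uses no padding and no bias, and a small 7x7x3 input with four 3x3x3 filters stands in for the 227x227x3 image and 96 filters so that it runs quickly.

```python
import random


def conv_forward(image, filters, stride=1):
    """Naive CONV forward pass (no padding, no bias).

    image:   nested list indexed [row][col][channel]
    filters: list of K filters, each indexed [row][col][channel]
    returns: nested list indexed [row][col][filter_index]
    """
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    f = len(filters[0])
    # Every filter must extend through the full depth of the input.
    if any(len(w[0][0]) != depth for w in filters):
        raise ValueError("filter depth must equal input depth")
    out_h = (height - f) // stride + 1
    out_w = (width - f) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            cell = []
            for w in filters:  # one output channel per filter
                total = 0.0
                for di in range(f):
                    for dj in range(f):
                        for c in range(depth):  # sum over all input channels
                            pixel = image[i * stride + di][j * stride + dj][c]
                            total += pixel * w[di][dj][c]
                cell.append(total)
            row.append(cell)
        output.append(row)
    return output


random.seed(0)
image = [[[random.random() for _ in range(3)] for _ in range(7)] for _ in range(7)]
filters = [
    [[[random.random() for _ in range(3)] for _ in range(3)] for _ in range(3)]
    for _ in range(4)
]
out = conv_forward(image, filters, stride=2)
print(len(out), len(out[0]), len(out[0][0]))  # 3 3 4
```

Each filter collapses the three input channels into one number per position, so the output depth is 4 here only because four filters were supplied.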