Convolutional Neural Network model

In this post we look at the CNN model for the network described in the previous article, "Convolutional neural network". Here we discuss the network model itself, i.e. the "core" of our Python program that actually performs the image-recognition task.

The model itself.

Here is the code (the imports shown are an assumption for standalone Keras; the full program is in the previous article):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (5, 5), input_shape=(X_train.shape[1], X_train.shape[2], 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(number_of_classes, activation='softmax'))


Let's break the code down line by line and describe each line separately.

——————-

The line below initializes the model with type "Sequential". The Sequential model is a linear stack of layers: several network layers come one after another, each performing its own operation and passing its output to the next layer, so the output of the previous layer is the input of the next.

model = Sequential()

——————-

The line below is the first layer in our model, a "convolution" layer. It multiplies the input image by a set of "filters"; the number of filters and the filter dimensions are specified in this line of code: there are 32 different filters, each 5×5 pixels in size, and the activation function at the layer output is ReLU.
A convolution layer is not easy to explain in text; it is much easier to explain with video, so we recommend the following nice videos:

https://www.youtube.com/watch?v=IeLrFNc5HqY
https://www.youtube.com/watch?v=BcEapJEKz3M
https://www.youtube.com/watch?v=7Wq-QmMT4gM

model.add(Conv2D(32, (5, 5), input_shape=(X_train.shape[1], X_train.shape[2], 1), activation='relu'))
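To make the "multiplication by filters" concrete, here is a minimal NumPy sketch of what one Conv2D filter does: slide a small kernel over the image, take the dot product at each position ("valid" padding, as Keras uses by default), and apply ReLU. The image and the vertical-edge kernel below are illustrative values, not data from our network.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image ("valid" padding) and return
    the map of dot products, with ReLU applied, as one Conv2D filter does."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return np.maximum(out, 0)  # ReLU: negative responses become zero

# A toy 5x5 "image": bright on the left, dark on the right
image = np.zeros((5, 5))
image[:, :2] = 1.0

# A 3x3 vertical-edge filter: responds where brightness drops left-to-right
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
# [[3. 3. 0.]
#  [3. 3. 0.]
#  [3. 3. 0.]]  -- strong response along the edge, zero elsewhere
```

Note that the 5×5 image shrank to a 3×3 feature map; with 32 filters, Conv2D produces 32 such maps, one per filter.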

——————-

The line below is for pooling. Pooling is a technique for decreasing the size of the images (data matrices) to make computation faster while still keeping the key characteristics of the image. MaxPooling finds the single maximum value in each 2×2 block of pixels and keeps that one value as the representation of those 4 pixels, halving each spatial dimension and thus reducing the image (data matrix) to a quarter of its size.

A very good video explaining max pooling is here:
https://www.youtube.com/watch?v=ZjM_XQa5s6s


The line below performs MaxPooling, i.e. it reduces the image to a quarter of its size by taking the single maximum value from each 2×2 pixel box:

model.add(MaxPooling2D(pool_size=(2, 2)))
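The operation is simple enough to sketch directly in NumPy; the 4×4 matrix below is just illustrative data:

```python
import numpy as np

def max_pool_2x2(x):
    """Keep the largest value in each non-overlapping 2x2 block,
    halving both spatial dimensions (4x fewer values overall)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

x = np.array([[1, 3, 2, 4],
              [5, 7, 6, 8],
              [9, 2, 1, 0],
              [3, 4, 5, 6]])
print(max_pool_2x2(x))
# [[7 8]
#  [9 6]]  -- one maximum kept per 2x2 block
```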

——————-

The next line is the second convolutional layer. It does exactly the same as the previous convolutional layer, but instead of 5×5 filters it uses smaller 3×3 filters.

model.add(Conv2D(32, (3, 3), activation='relu'))

——————-

The next line is another MaxPooling layer, following the convolutional layer and used for dimensionality reduction: it shrinks all the previous "images" to a quarter of their size while still keeping the key characteristics.

model.add(MaxPooling2D(pool_size=(2, 2)))

——————-

The line below adds "Dropout", which we discussed previously in the "Overfitting and Dropout" article. In short, during training it randomly drops 20% of the layer's inputs on each pass through the layer. This prevents the network from being tailored too precisely to the exact data used in the training phase, which would degrade its performance on new data not seen during training.

model.add(Dropout(0.2))
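A minimal NumPy sketch of the idea (this mirrors the "inverted dropout" scheme Keras uses, where survivors are scaled up so the expected output stays the same; the fixed random seed is just for reproducibility of the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, rate=0.2, training=True):
    """During training, randomly zero `rate` of the inputs and scale
    the survivors by 1/(1-rate). At inference time it is a no-op."""
    if not training:
        return x
    mask = rng.random(x.shape) >= rate     # keep ~80% of the units
    return x * mask / (1.0 - rate)

x = np.ones(10)
y = dropout(x, rate=0.2)
print(y)  # most entries 1.25, a few zeroed at random
```

At prediction time (`training=False`) the data passes through unchanged, so dropout only affects the training phase.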

——————-

The line below converts 2-D data into flat data. Why? The last stage of a convolutional neural network (CNN) is a classifier, called a dense layer, which is just an artificial neural network (ANN) classifier. Like any other classifier, an ANN classifier needs individual features, i.e. a feature vector. Therefore the output of the convolutional part of the CNN must be converted into a 1-D feature vector to be used by the ANN part. This operation is called flattening.

Here is the code:

model.add(Flatten())
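To see what Flatten produces, we can trace the shapes through the model. Assuming a 28×28 MNIST-style input (an assumption; the previous article defines the actual data): 28×28 → 5×5 convolution → 24×24 → pooling → 12×12 → 3×3 convolution → 10×10 → pooling → 5×5, with 32 filters at the end. Flattening that stack is just a reshape:

```python
import numpy as np

# Stand-in for the output of the second pooling layer:
# 32 feature maps of 5x5 each (shapes assume a 28x28 input)
feature_maps = np.zeros((5, 5, 32))

flat = feature_maps.reshape(-1)   # what Flatten() does
print(flat.shape)  # (800,) -- the feature vector fed to Dense(128)
```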

——————-

The line below is the next layer, a "normal" fully-connected "dense" layer with 128 neurons. Each neuron is connected to all neurons of the previous layer, and the layer uses the ReLU activation function.

Here is the code:

model.add(Dense(128, activation='relu'))

——————-

The last line of our code creates the output layer. In this model we are dealing with 10 categories (10 digits, from 0 to 9), so we use the "softmax" activation function, which is typically used for categorical data (for binary 0/1 classification, activation='sigmoid' is usually used instead). This line creates another "normal" fully-connected "dense" layer in which each neuron is connected to all neurons of the previous layer; the output uses "softmax" activation, and the number of classes equals 10.

Here is the code:

model.add(Dense(number_of_classes, activation='softmax'))
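Softmax itself is a small formula: exponentiate each raw score and normalize so the 10 outputs sum to 1, giving a probability per digit. A sketch with made-up scores:

```python
import numpy as np

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical raw outputs of the 10-neuron layer for one image
scores = np.array([1.0, 2.0, 0.5, 0.1, 3.0, 0.2, 0.3, 0.4, 0.1, 0.0])
probs = softmax(scores)

print(probs.argmax())         # 4 -- the predicted digit
print(round(probs.sum(), 6))  # 1.0 -- outputs form a probability distribution
```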

——————-

That's it: we have discussed the neural network model used for image classification. The full code of the convolutional neural network can be found in the previous article.
