MNIST Handwritten Digits Recognition using Python | Image Classification | Deep Learning Tutorial
MNIST Handwritten Digits Recognition is an image classification project in which we analyze images of handwritten digits and recognize which digit each image shows. It is a good starter deep learning project for beginners.

In the previous projects we used the Jupyter Notebook interface to write our code and process the datasets and the corresponding models. From this point forward, all our deep learning and machine learning projects will use the Google Colab interface.
You can watch the step-by-step video tutorial below
To know more and get started with Google Colab, click here
Dataset Information
This dataset lets you study, analyze, and recognize elements in images; this is essentially how your camera detects a face, using image recognition. It is a digit recognition problem. The dataset has 49,000 images of size 28 x 28, totaling about 49 MB.
Download the dataset here
Import Modules
First, we have to import all the basic modules we will be needing for this project.
!pip install tensorflow-gpu keras
Installs the tensorflow-gpu backend module and the Keras API. (Recent Colab runtimes ship TensorFlow with GPU support preinstalled, so this cell may be unnecessary.)
import pandas as pd
import numpy as np
from tqdm.notebook import tqdm
from keras.preprocessing.image import img_to_array, load_img
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline
import warnings
warnings.filterwarnings('ignore')
pandas - used to perform data manipulation and analysis
numpy - used to perform a wide variety of mathematical operations on arrays
tqdm.notebook - progress bar decorator for iterators; includes a default range iterator printing to stderr
keras.preprocessing.image - helpers for loading an image and converting it to a numpy array
tensorflow – backend module for the use of Keras
matplotlib - used for data visualization and graphical plotting
warnings - to manipulate warnings details
%matplotlib inline - enables inline plotting in the notebook
filterwarnings('ignore') - ignores the warnings thrown by the modules (gives clean output)
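Since we installed the GPU build, it is worth confirming that TensorFlow actually sees the GPU. A minimal sanity check (the exact version string and device list depend on your Colab runtime):
import tensorflow as tf
# Print the installed TensorFlow version
print(tf.__version__)
# List the GPU devices visible to TensorFlow; an empty list means the
# runtime is CPU-only (in Colab: Runtime > Change runtime type > GPU)
print(tf.config.list_physical_devices('GPU'))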
Unzip the train data
We need to extract the training data from the zip archive using unzip
!unzip Train_UQcUa52.zip
Load the data
df = pd.read_csv('train.csv')
df.head()

pd.read_csv() - reads the csv file so we can view the data
filename is the name of the image file
label is the digit shown in the image
The image data must be converted into an array for processing
!pwd
/content
Current directory path
image_path = 'Images/train/'
Storing the path of the image data
Now we load the entire dataset into an array for processing
X = np.array([img_to_array(load_img(image_path + df['filename'][i],
                                    target_size=(28,28,1), grayscale=True))
              for i in tqdm(range(df.shape[0]))]).astype('float32')
X - the array where the loaded images are stored
target_size=(28,28,1) - the images are grayscale, so the channel dimension is one (1), with a 28 x 28 image size
grayscale=True - tells the function that the image is grayscale
range(df.shape[0]) - iterates over the full length of the training dataset
tqdm() - displays overall progress with a loading bar as each image is converted into an array
astype('float32') - converts the numpy array to the float32 data type
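Note that grayscale=True is deprecated in recent Keras releases. If the call above raises an error on your runtime, an equivalent version (a sketch assuming Keras 2.9+, where these helpers live in keras.utils) uses color_mode instead:
# Equivalent loading loop for newer Keras versions
from keras.utils import img_to_array, load_img
X = np.array([img_to_array(load_img(image_path + df['filename'][i],
                                    target_size=(28,28),
                                    color_mode='grayscale'))
              for i in tqdm(range(df.shape[0]))]).astype('float32')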
y = df['label']
Loading label data to the variable y
print(X.shape, y.shape)
(49000, 28, 28, 1) (49000,)
The input data has 49,000 samples of size 28 x 28 with one (1) channel, i.e. grayscale
The output attribute has 49,000 samples
Exploratory Data Analysis
In Exploratory Data Analysis (EDA), we will visualize the data with different kinds of plots for inference. It helps to find patterns or relations within the data
image_index = 0
print(y[image_index])
plt.imshow(X[image_index].reshape(28,28), cmap='Greys')
4

y[image_index] returns the label for that sample
X[image_index].reshape(28,28) reshapes the input sample to 28 x 28, and cmap='Greys' renders it in grayscale
The image is displayed as a grayscale plot
image_index = 10
print(y[image_index])
plt.imshow(X[image_index].reshape(28,28), cmap='Greys')
2

image_index = 100
print(y[image_index])
plt.imshow(X[image_index].reshape(28,28), cmap='Greys')
7

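Instead of changing image_index by hand, a small helper can show several samples at once. A minimal sketch using the same X and y arrays (the 2 x 5 grid size is an arbitrary choice):
# Plot the first 10 training images with their labels
fig, axes = plt.subplots(2, 5, figsize=(10, 4))
for i, ax in enumerate(axes.flat):
    ax.imshow(X[i].reshape(28, 28), cmap='Greys')
    ax.set_title(y[i])
    ax.axis('off')
plt.show()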
Train-Test Split
We must split the dataset for training and testing
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42, stratify=np.array(y))
stratify=np.array(y) - ensures the split preserves the class proportions of the labels in both the train and test sets; without it the random split could leave some digits under-represented in the test data. A quick check of the proportions is sketched below.
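To confirm that stratification worked, you can compare the label proportions in the two splits; they should be nearly identical (a quick check, not part of the original tutorial):
# Class proportions in the train and test splits
print(pd.Series(y_train).value_counts(normalize=True).sort_index())
print(pd.Series(y_test).value_counts(normalize=True).sort_index())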
Normalization
Pre-processing step to normalize all the values in the data between zero (0) and one (1) to obtain better results
x_train[0]
x_train /= 255
x_test /= 255
x_train[0]
x_train[0] - displays the values of the first sample
Before normalization, all pixel values lie between 0 and 255
x_train /= 255, x_test /= 255 - normalizes the values in the dataset.
Run this cell only once; running it more than once will divide the data by 255 again and corrupt the values
x_train[0] again views the data after normalization
Now all the values should lie between 0 and 1
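If you are worried about running the normalization cell twice by accident, a guarded version (a defensive sketch, equivalent in effect) only divides while the data is still in the 0-255 range:
# Only normalize if the values are still in the original 0-255 range
if x_train.max() > 1.0:
    x_train /= 255
    x_test /= 255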
Model Creation
input_shape = (28,28,1)
output_class = 10
input_shape - the shape of each input image: 28 x 28 pixels with one grayscale channel
output_class - the number of output classes (the digits 0-9)
Let us create the model for training
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D
# define the model
model = Sequential()
model.add(Conv2D(28, kernel_size=(3,3), input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(128, activation=tf.nn.relu))
model.add(Dropout(0.3))
model.add(Dense(output_class, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
Basic CNN model used for this project
Dense - fully connected layer
Conv2D - 2D convolution layer for image inputs
Dropout - adds regularization by randomly dropping a fraction of the units during training, which helps avoid overfitting
Flatten - converts the 2D feature maps into a single 1D array
MaxPooling2D - passes the maximum pixel value in each pooling window to the next layer
kernel_size=(3,3) - the filter size
model.compile() - configures the model for training
optimizer='adam' - adaptively adjusts the learning rate for the model over the epochs
loss='sparse_categorical_crossentropy' - loss function for integer-encoded category outputs
The activation function defines each layer's output; softmax gives a probability distribution over the 10 classes, while sigmoid would be used for binary classification
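Before training, model.summary() is a useful sanity check on the architecture; it prints each layer's output shape and parameter count (the exact formatting depends on the Keras version):
# Inspect the layer output shapes and trainable parameter counts
model.summary()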
Now we train the model on the training split, validating on the test split
# train the model
model.fit(x=x_train, y=y_train, batch_size=32, epochs=30, validation_data=(x_test, y_test))
The results are displayed after each training epoch
batch_size=32 - the number of images to process per iteration
epochs=30 - the number of passes over the training data
The highest validation accuracy is 98.38%
The highest training accuracy is 99.59%
Training for too long will overfit the data, so adjust the number of epochs reasonably (see the early-stopping sketch below)
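One way to avoid hand-tuning the epoch count is Keras's EarlyStopping callback, which halts training once the validation loss stops improving. A sketch, not part of the original tutorial (patience=3 is an arbitrary choice):
from keras.callbacks import EarlyStopping
# Stop when val_loss has not improved for 3 consecutive epochs and
# restore the weights from the best epoch seen so far
early_stop = EarlyStopping(monitor='val_loss', patience=3,
                           restore_best_weights=True)
model.fit(x=x_train, y=y_train, batch_size=32, epochs=30,
          validation_data=(x_test, y_test), callbacks=[early_stop])
The log from the original training run follows.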
Epoch 1/30
1149/1149 [==============================] - 10s 3ms/step - loss: 0.4816 - accuracy: 0.8475 - val_loss: 0.1202 - val_accuracy: 0.9637
Epoch 2/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.1336 - accuracy: 0.9605 - val_loss: 0.0848 - val_accuracy: 0.9743
Epoch 3/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0863 - accuracy: 0.9732 - val_loss: 0.0807 - val_accuracy: 0.9742
Epoch 4/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0685 - accuracy: 0.9783 - val_loss: 0.0734 - val_accuracy: 0.9788
Epoch 5/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0543 - accuracy: 0.9825 - val_loss: 0.0690 - val_accuracy: 0.9809
Epoch 6/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0461 - accuracy: 0.9844 - val_loss: 0.0684 - val_accuracy: 0.9808
Epoch 7/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0360 - accuracy: 0.9873 - val_loss: 0.0743 - val_accuracy: 0.9798
Epoch 8/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0318 - accuracy: 0.9884 - val_loss: 0.0733 - val_accuracy: 0.9811
Epoch 9/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0319 - accuracy: 0.9891 - val_loss: 0.0658 - val_accuracy: 0.9838
Epoch 10/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0242 - accuracy: 0.9919 - val_loss: 0.0728 - val_accuracy: 0.9827
Epoch 11/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0218 - accuracy: 0.9926 - val_loss: 0.0815 - val_accuracy: 0.9818
Epoch 12/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0286 - accuracy: 0.9895 - val_loss: 0.0766 - val_accuracy: 0.9829
Epoch 13/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0199 - accuracy: 0.9928 - val_loss: 0.0762 - val_accuracy: 0.9820
Epoch 14/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0239 - accuracy: 0.9918 - val_loss: 0.0754 - val_accuracy: 0.9836
Epoch 15/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0160 - accuracy: 0.9938 - val_loss: 0.0865 - val_accuracy: 0.9820
Epoch 16/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0196 - accuracy: 0.9935 - val_loss: 0.0842 - val_accuracy: 0.9822
Epoch 17/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0152 - accuracy: 0.9951 - val_loss: 0.0825 - val_accuracy: 0.9828
Epoch 18/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0155 - accuracy: 0.9943 - val_loss: 0.0889 - val_accuracy: 0.9817
Epoch 19/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0207 - accuracy: 0.9930 - val_loss: 0.0886 - val_accuracy: 0.9822
Epoch 20/30
1149/1149 [==============================] - 4s 3ms/step - loss: 0.0122 - accuracy: 0.9955 - val_loss: 0.0958 - val_accuracy: 0.9822
Epoch 21/30
1149/1149 [==============================] - 4s 3m