## Implementing Machine Learning Classification Algorithms to Recognize Handwritten Digits

Handwritten digit recognition is a classic machine learning problem in which we identify handwritten digits using various classification algorithms. There are a number of algorithms that can recognize handwritten digits, including Deep Learning/CNNs, SVM, Gaussian Naive Bayes, KNN, Decision Trees, Random Forests, etc.

In this article, we will apply a variety of machine learning algorithms from the Sklearn library to our dataset to classify the digits into their categories.

Let us first look at the dataset:

We will use Sklearn’s **load_digits** dataset, which is a collection of **8×8** images (**64 features**) of digits. The dataset contains a total of **1797 sample points**.

Let us import the dataset as **digits**:

```python
from sklearn.datasets import load_digits

digits = load_digits()
```

The **DESCR** attribute provides a description of the dataset. The dataset contains images of hand-written digits: 10 classes, where each class refers to one digit (0, 1, 2, 3, 4, 5, 6, 7, 8, 9).
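The description and array shapes can also be inspected directly; a quick sanity check, assuming the dataset has been loaded as above:

```python
from sklearn.datasets import load_digits

digits = load_digits()
print(digits.DESCR[:120])      # beginning of the dataset description
print(digits.images.shape)     # (1797, 8, 8): 1797 images of 8x8 pixels
print(digits.target.shape)     # (1797,): one label per image
```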

Let us visualize the first image of the handwritten digits stored in **images**, and plot it using **matplotlib**.

```python
import matplotlib.pyplot as plt

plt.gray()
plt.matshow(digits.images[0])
plt.show()
```

The output will be as follows; it shows a handwritten *‘0’*:

And it can be verified by opening up the **images** NumPy array in the variable explorer and resizing it so it is easier to visualize:

Our dataset also contains a NumPy array ‘**target**’ which stores the labels of all the **images**. The ‘**target_names**’ NumPy array stores the label values (the actual numbers, which are the same in this case).

If our **images[35]** stores the following 8×8 image (which is the number 5):

The target array will store its label in the 35th index place — **target[35]**:

And the **target_names** array holds the names of the labels/classes, which, in the case of digit recognition, are the digits themselves!
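As a quick check (again assuming the dataset loaded earlier), the first few labels and the class names can be printed directly:

```python
from sklearn.datasets import load_digits

digits = load_digits()
print(digits.target[:10])     # labels of the first ten images
print(digits.target_names)    # the ten classes: [0 1 2 3 4 5 6 7 8 9]
```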

Once our dataset is loaded, we will start coding by importing **train_test_split** and the performance **metrics**.

```python
from sklearn import metrics
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
```

Next, we will prepare the data for training by declaring a NumPy array **data** that reshapes **images** so that its first dimension equals the number of samples, **n_samples** (the length of **images**), and each 8×8 image is flattened into a 64-element row. So the dimension of **data** will be **1797 × 64**.

```python
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))
```

The next step is to use the **train_test_split** function to split our data into 50% training and 50% testing data.

```python
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)
```

Our variable explorer will show the newly declared arrays:

Once we’re done with the above steps, we will use different algorithms as classifiers, make predictions, and print the ‘Classification Report’, the ‘Confusion Matrix’, and the ‘Accuracy Score’.

The Classification Report gives us the precision, recall, F1-score, support, and accuracy, whereas the Confusion Matrix shows, for each class, how many test samples were classified correctly and how the misclassified ones were distributed across the other classes.
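On a small hypothetical example (made-up labels, not the digits data), these three outputs look like this:

```python
from sklearn import metrics

# Made-up ground truth and predictions for three classes
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

print(metrics.classification_report(y_true, y_pred))
print(metrics.confusion_matrix(y_true, y_pred))
print(metrics.accuracy_score(y_true, y_pred))   # 5 of 6 correct, about 0.8333
```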

We will use the following classifiers from Sklearn:

- Support Vector Machine
- Gaussian Naive Bayes
- Decision Trees
- Random Forest
- K Nearest Neighbours
- Stochastic Gradient Descent

## 1. Support Vector Machines (SVM)

Here is the code to define the SVM classifier **svm_classifier** and use it to make predictions:

```python
from sklearn import svm

svm_classifier = svm.SVC(gamma=0.001)
svm_classifier.fit(X_train, y_train)
predicted = svm_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (svm_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(svm_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", svm_classifier.score(X_test, y_test))
plt.show()
```

The SVM Classifier gives an Accuracy of **0.9688**!

The Classification Report it generates is as follows:

The Confusion Matrix is as follows:

## 2. Gaussian Naive Bayes

Next, we will define the Gaussian Naive Bayes as the classifier — **GNB_classifier**:

```python
from sklearn.naive_bayes import GaussianNB

GNB_classifier = GaussianNB()
GNB_classifier.fit(X_train, y_train)
predicted = GNB_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (GNB_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(GNB_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", GNB_classifier.score(X_test, y_test))
plt.show()
```

The Classification Report for the Naive Bayes Classifier and the Confusion Matrix are as follows:

The accuracy of the Gaussian Naive Bayes Classifier was found to be **0.8075**. The Gaussian Naive Bayes Classifier is less accurate than the SVM Classifier (0.9688) when it comes to recognizing handwritten digits.

## 3. Decision Trees

Following is the code for defining the decision tree classifier **dt_classifier** and using it for making predictions:

```python
from sklearn import tree

dt_classifier = tree.DecisionTreeClassifier()
dt_classifier.fit(X_train, y_train)
predicted = dt_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (dt_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(dt_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", dt_classifier.score(X_test, y_test))
plt.show()
```

The Decision Tree Classifier had an even lower accuracy score of **0.7352**.

The Classification Report and Confusion Matrix of the Decision Tree Classifier are as follows:

## 4. Random Forest

Let us try using the Random Forest Classifier from sklearn. The classifier is defined as **RF_classifier** in the code below:

```python
from sklearn.ensemble import RandomForestClassifier

RF_classifier = RandomForestClassifier(max_depth=2, random_state=0)
RF_classifier.fit(X_train, y_train)
predicted = RF_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (RF_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(RF_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", RF_classifier.score(X_test, y_test))
plt.show()
```

The accuracy score of the Random Forest Classifier is **0.7530**, close to that of the Decision Tree Classifier (the shallow **max_depth=2** trees limit what each member of the forest can learn).

The Classification Report and Confusion Matrix are as follows:

So far, the SVM classifier gives the best accuracy score. But as we know, accuracy is not the only metric for evaluating how well an algorithm works.
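For example, on a hypothetical, heavily imbalanced set of labels (toy data, not the digits set), a classifier that always predicts the majority class still gets high accuracy while completely missing the minority class:

```python
from sklearn.metrics import accuracy_score, recall_score

# 9 negatives and 1 positive; the "model" always predicts 0
y_true = [0] * 9 + [1]
y_pred = [0] * 10

print(accuracy_score(y_true, y_pred))   # 0.9
print(recall_score(y_true, y_pred))     # 0.0: the positive class is never found
```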

Let us try a few more classifiers from sklearn.

## 5. K Nearest Neighbours (KNN)

Our next classifier uses the KNN classification algorithm and is defined as **KNN_classifier**:

```python
from sklearn.neighbors import KNeighborsClassifier

KNN_classifier = KNeighborsClassifier(n_neighbors=5, metric='euclidean')
KNN_classifier.fit(X_train, y_train)
predicted = KNN_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (KNN_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(KNN_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", KNN_classifier.score(X_test, y_test))
plt.show()
```

So the accuracy comes out to be **0.9555**! Good thing we didn’t stop at Random Forest!

The Classification Report and Confusion Matrix for the KNN-based Classifier are as follows:

## 6. Stochastic Gradient Descent

Let us try Stochastic Gradient Descent as our classifier **sgd_classifier**:

```python
from sklearn.linear_model import SGDClassifier

sgd_classifier = SGDClassifier(loss="hinge", penalty="l2", max_iter=5)
sgd_classifier.fit(X_train, y_train)
predicted = sgd_classifier.predict(X_test)

# Show the first four training images with their labels
_, axes = plt.subplots(2, 4)
images_and_labels = list(zip(digits.images, digits.target))
for ax, (image, label) in zip(axes[0, :], images_and_labels[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Training: %i' % label)

# Show the first four test images with their predicted labels
images_and_predictions = list(zip(digits.images[n_samples // 2:], predicted))
for ax, (image, prediction) in zip(axes[1, :], images_and_predictions[:4]):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation='nearest')
    ax.set_title('Prediction: %i' % prediction)

print("\nClassification report for classifier %s:\n%s\n"
      % (sgd_classifier, metrics.classification_report(y_test, predicted)))

disp = metrics.plot_confusion_matrix(sgd_classifier, X_test, y_test)
disp.figure_.suptitle("Confusion Matrix")
print("\nConfusion matrix:\n%s" % disp.confusion_matrix)

print("\nAccuracy of the Algorithm: ", sgd_classifier.score(X_test, y_test))
plt.show()
```

The Stochastic Gradient Descent Algorithm gives us an accuracy score of **0.8932**.

Below are the Classification Report and the Confusion Matrix:

In this article, we used a number of machine learning algorithms from the Sklearn library, including **Support Vector Machines**, **Gaussian Naive Bayes**, **Decision Trees**, **Random Forests**, **K Nearest Neighbours**, and **Stochastic Gradient Descent**. These are some of the basic classification algorithms to get started with handwritten digit recognition.

In terms of accuracy score, the SVM classifier was the most accurate, whereas the Decision Tree Classifier was the least!

Now let us come to the F1-score. Again, the SVM has the highest F1-scores:
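The macro-averaged F1-scores can also be recomputed directly. Here is a minimal sketch that re-fits two of the models on the same 50/50 split used above (the exact numbers may vary slightly across sklearn versions):

```python
from sklearn import metrics, svm
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

digits = load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

# Fit each classifier and record its macro-averaged F1 on the test half
scores = {}
for name, clf in [("SVM", svm.SVC(gamma=0.001)),
                  ("Decision Tree", DecisionTreeClassifier())]:
    clf.fit(X_train, y_train)
    scores[name] = metrics.f1_score(y_test, clf.predict(X_test), average="macro")
    print("%s: macro F1 = %.4f" % (name, scores[name]))
```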

Hence, we conclude that both in terms of accuracy score and F1-score, the **SVM classifier** performed the best. That is why you will often see it used in image recognition problems as well!

Here is an article I wrote in which I used SVM (along with PCA) to build a facial recognition model.

Let me know what other algorithms you could have used for your classifier! I also tried the Ada Boost Classifier, but it gave me such a low accuracy (0.2558) that I didn’t bother including it in the article! 😅
