Handwriting 99 Multiplication on App Store

DOI: 10.4018/978-1-7998-1554-9.ch004


The Modified NIST (MNIST) database consists of 70,000 handwritten digit images, partitioned into 60,000 training patterns and 10,000 testing patterns, and serves as a standard benchmark for evaluating handwritten digit classification. Since LeCun proposed the LeNet CNN model, researchers have regarded this example as the “Hello, World” of deep learning. This chapter compares traditional approaches with the CNN model. The dataset used here for training and testing CNN models is expanded to the Extended MNIST (EMNIST) database, which is employed to pre-train a CNN model that recognizes handwritten digit images and is installed on an iOS device. The user of the presented App can write digits directly on the touchscreen, and the smartphone instantly recognizes what was written. The model pre-trained on the EMNIST database, with a test accuracy of 99.4%, has been integrated into an iOS App named Handwriting 99 Multiplication, which has been published on Apple's App Store.
Methods of Analyzing MNIST

MNIST is a database of handwritten digits (LeCun, Cortes, & Burges, n.d.) derived from Special Database 1 and Special Database 3 of NIST. The MNIST database is more suitable than the NIST database for training machine learning models, since it has already been pre-processed and digitized. Its handwritten digit patterns have been size-normalized and centered in 28x28-pixel images, and each pattern is a grayscale image. The database contains 60,000 training samples and 10,000 test samples. It is a basic database for researchers who are practicing machine learning, because they can spend less time on data pre-processing. In addition to the convolutional neural networks proposed by LeCun, researchers have applied many learning techniques and classification methods to the problem of handwritten digit recognition, such as radial basis function networks (LeCun, Bottou, Bengio, & Haffner, 1998), neural networks (Ciresan, Meier, Gambardella, & Schmidhuber, 2010) (Salakhutdinov & Hinton, 2007), convolutional neural networks (LeCun, Bottou, Bengio, & Haffner, 1998) (Cireşan, Meier, Masci, Gambardella, & Schmidhuber, 2011), support vector machines (Decoste & Schölkopf, 2002), and k-nearest neighbors (Keysers, Deselaers, Gollan, & Ney, 2007). Before turning to our main theme, let’s review the methods that have been used to analyze MNIST in the past.

First, the K-Nearest Neighbor (KNN) method is an intuitive classification method. It assumes that samples of the same class cluster together in the raw or feature space. During the training phase, the KNN model simply retains the training data, which are distributed across the feature space. During the test phase, we calculate the distances between the testing pattern and all training patterns in the raw or feature space. The KNN method then retrieves the labels of the K training patterns that are closest to the testing pattern, and the testing pattern is assigned the class that holds the highest proportion among those K labels. In other words, the K training labels vote to determine the class of the testing pattern. The concept of the KNN method is simple, but the pre-processing of the data set affects the accuracy of the model: the less related the data classes are to one another, the better the model classifies. Currently, the KNN classification method can achieve an error rate of 0.52% on the MNIST example (Keysers, Deselaers, Gollan, & Ney, 2007). Although KNN is more accurate than many machine learning methods, retaining all training patterns for classification, as KNN does, seems an unreasonable way to simulate recognition by the brain in the context of artificial intelligence.
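The voting procedure described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of Keysers et al.: the function name `knn_predict`, the Euclidean distance, and the toy two-dimensional points (standing in for flattened 28x28 images) are all assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training patterns."""
    # "Training" is just storage: every training pattern is kept and compared
    # against the query at test time (Euclidean distance is an assumption).
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the labels of the k closest training patterns.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # The k labels vote; the most frequent label wins.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors; real MNIST patterns would have 784 values.
train_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = [0, 0, 0, 1, 1, 1]

prediction_near_zero = knn_predict(train_points, train_labels, (0.15, 0.15), k=3)
prediction_near_one = knn_predict(train_points, train_labels, (0.95, 1.0), k=3)
print(prediction_near_zero, prediction_near_one)
```

Note that every prediction scans the whole training set, which is the storage cost the paragraph above objects to.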

When it comes to using machine learning to deal with classification problems, the first method that comes to mind is the Support Vector Machine (SVM). SVM technology has been developed over decades. In 1963, Vladimir Naumovich Vapnik and Alexey Yakovlevich Chervonenkis proposed the basic algorithm of SVM. Two further SVM-related papers (Boser, Guyon, & Vapnik, 1992) (Cortes & Vapnik, 1995), proposed in 1992 and 1993, were published in 1992 and 1995, respectively. The SVM is built on rigorous mathematical theory. The method finds an optimal separating hyperplane that separates the data points of different groups. For example, given P-dimensional data points that come from two different groups, the SVM looks for an optimal (P-1)-dimensional hyperplane that divides the data points into two categories. In the field of machine learning, the hyperplane is also called the decision boundary, meaning that the label of a test data point depends on which side of the decision boundary it falls. In the course of an experiment, we will find many hyperplanes that can divide the data points into two categories, but there is only one optimal separating hyperplane, which is also called the maximum-margin hyperplane. Finding it is a mathematical optimization problem: the optimal decision boundary of the SVM maximizes its distance to the nearest data point on each side. For the MNIST example, the SVM method can achieve an error rate of 0.56% (Decoste & Schölkopf, 2002).
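To make the margin idea concrete, the sketch below compares two hyperplanes that both separate the same toy data and measures, for each, the distance to the nearest data point. It does not train an SVM; the helper names (`signed_distance`, `classify`, `geometric_margin`), the weight vectors, and the four 2-D points are hypothetical values chosen for illustration.

```python
import math


def signed_distance(w, b, x):
    """Signed distance from point x to the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def classify(w, b, x):
    """The label depends on which side of the decision boundary x falls."""
    return 1 if signed_distance(w, b, x) >= 0 else -1


def geometric_margin(w, b, points, labels):
    """Distance from the hyperplane to the nearest correctly classified point.

    A negative value means at least one point lies on the wrong side.
    """
    return min(y * signed_distance(w, b, x) for x, y in zip(points, labels))


# Hypothetical 2-D data (P = 2), so each hyperplane is a 1-D line.
points = [(0.0, 0.0), (0.0, 1.0), (2.0, 0.0), (2.0, 1.0)]
labels = [-1, -1, 1, 1]

# Two candidate boundaries that both separate the classes: x = 1 and x = 0.5.
margin_centered = geometric_margin((1.0, 0.0), -1.0, points, labels)
margin_shifted = geometric_margin((1.0, 0.0), -0.5, points, labels)
print(margin_centered, margin_shifted)
```

Both lines classify all four points correctly, but the centered line keeps the nearest point on each side farther away, so it is the one a maximum-margin criterion would prefer among these two candidates.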
