Automated Math Symbol Classification Using SVM

Vaidehi K., Manivannan R.
Copyright: © 2022 | Pages: 14
DOI: 10.4018/IJeC.304037

Abstract

Handwritten character and symbol recognition is an important area of research in today's digital world. Recognizing handwritten characters and symbols written in different styles can make human work easier, and machine recognition of mathematical expressions has become a subject of serious research. The main motivation for this work is recognizing handwritten mathematical symbols, digits, and characters, which will in turn be used for mathematical expression recognition. The system first identifies contours to segment the handwritten document; features extracted from the segments are then passed to an SVM classifier. Two feature extraction techniques are used in this work: GLCM and Zernike moments. An SVM with an RBF kernel is used for classification. Zernike moment features outperform GLCM features: Zernike moments achieve 97.89% accuracy, while GLCM achieves 87.61% accuracy.
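
As a rough illustration of the GLCM feature step and the RBF kernel named in the abstract, the following Python sketch builds a gray-level co-occurrence matrix for a tiny image and derives three common texture statistics from it. It is not the authors' implementation: the function names, the 4-level toy image, the horizontal pixel offset, the choice of statistics, and the value of gamma are all assumptions made for illustration.

```python
import math

def glcm(image, dx=1, dy=0, levels=4, symmetric=True):
    """Normalized gray-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    counts[j][i] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, angular second moment, and homogeneity of a normalized GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    contrast = sum(p[i][j] * (i - j) ** 2 for i, j in cells)
    asm = sum(p[i][j] ** 2 for i, j in cells)
    homogeneity = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)
    return [contrast, asm, homogeneity]

def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel value exp(-gamma * ||x - y||^2) between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical 4x4 image already quantized to gray levels 0..3.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image)
features = glcm_features(p)
print(features)
```

In a full pipeline, one such feature vector per segmented symbol would be passed to an SVM, which compares vectors through a kernel such as `rbf_kernel`.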

1. Introduction

Handwriting recognition (HWR) (MacLean et al., 2015) is the process by which a computer receives and interprets handwritten input from sources such as paper documents, photographs, touch-screens, and other devices. Images of text written on paper are captured “off-line” by optical scanning (optical character recognition or intelligent word recognition). Alternatively, the motion of the pen tip can be sensed “on-line”, for example on the surface of a pen-based computer screen, which is generally an easier task because more information is available. Handwriting recognition primarily follows the process of optical character recognition (OCR) (Singh et al., 2015; Dai et al., 2018). A handwriting recognition system also handles the formatting of the document, segments it correctly into characters, and finds the most plausible words.

The most desirable property of handwritten character recognition systems is their ability to cope with variations in writing style while distinguishing similar characters (Mouchere et al., 2011). To achieve this property, improvements can be made at any stage of the recognition process. This research focuses on improving the classification stage by introducing alternative training methods that enhance the discriminative abilities of the recognizers.

OCR (Singh et al., 2015) is the mechanical or electronic conversion of images of handwritten, typed, or printed text into machine-encoded text, whether the source is a scanned document, a photograph of a document, a scene photo, or subtitle text superimposed on an image. OCR is usually an “off-line” process that analyzes a static document: the input data is a static representation of the handwriting. OCR primarily targets machine-printed text, while intelligent character recognition (ICR) targets hand-printed text (written in capital letters). Handwriting movements can also be taken as input to a handwriting recognition system. Rather than relying only on the shapes of glyphs and words, this technique captures motion: the order in which segments are drawn, their direction, and the sequence in which the pen is put down and lifted up. This additional information can increase the accuracy of the end-to-end process. The technique is also called “intelligent character recognition”, “on-line character recognition”, or “dynamic” or “real-time” character recognition.

On-line handwriting character recognition (Mouchere et al., 2011) takes its input from a special digitizer or PDA. The sensor picks up both the pen-tip movements and the pen-up/pen-down switching, i.e., the lifting and putting down of the pen. The data collected in this way is called digital ink, which can be considered a digital representation of handwriting. The signals are converted into letter codes that can be used in text-processing applications on a computer.
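
One way to picture digital ink is as a time-ordered list of pen samples that is cut into strokes wherever the pen is lifted. The sketch below assumes a hypothetical `PenSample` record (position, timestamp, pen-down flag) and a hypothetical helper `split_into_strokes`; real digitizers expose their own formats, so the field names and sample values here are illustrative only.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PenSample:
    x: float
    y: float
    t: float         # timestamp in seconds
    pen_down: bool   # True while the pen tip touches the surface

def split_into_strokes(samples: List[PenSample]) -> List[List[Tuple[float, float]]]:
    """Group consecutive pen-down samples into strokes, in drawing order."""
    strokes: List[List[Tuple[float, float]]] = []
    current: List[Tuple[float, float]] = []
    for s in samples:
        if s.pen_down:
            current.append((s.x, s.y))
        elif current:
            # A pen-up sample closes the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes

# Hypothetical ink for a "+" sign: a horizontal stroke, a pen lift, then a vertical stroke.
ink = [
    PenSample(0.0, 1.0, 0.00, True),
    PenSample(1.0, 1.0, 0.05, True),
    PenSample(2.0, 1.0, 0.10, True),
    PenSample(2.0, 1.0, 0.15, False),
    PenSample(1.0, 0.0, 0.30, True),
    PenSample(1.0, 1.0, 0.35, True),
    PenSample(1.0, 2.0, 0.40, True),
]
strokes = split_into_strokes(ink)
print(len(strokes), "strokes")
```

Keeping the stroke order and the points within each stroke preserves the drawing order and direction that on-line recognition relies on.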

Early character recognition systems had to be trained with images of each character and worked on one font at a time. Today's advanced systems achieve high recognition accuracy for most commonly used fonts and accept a variety of digital image file formats as input. Some systems can also produce formatted output that closely approximates the original page, including images, columns, and other non-textual components.

Humans can easily recognize a handwritten document, but the same task is difficult for a computer system because of image noise and random variations in writing size, fonts, and styles.
