A Fuzzy Matching based Image Classification System for Printed and Handwritten Text Documents

Shalini Puri, Satya Prakash Singh

Source Title: Journal of Information Technology Research (JITR) 13(2)

DOI: 10.4018/JITR.2020040110

Abstract

This article proposes a bi-leveled image classification system that classifies printed and handwritten English documents into mutually exclusive predefined categories. At level 1, the system performs preprocessing, segmentation, feature extraction, and SVM-based character classification; at level 2, it performs word association and fuzzy-matching-based document classification. The system architecture and its modular structure are described in terms of the task stages and their functions. A case study on document classification then shows the internal score computations for words and keywords under fuzzy matching. Experiments show that the proposed system achieves promising results in a time-efficient manner, with better accuracy and less computation time for printed documents than for handwritten ones. Finally, the proposed system is compared with existing systems and is observed to perform better than many of them.

Introduction

The automated English Document Image Classification System (EDC) is designed to categorize scanned, purely printed or purely Hand-Written (HW) English documents into mutually exclusive predefined classes. The system takes two forms: the English Printed Document Classification (EPDC) system and the English Handwritten Document Classification (EHDC) system. Both EPDC and EHDC draw on concepts from pattern recognition, artificial intelligence, machine learning, text mining, and image mining. Over the last three decades, many researchers have successfully implemented automated text mining and classification systems and tested them on mono-, bi-, tri-, and multi-lingual real and synthetic documents using various classifiers and their variations. Other researchers have provided good solutions to the problems of feature reduction, feature selection, data reduction, and the curse of dimensionality. In parallel, many automated image recognition, identification, and mining systems have been developed in recent years. These systems were designed primarily to categorize maps, geographical areas, drawings, and graphical and pictorial designs. More recent image mining systems extract and process text characters, words, and lines from heterogeneous sets of multi-font, multi-size, multi-oriented, multi-colored, multi-lingual, and multi-script documents. Printed character recognition and script discrimination are already mature for non-Indic scripts such as Latin, Chinese, Japanese, and Korean. Many printed text recognizers and processors also exist for Indic scripts such as Devanagari, Bengali, Gujarati, and Gurumukhi. Printed text processing systems are generally found to be simpler than handwritten ones.

The complexity of handwritten text processing lies primarily in cursive writing styles, overlapping and touching characters, and uneven heights, sizes, and gaps among characters and words. It also depends on how smoothly and clearly the writer writes the text. In addition, many Indian scripts use a head line on top of the characters, which further complicates segmentation. These observations about text classification systems and image mining systems motivated the authors to propose an integrated single- and multi-script document image classification system, which accepts text document images and categorizes them into predefined classes. In this way, document classification coexists with the image content retrieval and recognition paradigm.

These new dimensions of text document image processing include the major steps of preprocessing, character recognition, word recognition, and document classification, and the area is now attracting attention from many researchers. Puri and Singh (2018) provided a survey of Devanagari-script Hindi text document classification systems using Support Vector Machine (SVM) and fuzzy approaches. The survey focused primarily on the basics, importance, and survival of Hindi and on the differences between Hindi and other scripts, and then discussed in detail the existing research contributions from 1990 to date. Another contribution is an advanced, robust, and fast Hindi Printed Document Classification system using SVM and Fuzzy (HPDC-SF), based on tri-layered segmentation and a bi-leveled classifier, which presented detailed algorithmic procedures for document classification (Puri & Singh, 2019). The HPDC-SF system was designed to categorize unknown documents into predefined Hindi classes through the critical Task Stages (TS) of segmentation, Shirorekha-Less (SL) character extraction, SL word association, fuzzy matching, and classification. This system used Predefined Keywords (PK) in the Romanized form of Hindi characters.
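
The idea of matching recognized words against predefined keywords can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the preview does not give the actual score computation, so approximate string similarity from the standard library (`difflib.SequenceMatcher`) stands in for the fuzzy matching step. The keyword lists, the class labels, the `threshold` value, and the function names `similarity` and `classify` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical predefined keywords per class (illustrative only,
# not taken from the article).
PREDEFINED_KEYWORDS = {
    "sports": ["cricket", "football", "tournament", "player"],
    "science": ["physics", "experiment", "theory", "laboratory"],
}


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two words."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(words, keywords=PREDEFINED_KEYWORDS, threshold=0.8):
    """Score each class by summing the best keyword similarity of every
    recognized word that reaches the (assumed) threshold, then pick the
    class with the highest score. Returns (label, scores); label is None
    when no word matches any keyword closely enough."""
    scores = {}
    for label, class_keywords in keywords.items():
        total = 0.0
        for word in words:
            best = max(similarity(word, kw) for kw in class_keywords)
            if best >= threshold:
                total += best
        scores[label] = total
    best_label = max(scores, key=scores.get)
    if scores[best_label] == 0.0:
        return None, scores
    return best_label, scores


# Words as a character recognizer might emit them, with small errors.
recognized = ["crickel", "playar", "the", "tournament"]
label, scores = classify(recognized)
print(label, scores)
```

Because the match is approximate, a word with one misrecognized character can still contribute to its class score, which is the property that makes this kind of matching attractive after an imperfect character classification stage.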
