Input Output for Document Classifier

DOI: 10.4018/978-1-7998-3772-5.ch009

Abstract

The system generates a report listing the automatically generated keywords for each document. A document may have any number of keywords. Because keywords can be generated on any pass of the loop, there is no restriction on the width of a keyword. A second report lists the class assigned to each document. If a document matches more than one class (overlapping classes), the final class is the one in which the document's keywords carry the maximum weight.
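The rule for overlapping classes can be illustrated with a minimal sketch. It assumes that a keyword's weight is its frequency in the document and that a class's weight is the sum of the weights of the document keywords found in that class's keyword list; the chapter preview does not define the weighting, so both are assumptions. The function name `choose_class`, the class codes, and the sample data are hypothetical.

```python
def choose_class(doc_keywords, class_keywords):
    """Pick the class whose keywords carry the largest total weight.

    doc_keywords:   dict mapping keyword -> weight (assumed: frequency in the document)
    class_keywords: dict mapping class code -> set of keywords for that class
    Returns (best_class_code or None, dict of per-class totals).
    """
    totals = {}
    for class_code, keywords in class_keywords.items():
        # Sum the weights of the document keywords that belong to this class.
        weight = sum(w for kw, w in doc_keywords.items() if kw in keywords)
        if weight > 0:
            totals[class_code] = weight
    if not totals:
        return None, totals  # the document matches no predefined class
    # On a tie, max() keeps the first class encountered (an arbitrary choice).
    best = max(totals, key=totals.get)
    return best, totals


# Hypothetical document keywords and predefined classes.
doc_keywords = {"index": 3, "query": 2, "cluster": 4, "router": 1}
class_keywords = {
    "DBMS": {"index", "query", "transaction"},
    "DM": {"cluster", "index", "association"},
    "CN": {"router", "packet"},
}

best_class, totals = choose_class(doc_keywords, class_keywords)
print(best_class, totals)
```

Here the document overlaps three classes; "DM" is selected because its matching keywords ("cluster" and "index") sum to the largest weight.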
Figure 1. The hierarchy chart of the menu interface of the document classifier

Input Design

Welcome Screen

Figure 2.

This form displays the sequence of processes carried out in text mining:

  • 1. Text Preprocessing (Syntactic/Semantic Text Analysis)

  • 2. Features Generation (Bag of Words)

  • 3. Feature Selection (Simple Counting, Statistics)

  • 4. Text/Data Mining (Classification: Supervised Learning)

  • 5. Classification (Unsupervised Learning)

  • 6. Analyzing Results
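The first three steps above can be sketched as follows. This is only an illustration under stated assumptions: the preprocessing is reduced to lowercasing, tokenizing on letters, and stop-word removal; feature selection is a simple frequency threshold. The stop-word list, the threshold `min_count`, and all function names are hypothetical and are not taken from the chapter.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the system keeps its own list via the Stop Word Entry form.
STOP_WORDS = {"the", "a", "of", "is", "and", "in", "to"}


def preprocess(text):
    """Step 1: lowercase, split into alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def bag_of_words(text):
    """Step 2: count how often each remaining term occurs."""
    return Counter(preprocess(text))


def select_keywords(bag, min_count=2):
    """Step 3: simple counting; keep terms seen at least min_count times."""
    return {term: n for term, n in bag.items() if n >= min_count}


sample_text = (
    "The index of a database is stored in the database. "
    "The query uses the index."
)
keywords = select_keywords(bag_of_words(sample_text))
print(keywords)
```

With the sample text, only "index" and "database" occur twice after stop-word removal, so they are the terms kept as keywords.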

Automatic Keyword Extraction and Text Classification (Main Form)

Figure 3.

This is the main form of the system. It gives access to the individual forms:

  • 1. Class Entry

  • 2. Document Entry

  • 3. Stop Word Entry

  • 4. Keyword Extraction

  • 5. Reports

Buttons Used

Figure 4.

  • Add: This button adds records to a table.

  • Modify: This button modifies existing records of the specified table. Because the records are populated in the “Display Matrix” of each form, the needed record can be fetched from the matrix, updated, and saved back to the table.

  • Clear: If the operation is not valid, this button clears the entries made on the form.

  • Close: This button closes the currently active form and returns to the main form.

  • Exit: This button exits the project.

Predefined Class Entry Form

Figure 5.

This form is used to add the details of a new class. Each new class is identified by its code, and the corresponding keyword details are entered. Examples of classes in computer science include:

  • 1. Database management system

  • 2. Data mining concepts

  • 3. Computer architecture

  • 4. Computer networks
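A class record of this kind, identified by a code and carrying its keywords, might be modeled as in the sketch below. The chapter preview does not show the table layout, so the field names, the `ClassTable` helper, and the codes such as "C01" are all hypothetical; `add` and `modify` only loosely mirror the Add and Modify buttons.

```python
from dataclasses import dataclass, field


@dataclass
class PredefinedClass:
    code: str                                   # unique class code (assumed format)
    name: str                                   # e.g. "Computer networks"
    keywords: set = field(default_factory=set)  # keywords entered for the class


class ClassTable:
    """In-memory stand-in for the class table (an assumption, not the real storage)."""

    def __init__(self):
        self._rows = {}

    def add(self, code, name, keywords):
        # Reject a second record with the same code, since the code identifies the class.
        if code in self._rows:
            raise ValueError(f"class code {code!r} already exists")
        self._rows[code] = PredefinedClass(code, name, set(keywords))

    def modify(self, code, keywords):
        # Replace the keyword details of an existing record.
        self._rows[code].keywords = set(keywords)

    def get(self, code):
        return self._rows[code]


table = ClassTable()
table.add("C01", "Database management system", ["index", "query"])
table.add("C02", "Computer networks", ["router", "packet"])
table.modify("C01", ["index", "query", "transaction"])
print(table.get("C01"))
```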
