Ensemble Classification System for Scientific Chart Recognition from PDF Files

S. Nagarajan, V. Karthikeyani
Copyright: © 2012 | Volume: 2 | Issue: 4 | Pages: 10
ISSN: 2155-6997 | EISSN: 2155-6989 | EISBN13: 9781466611283 | DOI: 10.4018/ijcvip.2012100101
Cite Article

MLA

Nagarajan, S., and V. Karthikeyani. "Ensemble Classification System for Scientific Chart Recognition from PDF Files." IJCVIP, vol. 2, no. 4, 2012, pp. 1-10. http://doi.org/10.4018/ijcvip.2012100101

APA

Nagarajan, S. & Karthikeyani, V. (2012). Ensemble Classification System for Scientific Chart Recognition from PDF Files. International Journal of Computer Vision and Image Processing (IJCVIP), 2(4), 1-10. http://doi.org/10.4018/ijcvip.2012100101

Chicago

Nagarajan, S., and V. Karthikeyani. "Ensemble Classification System for Scientific Chart Recognition from PDF Files," International Journal of Computer Vision and Image Processing (IJCVIP) 2, no. 4 (2012): 1-10. http://doi.org/10.4018/ijcvip.2012100101

Abstract

Portable Document Format (PDF) is the most frequently used universal document format on the Internet and in e-publishing. The wide use of PDF files has increased the need for tools that convert PDF content to text or HTML. PDF conversion can be divided into two domains: text recognition and graphics recognition. This paper focuses on graphics recognition, specifically chart type identification, which is concerned with developing algorithms that can determine the type of a given chart image from a PDF file. In the proposed system, an enhanced connected-component and statistical-feature-based method first separates the chart region from other regions. The chart region is then analyzed and classified as either a 2-dimensional or a 3-dimensional chart. After the graphic components are separated from the text components, feature extraction is performed. The features fall into three groups: object features, texture features, and shape features. The combined feature vector is then classified using an ensemble classification system. Experimental results show that the chart separation, feature extraction, and ensemble classification models significantly improve the quality of chart identification.
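
The final two steps of the abstract (combining the three feature groups into one vector, then classifying it with an ensemble) can be illustrated with a minimal sketch. The abstract does not name the base classifiers or the rule used to combine them, so everything below is an assumption made for illustration: majority voting is one common combination rule, and the function names, the feature layout, and the three threshold-based toy classifiers are all hypothetical rather than taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

# Assumed feature layout (hypothetical, not from the paper):
#   [rectangle_count, arc_count, texture_energy, aspect_ratio]
FeatureVector = Sequence[float]
Classifier = Callable[[FeatureVector], str]


def combine_features(object_feats: Sequence[float],
                     texture_feats: Sequence[float],
                     shape_feats: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into one combined vector."""
    return [*object_feats, *texture_feats, *shape_feats]


# Toy base classifiers with made-up thresholds, standing in for trained models.
def by_objects(f: FeatureVector) -> str:
    # More rectangles than arcs suggests a bar chart.
    return "bar" if f[0] > f[1] else "pie"


def by_texture(f: FeatureVector) -> str:
    # Low texture energy is treated as "bar" (arbitrary illustrative rule).
    return "bar" if f[2] < 0.5 else "pie"


def by_shape(f: FeatureVector) -> str:
    # A near-square bounding box is treated as "pie".
    return "pie" if abs(f[3] - 1.0) < 0.1 else "bar"


CLASSIFIERS: List[Classifier] = [by_objects, by_texture, by_shape]


def majority_vote(classifiers: Sequence[Classifier],
                  features: FeatureVector) -> str:
    """Return the label predicted by the most base classifiers."""
    votes = [clf(features) for clf in classifiers]
    # With three voters and two labels a tie cannot occur here.
    return Counter(votes).most_common(1)[0][0]


if __name__ == "__main__":
    vec = combine_features([6, 0], [0.2], [1.6])
    print(majority_vote(CLASSIFIERS, vec))
```

In this sketch a single dissenting classifier is outvoted by the other two, which is the basic motivation for an ensemble: errors made by one model on one feature group can be corrected by the others. A real system would replace the toy rules with trained models and might weight the votes.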
