Hybrid Approach for Single Text Document Summarization Using Statistical and Sentiment Features

Chandra Shekhar Yadav, Aditi Sharan

Source Title: International Journal of Information Retrieval Research (IJIRR) 5(4)

DOI: 10.4018/IJIRR.2015100104

Abstract

Summarization represents the same information in a concise form while preserving its meaning. It can be categorized into two types: abstractive and extractive. Our work focuses on extractive summarization. A generic approach to extractive summarization treats each sentence as an entity, scores each sentence on a set of indicative features to ascertain its suitability for inclusion in the summary, sorts the sentences by score, and takes the top n sentences as the summary. Mostly statistical features have been used for scoring sentences. We propose a hybrid model for single text document summarization. This hybrid model is an extraction-based approach that combines statistical and semantic techniques. It depends on a linear combination of statistical measures (sentence position, TF-IDF, aggregate similarity, and centroid) and a semantic measure. The idea of including sentiment analysis for salient sentence extraction derives from the concept that emotion plays an important role in conveying a message effectively; hence, it can play a vital role in text document summarization. For comparison, five summaries have been generated: the proposed work, the MEAD system, the Microsoft system, the OPINOSIS system, and a human-generated summary. Evaluation is done using ROUGE scores.

1. Introduction

Text document summarization plays an important role in IR (Information Retrieval) because it condenses a large pool of information into a concise form by selecting the salient sentences and discarding redundant sentences (or information); we term this the summarization process.

Radev et al. (2002) defined a summary as “a text that is produced from one or more texts that convey important information in the original texts, and that is no longer than half of the original text and usually significantly less than that”. As explained by Alguliev et al. (2011), automatic text document summarization is an interdisciplinary research area of computer science that includes AI (artificial intelligence), data mining, statistics, and psychology. By technique, text document summarization can be classified into two types: abstractive summarization and extractive summarization. Abstractive summarization is closer to a human-written summary, which is the actual goal of text document summarization. According to Mani and Maybury (1999) and Wan (2008), abstractive summarization needs three things: information fusion, sentence compression, and reformulation. An abstractive summary may contain new sentences, phrases, and even words that are not present in the source document. Although a great deal of research has been done in recent decades in NLP (Natural Language Processing) and NLG (Natural Language Generation), and computing power has increased considerably, we are still not close to achieving abstractive summarization. The actual challenge is generating new sentences and new phrases while the produced summary retains the same meaning as the source document. Extractive summarization is based on extracted entities; an entity may be a sentence, a subpart of a sentence, a phrase, or a word. Our work focuses on the extractive technique.

In this paper we propose a hybrid method for single text document summarization, which is a linear combination of statistical features, as used in Ko and Seo (2008), Yeh et al. (2005), Radev et al. (2002), and Radev (2001), and a new kind of semantic feature, namely sentiment analysis. The ideas used in this paper are derived from several papers: the statistical features and their collective sum come from Ko and Seo (2008) and Yeh et al. (2005), and the centroid measure is taken from Radev et al. (2001) and Radev et al. (2004). The inclusion of sentiment analysis derives from the concept that emotion plays an important role in conveying a message effectively; hence, it can play a vital role in text document summarization.
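As an illustration only, the scoring idea can be sketched in Python: each sentence receives a weighted sum of normalized feature scores, and the top n sentences form the summary. This is a minimal sketch under stated assumptions, not the authors' implementation. The weights, the toy `SENTIMENT_LEXICON`, the naive sentence splitter, the use of absolute polarity as the sentiment feature, and the function names are all hypothetical, and only three of the features named in the paper (position, TF-IDF, sentiment) are shown; aggregate similarity and centroid are omitted for brevity.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (assumption); a real system would use a full resource.
SENTIMENT_LEXICON = {"good": 1.0, "great": 1.0, "excellent": 1.0,
                     "bad": -1.0, "poor": -1.0, "terrible": -1.0}


def split_sentences(text):
    """Naive split after ., ! or ? (an assumption, not the paper's tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", sentence.lower())


def normalize(values):
    """Divide by the maximum so every feature lies in [0, 1] before weighting."""
    top = max(values, default=0.0)
    return [v / top if top > 0 else 0.0 for v in values]


def summarize(text, n=2, weights=(0.3, 0.4, 0.3)):
    """Return the n best sentences under a linear combination of features."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    total = len(sentences)

    # Feature 1: position, earlier sentences score higher.
    position = [(total - i) / total for i in range(total)]

    # Feature 2: length-averaged TF-IDF, treating each sentence as a document.
    doc_freq = Counter(w for t in tokens for w in set(t))
    tfidf = []
    for t in tokens:
        counts = Counter(t)
        score = sum(c * math.log(total / doc_freq[w]) for w, c in counts.items())
        tfidf.append(score / len(t) if t else 0.0)

    # Feature 3: sentiment strength, the absolute polarity summed over words.
    sentiment = [sum(abs(SENTIMENT_LEXICON.get(w, 0.0)) for w in t)
                 for t in tokens]

    # Linear combination of the normalized features.
    features = [position, normalize(tfidf), normalize(sentiment)]
    scores = [sum(w * f[i] for w, f in zip(weights, features))
              for i in range(total)]

    # Keep the top-n sentences, then restore document order for readability.
    best = sorted(range(total), key=lambda i: scores[i], reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]
```

Calling `summarize(document_text, n=3)` returns three sentences in their original order. Changing `weights` shifts the balance between the statistical features and the sentiment feature, which is the kind of trade-off a linear combination exposes.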

The paper is organized as follows. Section 2 presents a categorized review of the literature from recent years. Section 3 describes the features used for summarization. Section 4 contains the summarization algorithm and the detailed approach. Section 5 describes the corpus, with statistical and linguistic statistics. Section 6 presents experiments and results. Section 7 concludes.

2. Related Work

According to Aliguliyev (2007), summarization is a three-step process: (1) analysis of the text, (2) transformation into a summary representation, and (3) synthesis, which produces an appropriate summary. Hovy and Lin (1998) introduced the SUMMARIST system to create a robust text summarization system that works in three phases, which can be described as an equation: “Summarization = Topic Identification + Interpretation + Generation”.

A lot of research has been done on extraction-based approaches. In extractive summarization, the important task is to find informative sentences, subparts of sentences, or phrases and to include these extractive elements in the summary. Here we present the work in two categories: (1) early work and (2) work done in recent years. In our view, three early works set the direction of extractive text document summarization; they are explained below.
