Hybridization Between Scoring Technique and Similarity Technique for Automatic Summarization by Extraction

Mohamed Amine Boudia, Amine Rahmani, Mohamed Elhadi Rahmani, Abdelatif Djebbar, Hadj Ahmed Bouarara, Fatima Kabli, Mohamed Guandouz

Source Title: International Journal of Organizational and Collective Intelligence (IJOCI) 6(1)

DOI: 10.4018/IJOCI.2016010101

Abstract

In theory, a summary can be generated automatically by three approaches: classification, understanding, or extraction, the last being the most widely used and the easiest to implement. The current literature presents three basic techniques within the extraction approach: extraction by scoring, extraction by similarity, and extraction by prototype. In previous work, the authors always used a single technique and then proposed various ways to optimize the results, either with an optimization algorithm or with bio-inspired methods, such as ants or social spiders, to improve the performance of automatic summarizers. Each technique has its strengths and weaknesses. In this work, the authors propose to use two techniques one after the other, so that the strength of each compensates for the weakness of the other. The paper first gives a short state of the art, which later allows the authors to explain the strengths and weaknesses of each technique, and then describes their hybridization approach.

1. Introduction And Problematic

Day by day, the body of electronic textual information grows. It is becoming increasingly difficult to access relevant information without specific tools that give rapid and effective access to the content of texts. Software engineering has matured, so we no longer face the same problems in building applications; hardware has also advanced, and today's personal machines are powerful. Finding a specific method to access the content of texts has therefore become a necessary task.

A summary of a text is an effective way to represent its content and allows quick access to it. The purpose of automatic summarization is to produce a short text covering the essential content of the source text. “We cannot imagine our daily life without summary,” says Inderjeet Mani [Mani, 2001].

Headlines, the first paragraph of a newspaper article, newsletters, weather reports, tables of sports results, and library catalogues are all summaries. Even in research, authors must accompany their scientific papers with summaries (abstracts) that they write themselves.

We can use automatic summaries to reduce the time needed to find relevant documents, or to reduce the processing of large texts by identifying key information. “The suggested procedure claims on the principle that high-frequency words in a document are important words” [Luhn, 1958].
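The frequency principle quoted above can be illustrated with a minimal sketch. It is not Luhn's exact procedure nor the authors' implementation: the stop-word list, the tokenizer, and the function names `tokenize` and `score_sentences` are assumptions made for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real system would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "are", "and", "to", "on"}

def tokenize(text):
    """Lowercase the text and return its tokens, excluding stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def score_sentences(sentences):
    """Score each sentence by summing the document-wide frequency of its words."""
    freq = Counter(w for s in sentences for w in tokenize(s))
    return [sum(freq[w] for w in tokenize(s)) for s in sentences]

sentences = [
    "Automatic summarization extracts key sentences.",
    "Key sentences carry the main information.",
    "The weather was pleasant.",
]
scores = score_sentences(sentences)
print(scores)  # [7, 7, 3]
```

The two sentences that share the frequent words "key" and "sentences" outscore the off-topic one. Because the score is an unnormalized sum, longer sentences tend to score higher; a real scorer might normalize by sentence length.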

The current literature presents three approaches to automatic summarization:

• Automatic summarization by extraction, which has three essential techniques: by scoring, by similarity, or by prototype sentence [Edmundson, 1969; Van Dijk, 1985].
• Automatic summarization by understanding, which uses semantic analysis methods [Salton et al., 1997; Kintsch & Van Dijk, 1978].
• Automatic summarization by automatic classification, which uses the bi-classification method [Litvak & Last, 2008].

In this paper we work on automatic summarization by extraction because it is simple to implement and gives good results. In previous works, however, summaries were produced by extraction using a single technique at a time: scoring, similarity, or prototype sentence.

Scoring generally gives good results, but its weak point is its limited ability to eliminate similar sentences. If a sentence X passes the scoring filter, a sentence Y that is similar to X will probably have a score that also lets it pass, which puts a repeated sentence in the summary and defeats its purpose. The similarity technique, in turn, has the strength of eliminating repetitive sentences, but its weakness is that it cannot ensure that the sentences it keeps have a high weight. Indeed, the longer a sentence is, the higher the probability that other sentences are similar to it, and we know that long sentences tend to carry more information.
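To make the redundancy problem concrete, the sketch below measures how close two sentences are using cosine similarity over bag-of-words counts. The choice of cosine similarity and the helper names `bag_of_words` and `cosine_similarity` are assumptions for illustration; the text does not specify which similarity measure the authors use.

```python
import math
import re
from collections import Counter

def bag_of_words(sentence):
    """Count the lowercase word occurrences in a sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between the word-count vectors of two sentences."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

x = "Extraction by scoring keeps the sentences with the highest weight."
y = "Extraction by scoring keeps sentences with the highest weight."
z = "Spiders and ants inspire optimization methods."

sim_xy = cosine_similarity(x, y)  # near-duplicates: about 0.96
sim_xz = cosine_similarity(x, z)  # no shared words: 0.0
print(round(sim_xy, 2), sim_xz)
```

Sentences X and Y would receive almost the same score from a scoring filter, so both would pass it; a similarity check like this one is what detects that Y adds nothing new.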

This work aims to use the two techniques one after the other, so that each one covers the weak point of the other and brings its strength to the overall approach. To see the impact of this proposition, we tested our approach and compared it with the results of using a single technique.
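One possible reading of "one after the other" is sketched below: a scoring stage keeps the highest-weighted sentences, then a similarity stage drops any kept sentence that is too close to a higher-scoring one. This is a hedged illustration, not the authors' published algorithm; the frequency-based score, the cosine measure, and the parameters `keep_ratio` and `max_similarity` are all assumptions.

```python
import math
import re
from collections import Counter

def words(sentence):
    """Return the lowercase words of a sentence."""
    return re.findall(r"[a-z]+", sentence.lower())

def frequency_scores(sentences):
    """Score each sentence by the summed document frequency of its words."""
    freq = Counter(w for s in sentences for w in words(s))
    return [sum(freq[w] for w in words(s)) for s in sentences]

def cosine(a, b):
    """Cosine similarity between the word-count vectors of two sentences."""
    va, vb = Counter(words(a)), Counter(words(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

def hybrid_summary(sentences, keep_ratio=0.5, max_similarity=0.7):
    """Scoring filter first, similarity filter second (assumed order)."""
    scores = frequency_scores(sentences)
    n_keep = max(1, round(len(sentences) * keep_ratio))
    # Stage 1 (scoring): indices of the n_keep best sentences, best first.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: scores[i], reverse=True)[:n_keep]
    # Stage 2 (similarity): skip a sentence too similar to one already selected.
    selected = []
    for i in ranked:
        if all(cosine(sentences[i], sentences[j]) <= max_similarity
               for j in selected):
            selected.append(i)
    # Restore the original document order for readability.
    return [sentences[i] for i in sorted(selected)]

doc = [
    "Scoring selects sentences with frequent words.",
    "Scoring selects the sentences with frequent words.",
    "Similarity removes redundant sentences.",
    "Lunch was late today.",
]
summary = hybrid_summary(doc, keep_ratio=0.75)
print(summary)
```

In this toy document the first two sentences both pass the scoring stage, but the similarity stage keeps only the higher-scoring one, so the summary contains the second and third sentences. The low-scoring fourth sentence never reaches the similarity stage.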

2. Related Work

Automatic summarization emerged early as a field of research in computer science, within natural language processing (NLP). In 1958, H. P. Luhn [Luhn 1958] proposed a first approach to producing automatic abstracts by extracting sentences.

In the early 1960s, H. P. Edmundson and other participants in the TRW (Thompson Ramo Wooldridge Inc.) project [Edmundson 1963] proposed a new automatic summarization system that combined several criteria to assess the relevance of the sentences to extract.

These works helped identify the fundamental issues around automatic summarization, such as the problems caused by building summaries through extraction (redundancy, incompleteness, breaks in coherence, etc.), the theoretical inadequacy of using statistics, and the difficulty of understanding a text (through semantic analysis) in order to summarize it.
