Agents Oriented Genetic-K-Means (AOGK) System for Plagiarism Detection

Hadj Ahmed Bouarara, Yasmin Bouarara

Source Title: Scholarly Ethics and Publishing: Breakthroughs in Research and Practice

DOI: 10.4018/978-1-5225-8057-7.ch014

Abstract

In the last decade, plagiarism has spread widely and become a topical problem in the modern scientific world, driven by the wide availability of electronic documents online and offline. This work describes a new plagiarism detection system named AOGK (Agents Oriented Genetic-K-means), based on a multi-agent architecture composed of three modules: a text parsing module that transforms documents into vectors; a learning module that uses genetic algorithms to build a prediction model; and a test module that uses k-means for the final classification of suspicious documents. To evaluate their system, the authors used a range of reference metrics (precision, recall, F-measure, and entropy) and the PAN 09 benchmark. They compared the results obtained with the performance of other systems found in the literature. The authors' aim is the preservation of copyright.

Chapter Preview

1. Introduction And Background

The internet is continually enriched by new content: the Google web search engine contains more than 3 billion web pages, providing a wide variety of freely available source texts in many different languages. Unfortunately, electronic documents are vulnerable to copying, and cases of plagiarism have increased tremendously in the last few years. Plagiarism, the reuse of the ideas, words, images, or expressions of other persons without citation, is a serious problem in the scientific community (Basile, 2009). It takes different forms, such as:

• Verbatim Plagiarism: Directly copying sentences or passages from the work of another person.
• Paraphrasing: Using the sentences of another person while changing the order of the words.
• Blunt Plagiarism (Copyright Plagiarism): Taking the work of another person and putting a different name on it.
• Plagiarism of Ideas: Reusing an original thought or idea (independent of its form) from a source text.
• Plagiarism with Synonyms: Copying the words of another person and replacing them with their synonyms.

Existing approaches to safeguarding intellectual property can be roughly categorized into two families:

• Supervised Plagiarism Detection (SPD): SPD relies on external information. It compares the content of each suspicious document (the document to be analysed) against a repository of reference documents (the external information) in order to detect the plagiarised parts (Stein, 2007).
• Unsupervised Plagiarism Detection (UPD): UPD does not require reference documents. It uses stylometry to analyse the content of the suspicious document and detect changes in style between paragraphs. This is very difficult to achieve because a single author can write in different styles.
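
The supervised setting described above can be illustrated with a minimal sketch. It assumes a bag-of-words representation and cosine similarity, which are common choices but are not stated in this preview as the ones used by the authors; the function names `to_vector`, `cosine_similarity`, and `most_similar_reference` are hypothetical.

```python
import math
import re
from collections import Counter


def to_vector(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def most_similar_reference(suspicious, references):
    """Return (index, score) of the reference closest to the suspicious text."""
    if not references:
        raise ValueError("the reference repository is empty")
    suspicious_vec = to_vector(suspicious)
    scores = [cosine_similarity(suspicious_vec, to_vector(ref)) for ref in references]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Illustrative data only (not from the PAN 09 benchmark).
references = [
    "the cat sat on the mat",
    "genetic algorithms search a space of candidate solutions",
]
suspicious = "on the mat the cat sat"  # reordered copy of the first reference
index, score = most_similar_reference(suspicious, references)
print(index, round(score, 3))
```

Because a bag-of-words vector ignores word order, the reordered copy matches the first reference almost perfectly. The same property means this representation alone would miss plagiarism by synonym substitution, which changes the words themselves.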

Several classical plagiarism detection systems have appeared, but they face many drawbacks in terms of detection quality, the parameters selected (similarity measure and text representation method), response time, and presentation of results.

We are thus naturally led to seek better performance by using a decentralized architecture. Our work addresses the problems cited above through:

• Developing a new plagiarism detection system based on the hybridization of a meta-heuristic method (genetic algorithm) and a machine learning method (k-means).
• Using a set of agents to orient the final decision by a vote.
• Verifying the effectiveness of our system through a comparative study with existing works in the literature.
• Constructing a visualization method that allows us to interact with the system to retrieve decisions and to view the detection results both globally and in detail.
• Helping the scientific world to limit the phenomenon of plagiarism.
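
The idea of letting a set of agents orient the final decision by a vote can be sketched as a simple majority vote. This is only an illustration under stated assumptions: the preview does not specify the voting rule, the label names, or how ties are handled, so `majority_vote` and the labels below are hypothetical.

```python
from collections import Counter


def majority_vote(agent_labels):
    """Return the label chosen by the most agents.

    Ties are broken in favour of the label that appears first in the
    input, which is an arbitrary choice made for this sketch.
    """
    if not agent_labels:
        raise ValueError("at least one agent vote is required")
    counts = Counter(agent_labels)
    top = max(counts.values())
    for label in agent_labels:
        if counts[label] == top:
            return label


# Each entry stands for the decision of one (hypothetical) agent.
votes = ["plagiarised", "original", "plagiarised"]
decision = majority_vote(votes)
print(decision)  # prints "plagiarised"
```

In a multi-agent setting, each agent could classify the suspicious document independently, for example with different parameters, and the vote would combine those decisions into one.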

Our paper lies at the intersection of different domains, as shown in Figure 1.

Figure 1. Positioning of our problem

The paper is organized as follows: Section 2 describes some automatic plagiarism detection systems. Section 3 gives a detailed view of the HGK-PD system. Section 4 presents the results obtained from the different tests carried out on the PAN 09 dataset. Section 5 presents a comparative study between our system and other plagiarism detection systems in the literature. Finally, in Section 6, we conclude the paper and give some future perspectives.
