Traditional Classifiers vs. Deep Learning for Cyberbullying Detection

DOI: 10.4018/978-1-5225-5249-9.ch006

Abstract

In this chapter, the authors present their approach to cyberbullying detection using a range of traditional classifiers as well as a deep learning approach. Research has tackled the problem of cyberbullying detection in recent years; however, due to the complexity of the language used in cyberbullying, the results obtained with traditional classifiers have remained only mildly satisfying. The authors apply a number of traditional classifiers, also used in previous research, to obtain an objective view of the extent to which each of them is suited to the task. They also propose a novel method for automatic cyberbullying detection based on convolutional neural networks and increased feature density. Experiments performed on actual cyberbullying data showed a major advantage of the presented approach over all previous methods, including the two best-performing methods so far, based on SO-PMI-IR and a brute-force search algorithm, presented in the previous two chapters.

Proposed Methods

Below we describe the details of the applied methods. First, we describe the basics of data preprocessing and feature extraction. Next, we briefly explain all classifiers, with their settings and the modifications applied in the experiments, including the proposed CNN-based model.

Data Preprocessing

The sentences from the original dataset used in this research (Ptaszynski et al., 2010, 2015a, 2015b, 2016; Nitta et al., 2013) were preprocessed in the following ways (a minimal code sketch of several of these representations follows the list):

  • Tokenization: All words, punctuation marks, etc. are separated by spaces (later: TOK).

  • Lemmatization: Like the above but the words are represented in their generic (dictionary) forms, or “lemmas” (later: LEM).

  • Parts of Speech: Words are replaced with their representative parts of speech (later: POS).

  • Tokens With POS: Both words and POS information are included in one element (later: TOK+POS).

  • Lemmas With POS: Like the above but with lemmas instead of words (later: LEM+POS).

  • Tokens With Named Entity Recognition: Words are encoded together with information on which named entities (personal names, organizations, numerical expressions, etc.) appear in the sentence. The NER information is annotated by CaboCha (later: TOK+NER).

  • Lemmas With NER: Like the above but with lemmas (later: LEM+NER).

  • Chunking: Larger sub-parts of sentences segmented syntactically, such as noun phrases, verb phrases, predicates, etc., but without dependency relations (later: CHNK).

  • Dependency Structure: Same as above, but with information regarding syntactical relations between chunks (later: DEP).

  • Chunking With NER: Information on named entities is encoded in chunks (later: CHNK+NER).

  • Dependency Structure With Named Entities: Both dependency relations and named entities are included in each element (later: DEP+NER).
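
To make the above representations concrete, the following is a minimal sketch of how the token-, lemma-, and POS-based views could be produced in Python. It assumes MeCab, a standard Japanese morphological analyzer, with an IPAdic-style dictionary (where the lemma sits in feature field 6); the authors' pipeline relies on CaboCha, which builds on MeCab, and the chunking, dependency, and NER views would come from its parse tree, so treat the details below as illustrative assumptions rather than the chapter's exact toolchain.

    # Minimal sketch of the TOK, LEM, POS, TOK+POS, and LEM+POS views.
    # Assumption: mecab-python3 with an IPAdic-style dictionary, where the
    # lemma (dictionary form) is stored in feature field 6.
    import MeCab

    tagger = MeCab.Tagger()

    def preprocess(sentence):
        """Return several of the representations described above for one sentence."""
        tok, lem, pos = [], [], []
        for line in tagger.parse(sentence).splitlines():
            if line == "EOS" or "\t" not in line:
                continue
            surface, feature = line.split("\t", 1)
            fields = feature.split(",")
            tok.append(surface)
            pos.append(fields[0])  # coarse part of speech, e.g. adjective, noun
            # IPAdic marks a missing lemma with "*"; fall back to the surface form
            lem.append(fields[6] if len(fields) > 6 and fields[6] != "*" else surface)
        return {
            "TOK": " ".join(tok),
            "LEM": " ".join(lem),
            "POS": " ".join(pos),
            "TOK+POS": " ".join(t + "/" + p for t, p in zip(tok, pos)),
            "LEM+POS": " ".join(l + "/" + p for l, p in zip(lem, pos)),
        }

    # Example: the POS view of a phrase like kimochi_ii hi ("pleasant day")
    # generalizes it to a pattern such as ADJ N.
    print(preprocess("気持ちいい日だ"))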

Five examples of preprocessing were presented in Table 2 in Chapter 5. Theoretically, the more generalized a sentence is, the fewer unique patterns it will contain, but each produced pattern will occur more frequently. For example, in the sentence from Table 2 in Chapter 5, the simple phrase kimochi_ii hi (“pleasant day”) is represented by the POS pattern ADJ N. We can safely assume that there will be more occurrences of the pattern ADJ N than of kimochi_ii hi, because many word combinations can be represented by this pattern. We compared the classification results for each classifier under the different preprocessing methods to find out whether it is better to represent sentences in a more generalized or a more specific way. Generalization is also closely related to the notion of Feature Density, which we propose to use for optimizing the method.
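
As a toy illustration of this trade-off, the snippet below counts features in two views of a tiny corpus. Feature Density is taken here, for illustration only, as the ratio of unique features to all feature occurrences; both the corpus strings and this simplified ratio are assumptions, not the chapter's exact formulation.

    from collections import Counter

    def feature_density(corpus):
        """Ratio of unique features to all feature occurrences (an assumed,
        simplified reading of Feature Density, for illustration only)."""
        counts = Counter(f for sent in corpus for f in sent.split())
        total = sum(counts.values())
        return len(counts) / total if total else 0.0

    # Surface tokens: many distinct features, each relatively rare.
    tok_corpus = ["kimochi_ii hi da", "samui hi da", "kimochi_ii asa da"]
    # POS view of the same sentences: few distinct features, each frequent.
    pos_corpus = ["ADJ N COP", "ADJ N COP", "ADJ N COP"]

    print(feature_density(tok_corpus))  # 5 unique / 9 total, about 0.56
    print(feature_density(pos_corpus))  # 3 unique / 9 total, about 0.33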
