Cyber Security Event Sentence Detection From News Articles Based on Trigger and Argument

Nikhil Chaturvedi (Shri Vaishnav Vidyapeeth Vishwavidyalaya, India) and Jigyasu Dubey (Shri Vaishnav Vidyapeeth Vishwavidyalaya, India)
DOI: 10.4018/978-1-6684-6444-1.ch012

Abstract

Events are critical for comprehending what occurs in the real world. The term “events” is frequently used to describe the numerous relationships between people, places, activities, and things. Event-centred modelling entails representing several facets of an event in addition to the semantic representation of event facts. Detecting cybersecurity events is important for staying aware of the rapidly increasing number of such incidents reported in text. In this study, the authors focus on the cyber security event detection task, specifically on identifying event trigger words and arguments in the cybersecurity domain. They use the CASIE dataset and propose a system that involves event identification, event trigger identification, and event argument extraction. In this chapter, they divide the cyber security event sentence classification model into two steps: event trigger and argument identification, and cyber security event sentence classification using the training corpus.

Introduction

The internet is a necessary aspect of our lives. It enables us to obtain real-time information from any location with a network connection. With the emergence of online newspapers, acquiring real-time or historical data has become much easier. Numerous news agencies compete with one another to deliver a superior service and attract consumers, which benefits readers by giving them more accurate and useful information.

Cyber thieves, terrorists, and state-sponsored spies use the Dark Web (Ji and Grishman 2008) to achieve their illicit goals because it is one of the most difficult media to trace. Cybercrime on the Dark Web is similar to criminality in the real world. However, the sheer breadth, unpredictable environment, and anonymity of Dark Web sites make them a crucial battleground in tracing criminals. Evaluating the various Dark Web crime threats is a vital step towards discovering potential remedies for cybercrime.

Cyber-attacks and cybercrime are widespread today, and their frequency and severity are expected to grow. Additionally, attacks are being designed to capitalise on emerging threats and environments, such as the Internet of Things (IoT) and cyber systems. People and systems will be better equipped to defend themselves against attacks if we can stay current on trends and vulnerabilities.

We propose a methodology for extracting cyber security events from online news articles. The task of detecting instances of certain events in text and extracting pertinent information from them has been researched extensively over a long period of time, yet it remains difficult. The task was established at the second Message Understanding Conference (MUC) in 1989 and has been used in a range of information extraction tasks ever since; for a history of MUC, see Grishman and Sundheim (1996). Efforts to improve event detection performance have often focused on adding features or optimizing pattern matching algorithms (Li, Ji, and Huang 2013), or on building neural networks that better capture information, such as dependency-tree-based CNNs (Nguyen and Grishman 2018). Multiple sources of external knowledge have been leveraged to overcome the scarcity of labelled data, including semantic frame analyses (Liu et al. 2016; Li et al. 2019; Chen et al. 2017), consistent and complementary information for disambiguation from multilingual data (Liu et al. 2018), and expert-level patterns from an open-source pattern-based event extraction system called TABARI (Cao et al. 2018).

The majority of earlier research on event detection has been conducted on frequent occurrences in a person's life, such as those described by ACE (Walker et al. 2006) or the TAC Knowledge Base Population (Mitamura et al. 2015). These life events include “birth,” “marriage,” “beginning a career,” and “being charged with a crime.” One critical distinction between extracting life events and cybersecurity events is the domain-specific expertise required. The same issue arises when extracting information from other domains, such as biological events. The creation of extensive event extraction datasets and tasks in that domain has aided the advancement of BioNLP (Kim et al. 2009).

The intrinsic complexity of cyber security events is a second distinction between extracting life events and cyber security events. A cyber-attack event might consist of a series of attempted or accomplished activities. Each of these activities might be regarded as a distinct cyber security event description, increasing the number of possible cyber security event references. In comparison to real-world events, the difficulty lies in determining the homonym and synonym sets associated with an event mention.

In our work, we use the CASIE dataset (Satyapanich et al. 2019), an existing annotated corpus of one thousand English online news articles published between 2017 and 2019. It includes event-based annotations of cyber-attack and vulnerability-related incidents. Its authors identify and specify five cyber security events, along with their semantic roles and twenty-four types of arguments that can serve as role-fillers. The result is a novel and challenging corpus of news stories annotated with information about cyber security.
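As a minimal sketch of how an annotation of this kind (an event with a trigger, plus arguments that fill semantic roles) might be held in memory, consider the following Python data classes. This is an illustration only and not the CASIE file format: the class names, the field names, and the example labels (`Attack.Databreach`, `Attacker`, `Victim`, and so on) are hypothetical choices made for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Argument:
    """A text span that fills a semantic role in an event (hypothetical schema)."""
    text: str      # the span as it appears in the sentence
    role: str      # semantic role, e.g. "Attacker" (illustrative label)
    arg_type: str  # argument type, e.g. "Organization" (illustrative label)


@dataclass
class EventMention:
    """One event mention: an event type, its trigger word, and its arguments."""
    event_type: str  # illustrative label, not taken from the chapter
    trigger: str     # the word or phrase that signals the event
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class AnnotatedSentence:
    """A sentence together with the event mentions annotated in it."""
    text: str
    events: List[EventMention] = field(default_factory=list)

    def is_event_sentence(self) -> bool:
        # Assumption for this sketch: a sentence counts as an event sentence
        # if at least one annotated trigger actually occurs in its text.
        return any(event.trigger in self.text for event in self.events)

    def roles(self) -> List[str]:
        # Collect the roles of all arguments, in annotation order.
        return [arg.role for event in self.events for arg in event.arguments]


# Invented example sentence and annotation, for illustration only.
example = AnnotatedSentence(
    text="Hackers stole customer records from the retailer.",
    events=[
        EventMention(
            event_type="Attack.Databreach",
            trigger="stole",
            arguments=[
                Argument("Hackers", "Attacker", "Person"),
                Argument("customer records", "Compromised-Data", "Data"),
                Argument("the retailer", "Victim", "Organization"),
            ],
        )
    ],
)

plain = AnnotatedSentence(text="The weather was mild on Tuesday.")
```

Here `example.is_event_sentence()` is true because the annotated trigger occurs in the sentence, while `plain` has no annotated events and is therefore not treated as an event sentence. A real pipeline would predict the triggers and arguments rather than read them from hand-written annotations.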

While earlier research on cyber-attack event analysis has been conducted (Qiu et al. 2016; Khandpur et al. 2017), this research, to our knowledge, covers the broadest variety and complexity of cyber security incidents. We begin by providing definitions of cyber security events and the common terminology used in our event detection task. We then describe our model's overall design and each of its components. After that, we present our evaluation and experimental findings, as well as our current and future research.
