sl-LSTM: A Bi-Directional LSTM With Stochastic Gradient Descent Optimization for Sequence Labeling Tasks in Big Data

Nancy Victor (Vellore Institute of Technology, India) and Daphne Lopez (Vellore Institute of Technology, India)
Copyright: © 2020 | Pages: 16
DOI: 10.4018/IJGHPC.2020070101

Abstract

The growing volume of data, arriving in diverse formats from many sources, has given rise to a new trend in the digital world: Big Data. This article proposes sl-LSTM (sequence labeling LSTM), a neural network architecture that combines the strengths of typical LSTM models to perform sequence labeling tasks. It is a bi-directional LSTM that uses stochastic gradient descent optimization and combines two features of existing LSTM variants: coupled input-forget gates, which reduce computational complexity, and peephole connections, which allow all gates to inspect the current cell state. The model is tested on different datasets, and the results show that integrating various neural network models can further improve the efficiency of the approach for identifying sensitive information in Big Data.

Named Entity Recognition

Named Entity Recognition (NER) is an information extraction technique in machine learning that identifies certain types of entities in text using annotations. It plays a major role in answering real-world queries, such as whether a tweet mentions a particular person's name or location, or what sentiment is expressed about a particular product. Figure 1 shows the various Named Entity Recognition and Information Extraction (IE) techniques (Christopher, Prabhakar, & Hinrich, 2008).

Figure 1. NER techniques

The efficiency of the NER approach can be evaluated using the following measures (Powers, 2011). TP, FP and FN refer to “True positives”, “False positives” and “False negatives,” respectively.

Precision (P): This refers to the ratio of correctly predicted entities to all the entity predictions.

$$P = \frac{TP}{TP + FP} \tag{1}$$

Recall (R): This refers to the ratio of correctly predicted entities to all the real entities.

$$R = \frac{TP}{TP + FN} \tag{2}$$

F-Measure (F): This is the harmonic mean of precision and recall, which averages the two ratios in a suitable manner.

$$F = \frac{2 \cdot P \cdot R}{P + R} \tag{3}$$
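The three measures can be illustrated with a short Python sketch. It is not taken from the article: the function names, the representation of entities as (start, end, type) tuples, the exact-match criterion, and the convention of returning 0.0 when a denominator is zero are all assumptions made here for illustration.

```python
def count_matches(gold, predicted):
    """Count TP, FP and FN at the entity level.

    Assumption: an entity is a (start, end, type) tuple, and a prediction
    is correct only if it matches a gold entity exactly.
    """
    gold, predicted = set(gold), set(predicted)
    tp = len(gold & predicted)   # predicted entities that are real
    fp = len(predicted - gold)   # predicted entities that are not real
    fn = len(gold - predicted)   # real entities that were missed
    return tp, fp, fn


def precision_recall_f(tp, fp, fn):
    """Compute precision, recall and F-measure from TP, FP and FN.

    Assumption: a measure with a zero denominator is reported as 0.0.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical toy data: three real entities, four predictions.
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 10, "ORG")]
predicted = [(0, 2, "PER"), (5, 6, "ORG"), (9, 10, "ORG"), (12, 13, "LOC")]

tp, fp, fn = count_matches(gold, predicted)
p, r, f = precision_recall_f(tp, fp, fn)
print(f"TP={tp} FP={fp} FN={fn}  P={p:.3f} R={r:.3f} F={f:.3f}")
```

In this toy case two of the four predictions are correct and one real entity is missed, so precision is 2/4, recall is 2/3, and the F-measure is 4/7. The entity at positions 5 to 6 counts as both a false positive and a false negative because its predicted type differs from the gold type.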
