A Framework for Interfacing Unstructured Data Into Business Process From Enterprise Social Networks

Amjed Al-Thuhli (Sultan Qaboos University, Oman), Mohammed Al-Badawi (Sultan Qaboos University, Oman), Youcef Baghdadi (Sultan Qaboos University, Oman) and Abdullah Al-Hamdani (Sultan Qaboos University, Oman)
DOI: 10.4018/978-1-5225-8182-6.ch035

The growing number of Enterprise Social Network (ESN) business applications has had a major impact on improving organizations' business processes by bringing human interaction into these processes. However, these applications generate unstructured data, which makes it difficult to offer the data as web services in a service-oriented architecture (SOA) environment and, in turn, negatively affects the business process. In this context, the authors propose a framework to interface ESN unstructured data into business processes (BP) using text mining techniques. Term frequency-inverse document frequency (TF-IDF) is used as the weighting scheme in this framework. Cosine similarity and k-means are then used to find similar values across documents and to cluster documents into groups, respectively. The evaluation of the framework shows promising results for retrieving unstructured social data. These results can be published to the SOA enterprise service bus using RESTful web services.
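The text mining steps named in the abstract (TF-IDF weighting, cosine similarity, k-means clustering) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the whitespace tokenizer, the smoothed IDF formula, the farthest-first centroid initialization, and all function names and sample posts are assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; a real pipeline would
    # likely add stop-word removal and stemming.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append([counts[t] / length * idf[t] for t in vocab])
    return vocab, vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(vectors, k, max_iter=50):
    """Plain k-means; returns one cluster label per vector."""
    # Assumption: deterministic farthest-first initialization so the
    # sketch is reproducible without a random seed.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors,
                       key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for each vector.
        new_labels = [min(range(k), key=lambda c: sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Hypothetical ESN posts, invented for this example.
posts = [
    "invoice approval delayed by finance team",
    "finance team approved the invoice",
    "new laptop request for the design team",
    "design team laptop request pending",
]
vocab, vectors = tfidf_vectors(posts)
labels = kmeans(vectors, k=2)
print(labels)
print(round(cosine(vectors[0], vectors[1]), 3))
```

In this toy input the two finance posts share more weighted terms with each other than with the laptop posts, so they score higher on cosine similarity and fall into the same cluster. The cluster labels could then be exposed through a web service, although that step is outside this sketch.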
Chapter Preview

In recent years, much research has been conducted in the field of text mining and social networks. Luo et al. (2016) introduced the double reading (DRESS) system, which uses cloud-based technologies to extract structured data elements from unstructured medical records. The system comprises three subsystems (LinkMR, LinkCore and LinkQC) that provide data acquisition, data security and data sharing. The authors validated their research with Kappa statistics to measure blinded reproducibility for discrete and continuous variables. As a result, DRESS showed a high reproducibility of 98% in a study of 100 patients. It turns unstructured data into semi-structured big data by processing thousands of medical records in a few days.

Hong et al. (2016) used the cTAKES clinical Natural Language Processing (NLP) tool to annotate events and detect sentences. They developed a framework to transform free-text diagnostic criteria into a structured Quality Data Model. Attributes of clinical diagnosis events were annotated and classified by a machine learning algorithm based on Conditional Random Fields (CRF). The authors hypothesize that their approach can be generalized to other clinical knowledge. However, their work was limited to specific experiments on diagnostic criteria, and they followed the cTAKES default configuration, which affected the performance of converting free text into structured formats.

Sampathkumar, Chen and Luo (2014) used text mining to extract knowledge from unstructured text data. They present a system with three main modules: an information retrieval module, a text processing module and an information extraction module. The information retrieval module uses a web crawler to extract useful information from data sources such as online forums. The text processing module processes the text data using Natural Language Processing (NLP) tools. Finally, the information extraction module extracts relationships between entities of interest. The authors use a Hidden Markov Model-based text mining system to forecast the relationship between a drug and an adverse side effect.
