1. Introduction
An increasing amount of data is being made available online. It can be exploited to inform data analytics and Decision Support Systems (DSS) for a variety of applications, such as those in the financial services domain. However, this online data is diverse in volume and complexity, largely unstructured, and written in natural human languages, which makes manual exploitation by end users very difficult. Automated Information Extraction (IE) techniques are therefore needed to extract useful information and represent it in a machine-understandable semantic model. However, transforming largely informative unstructured text into a structured knowledgebase that can be reasoned upon to infer new knowledge, predictions or decisions of interest to a specific beneficiary group is very complex. Addressing that complexity requires in-depth expertise in utilising and integrating various methods and technologies associated with Natural Language Processing (NLP), knowledge representation and Machine Learning (ML). Recently, advances in Semantic Web Technologies (SWT) have been used extensively in data analytics and decision-support systems in several application domains, such as financial investment recommendation, clinical management, system audit management, network security management, justice and legal advice, waste-water management, power consumption management and electronic issue management.
As a result, there is a pressing need for a comprehensive framework that offers an intelligent roadmap for reconciling discrepancies in how the various contributing information sources present knowledge, and for delivering intelligent query methods against the extracted information and its semantic model. Such a framework, in the authors' view, should benefit from knowledge of the problem domain, which can assist the fundamental NLP tasks of Named Entity Recognition (NER) and Relation Extraction.
Domain knowledge is knowledge about a specific field or subject of interest that is understood by practitioners in that field. Compiling this knowledge requires in-depth analysis of the characteristics of the problem domain. These characteristics may concern the grammar and the meaning of words in the context of a sentence structure, or the style of the language used in the domain. Understanding these characteristics is crucial so that they can be engineered as linguistic or structural features. These features can then be employed in the implementation of IE systems using a variety of approaches, such as rule-based or ML-based ones (Aljamel, 2018).
In this paper, a knowledge-based framework is proposed that draws on the authors' extensive research and development efforts in building a knowledge-driven financial recommender system (Aljamel, 2018). The framework adopts SWT for domain knowledge representation because they can represent the problem domain in a highly structured knowledge model (an ontology) that enables software agents to comprehend domain-related information, and thus helps automate the extraction of concepts and relations relevant to the domain of interest. The semantic ontology is formally expressed using the standardised Semantic Web languages, namely Resource Description Framework (RDF), RDF Schema (RDFS) and Web Ontology Language (OWL), and facilitates the inference of new facts from the extracted and semantically tagged information to support decision-making and knowledge exploration activities. Furthermore, because the targeted domain-specific knowledge is heavily represented by non-binary relations, the authors have investigated how to represent these relations in the domain-specific ontology model by using N-ary relation patterns (Hogan, 2020).
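To illustrate the general idea behind an N-ary relation pattern, the following minimal sketch expands one relation with several participants into binary triples around an intermediate relation node. It is an illustration only and is not taken from the authors' ontology: the `ex:` identifiers, the acquisition example, its values, and the helper names `nary_relation` and `participants` are all hypothetical, and plain Python tuples stand in for an RDF library.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A binary statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def nary_relation(node_id: str, relation_class: str, roles: Dict[str, str]) -> List[Triple]:
    """Expand one n-ary relation into binary triples.

    The relation itself becomes an intermediate node (node_id), typed with
    relation_class, and every participant is attached to it by its role.
    """
    triples = [Triple(node_id, "rdf:type", relation_class)]
    for role, value in roles.items():
        triples.append(Triple(node_id, role, value))
    return triples


def participants(triples: List[Triple], node_id: str) -> Dict[str, str]:
    """Collect the role -> participant mapping of one relation node."""
    return {
        t.predicate: t.obj
        for t in triples
        if t.subject == node_id and t.predicate != "rdf:type"
    }


# Hypothetical four-participant fact: who acquired whom, for how much, and when.
acquisition = nary_relation(
    "ex:acquisition1",
    "ex:Acquisition",
    {
        "ex:acquirer": "ex:CompanyA",
        "ex:target": "ex:CompanyB",
        "ex:price": '"1000000"',
        "ex:date": '"2018-01-15"',
    },
)

for t in acquisition:
    print(t.subject, t.predicate, t.obj, ".")
```

A single binary property such as a hypothetical `ex:acquired` could link only the two companies; the intermediate node is what allows the price and the date to be attached to the same fact.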
The proposed knowledge-based framework presents a comprehensive methodology for IE and exploration that comprises the following processes: analysing and modelling the domain knowledge, extracting information from unstructured data, constructing the semantic knowledgebase, enriching the semantic knowledgebase and, lastly, exploiting the resulting semantic knowledgebase by intelligently exploring and processing it to support decision-making. Delivering these processes requires the integration of several diverse technologies, including NLP, knowledge representation, ML and evolutionary optimisation algorithms.