Introduction
The development of Artificial Intelligence (AI) technologies has made daily life much easier than before. For instance, location-based mobile applications help us find the nearest parking space to a favourite restaurant. In the business domain, Business Intelligence (BI) services assist us in making sound business decisions. In the healthcare domain, AI technologies are starting to show their strength in detecting diseases at an early stage, minimizing the risk of further development. We expect AI technologies to become one of the most critical future development areas for enhancing human healthcare. As a result, disease research and pre-diagnosis applications based on symptoms and lifestyle have started to show great potential for supporting intelligent self-healthcare systems. However, many research problems remain open in the current trend of rapid AI deployment, which applies cutting-edge technologies such as Deep Learning algorithms and Natural Language Processing (NLP). Some key issues are (but are not limited to): data trust/quality (European Union Agency for Fundamental Rights, 2019), security (Pardeep, Masud, Gaba et al., 2021), transparent prediction (Knight, 2017), and, importantly, causal analysis (Vorhies, 2019). In contrast to general-domain applications, these open issues are crucial in the healthcare domain. For instance, consider the findings that 'children eating breakfast will avoid teen obesity' (Warner, 2008) and that 'eating yoghurt would reduce the chance of developing precancerous adenomas by 19%, but only in men' (Zheng, Wu, Song, Ogino, Fuchs & Chan, 2019). Both studies only explain associations/correlations discovered from data observations; they provide no evidence of the possible reasons 'why'. Recently, causality research in the ML community has shown that a causal machine learning approach can improve the accuracy of medical diagnosis (Richens, Lee & Johri, 2020).
Therefore, knowledge extraction and modelling should be considered an important step for enhancing ML outcomes and expandability, rather than focusing only on raw data engineering. As the Semantic Web/Knowledge Graph research community grows, knowledge data and its semantic representations are becoming available in the semantic cloud. We believe there are enough semantic resources to support causality inference and transparent probability calculations in collaboration with ML algorithms. For example, we can build a Semantic Knowledge Base (SKB) that represents the relations of a particular group of diseases to symptoms, affected anatomical structures, most affected groups (age, gender, location), lifestyle effects and drug side effects. The research work presented in this paper is motivated by such ideas and case studies.
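To make the SKB idea concrete, the following is a minimal sketch of how such relations might be held as subject-relation-object triples and queried. It is only an illustration under stated assumptions: the entity names (DiseaseA, DrugX), the relation names (hasSymptom, affectsAnatomy, mostAffectedGroup, hasSideEffect) and the helper functions are hypothetical, and the triples are placeholder data rather than medical facts or the framework proposed in this paper.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples; placeholder data only,
# not medical facts and not the schema used by the proposed framework.
TRIPLES = [
    ("DiseaseA", "hasSymptom", "fever"),
    ("DiseaseA", "hasSymptom", "cough"),
    ("DiseaseA", "affectsAnatomy", "lung"),
    ("DiseaseA", "mostAffectedGroup", "age_65_plus"),
    ("DiseaseB", "hasSymptom", "cough"),
    ("DiseaseB", "affectsAnatomy", "throat"),
    ("DrugX", "hasSideEffect", "cough"),
]


def build_index(triples):
    """Index subjects by (relation, object) so lookups avoid a full scan."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(relation, obj)].add(subject)
    return index


def subjects_with(index, relation, obj):
    """Return the sorted subjects linked to `obj` through `relation`."""
    return sorted(index.get((relation, obj), set()))


index = build_index(TRIPLES)

# Which diseases list 'cough' as a symptom?
diseases_with_cough = subjects_with(index, "hasSymptom", "cough")

# Which drugs list 'cough' as a side effect (an alternative explanation)?
drugs_causing_cough = subjects_with(index, "hasSideEffect", "cough")

print(diseases_with_cough)
print(drugs_causing_cough)
```

Keeping symptoms and drug side effects in the same graph means that one observation, such as a cough, can be traced to several candidate explanations. This is the kind of relational context that raw data alone does not supply.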
The distinct contribution of this paper is a novel semantic modelling framework that generates causality and probability graphs from healthcare information on the Web. The resulting causality knowledge graph data will support more advanced knowledge-based data analysis to address trust, transparency and causality analysis issues. In addition, this paper extends and explains in more detail the early research outcomes published in (Yu, 2020; Yu, 2021). The major extensions are the merging of two separate research methodologies to provide a more inclusive view of the proposed framework, and additional data evaluations.
Section 2 reviews current ML methods and applications for disease recommendation and discusses their critical limitations. Section 3 illustrates the proposed framework and its components. Section 4 demonstrates the benefits of applying the proposed framework in the healthcare AI research domain, using our experimental and evaluation results. The final section presents the conclusion and future work.