Relative Relations in Biomedical Data Classification

Copyright: © 2023 | Pages: 11
DOI: 10.4018/978-1-7998-9220-5.ch161

Abstract

Advances in data science continue to improve the precision of biomedical research, and machine learning solutions are increasingly enabling the integration and exploration of molecular data. Recently, there has been a strong need for a “white box” approach: a comprehensive machine learning model that can actually reveal and evaluate patterns with diagnostic or prognostic value in omics data. In this article, the authors focus on algorithms for biomedical analysis in the field of explainable artificial intelligence. In particular, they present computational methods that address the concept of relative expression analysis (RXA). Classification algorithms that apply this idea examine the interactions among genes/molecules to study their relative expression (i.e., the ordering among the expression values, rather than the absolute expression values). One then searches for characteristic perturbations in this ordering from one phenotype to another. The authors cover the concept of RXA, the challenges of biomedical data analysis, and the innovations that relative relationship-based algorithms bring.
Chapter Preview

Background

Data mining is an umbrella term covering a broad range of tools and techniques for extracting hidden knowledge from large quantities of data. Biomedical data can be very challenging to analyze because of its enormous dimensionality, biological and experimental noise, and other perturbations. In the literature, nearly all standard, off-the-shelf techniques, such as neural networks, random forests, SVMs, and linear discriminant analysis, were initially designed for purposes other than omics data (Bacardit, Widera, et al., 2014). When applied to omics data, the resulting prediction models usually involve nonlinear functions of hundreds or thousands of features and many parameters, and therefore constrain the process of uncovering new biological understanding, which, after all, is the ultimate goal of data-driven biology. Deep learning approaches have also been attracting attention (Min, Lee & Yoon, 2016), as they can better recognize complex features through representation learning with multiple layers and can facilitate integrative analysis by effectively addressing the challenges discussed above. However, very little is known about how such results are derived internally. This lack of insight into how 'black box' systems discover knowledge impedes biological understanding and is an obstacle to mature applications.

Key Terms in this Chapter

Multi-Omics Data: Data from different omic groups combined during analysis. The omic layers typically combined in multi-omics studies are the genome, proteome, transcriptome, epigenome, and microbiome.

Ordering Relation: The most typical relation in RXA, which involves a simple ordering relation between two (or more) genes that constitute a rule, for example a test of whether the expression value of one gene is greater than that of another.
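An ordering relation of this kind can be illustrated with a small sketch. The code below is not the chapter's algorithm; it assumes a two-class problem and scores each gene pair by how differently the test "gene i is expressed above gene j" behaves in the two classes, in the spirit of top-scoring-pair classifiers. The function names `pair_score` and `best_ordering_pair` and the toy data are hypothetical.

```python
def pair_score(samples, labels, i, j):
    """Score the rule 'expression[i] > expression[j]'.

    The score is the absolute difference between the fraction of
    class-0 samples and the fraction of class-1 samples in which the
    ordering holds (assumed scoring, chosen for illustration).
    """
    freq = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        freq[cls] = sum(r[i] > r[j] for r in rows) / len(rows)
    return abs(freq[0] - freq[1])


def best_ordering_pair(samples, labels):
    """Return the gene pair (i, j), i < j, with the highest score."""
    n_genes = len(samples[0])
    pairs = ((i, j) for i in range(n_genes) for j in range(i + 1, n_genes))
    return max(pairs, key=lambda p: pair_score(samples, labels, *p))


# Toy expression matrix: rows are samples, columns are genes.
samples = [
    [5.0, 1.0, 3.0],
    [6.0, 2.0, 7.0],
    [1.0, 4.0, 3.0],
    [0.5, 5.0, 6.0],
]
labels = [0, 0, 1, 1]

best = best_ordering_pair(samples, labels)
```

Because only the ordering of the two values matters, such a rule is unaffected by any monotone rescaling applied to a whole sample, which is one reason relative relations are attractive for noisy expression data.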

Top-Down Induction: A recursive method of decision tree generation. It starts with the entire input dataset in the root node, where a locally optimal test for splitting the data is searched for and branches corresponding to the test outcomes are created. The test search and data splitting are then repeated in the newly created nodes until a stopping condition is met.
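The recursive procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the chapter's implementation: splits are ordering tests between two genes, the locally optimal test is the one with the lowest weighted Gini impurity, and recursion stops at pure nodes, at a depth limit, or when no test improves impurity. All names (`induce`, `best_test`, `predict`) and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_test(rows, labels):
    """Find the pair (i, j) whose test row[i] > row[j] minimizes the
    weighted Gini impurity; return None if no test improves on the node."""
    best, best_imp = None, gini(labels)
    n_genes = len(rows[0])
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            yes = [y for r, y in zip(rows, labels) if r[i] > r[j]]
            no = [y for r, y in zip(rows, labels) if not r[i] > r[j]]
            if not yes or not no:
                continue  # the test does not split the data
            imp = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
            if imp < best_imp:
                best, best_imp = (i, j), imp
    return best


def induce(rows, labels, depth=0, max_depth=3):
    """Recursively build a tree of ordering tests, starting at the root."""
    test = None
    if depth < max_depth and len(set(labels)) > 1:
        test = best_test(rows, labels)
    if test is None:  # stopping condition: make a majority-class leaf
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    i, j = test
    yes = [(r, y) for r, y in zip(rows, labels) if r[i] > r[j]]
    no = [(r, y) for r, y in zip(rows, labels) if not r[i] > r[j]]
    return {
        "test": (i, j),
        "yes": induce([r for r, _ in yes], [y for _, y in yes], depth + 1, max_depth),
        "no": induce([r for r, _ in no], [y for _, y in no], depth + 1, max_depth),
    }


def predict(tree, row):
    """Follow the ordering tests from the root down to a leaf."""
    while "leaf" not in tree:
        i, j = tree["test"]
        tree = tree["yes"] if row[i] > row[j] else tree["no"]
    return tree["leaf"]


rows = [[5.0, 1.0, 3.0], [6.0, 2.0, 7.0], [1.0, 4.0, 3.0], [0.5, 5.0, 6.0]]
labels = ["A", "A", "B", "B"]
tree = induce(rows, labels)
```

Each split is chosen greedily for the current node only, which is what distinguishes this scheme from approaches that search the whole tree at once.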

Global Induction: A method of decision tree generation in which both the tree structure and all tests are searched for at the same time. In contrast to top-down induction, it is usually based on an evolutionary approach.

Weight Relation: A more advanced relation in RXA, which involves a weighted ordering relation between two (or more) genes that constitute a rule.

Decision Tree: A decision tree is a graph that uses a branching method to illustrate every possible outcome of a decision. Each branch of the decision tree represents a possible decision or occurrence. The tree structure shows how one choice leads to the next, and the use of branches indicates that each option is mutually exclusive.
