Integration of Multi-Omics Data to Identify Cancer Biomarkers

Peng Li, Bo Sun
Copyright: © 2022 | Pages: 15
DOI: 10.4018/JITR.2022010105

Abstract

A novel method for integrating multi-omics data, including gene expression, copy number variation, DNA methylation, and miRNA data, is proposed to identify biomarkers of cancer prognosis. First, survival analysis was performed on each of the four omics data types to obtain survival-related genes. Next, genes found to be survival-related in at least two omics data types were selected as candidate genes. The four omics datasets, restricted to the candidate genes, were then subjected to dimension reduction using an autoencoder to obtain a one-dimensional data representation. The mRMR (minimum redundancy maximum relevance) algorithm was used to screen for key genes. This method was applied to lung squamous cell carcinoma, and 20 cancer-related genes were identified. Gene function analysis revealed that the genes were related to cancer. Survival analysis verified that the genes distinguish between high- and low-risk groups. These results indicate that the genes can be used as biomarkers for cancer.
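The candidate-selection step described in the abstract (keeping genes that are survival-related in at least two omics data types) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_candidate_genes`, the `min_omics` parameter, and the placeholder gene labels are all assumptions made for illustration, and the per-omics gene sets are assumed to come from a survival analysis that has already been run.

```python
from collections import Counter


def select_candidate_genes(survival_genes_by_omics, min_omics=2):
    """Return genes flagged as survival-related in at least `min_omics` omics types.

    `survival_genes_by_omics` maps an omics type name to the collection of
    genes that a prior survival analysis marked as survival-related.
    (Hypothetical helper; names and structure are illustrative assumptions.)
    """
    counts = Counter()
    for genes in survival_genes_by_omics.values():
        # set() ensures a gene is counted at most once per omics type
        counts.update(set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_omics)


# Placeholder gene labels, not results from the article
survival_genes = {
    "gene_expression": {"GENE_A", "GENE_B", "GENE_C"},
    "copy_number_variation": {"GENE_B", "GENE_D"},
    "dna_methylation": {"GENE_A", "GENE_B"},
    "mirna": {"GENE_E"},
}

candidates = select_candidate_genes(survival_genes)
print(candidates)  # ['GENE_A', 'GENE_B']
```

In this toy input, GENE_A appears in two omics types and GENE_B in three, so both pass the threshold; the remaining genes appear in only one type and are dropped. The later steps (autoencoder dimension reduction and mRMR screening) would then operate only on the retained genes.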

Introduction

Cancer is a complex disease that is not only controlled by individual genes and genetic factors but is also related to the environment and living habits. These factors affect gene expression and thereby influence the occurrence and development of cancer. Biomarkers, such as genes, miRNAs, proteins, and metabolites, are biological entities that can indicate whether cells, tissues, or individuals are normal or diseased (Ideker & Sharan, 2008). In the medical field, biomarkers can help diagnose diseases, predict disease progression, and predict how patients will respond to treatment, thereby enabling precise and effective treatment. To date, no effective diagnosis and treatment methods have been established for many types of cancer. Therefore, identifying biomarkers that recognize the early characteristics of cancer and determining the mechanisms of cancer occurrence and development are vital.

Traditional cancer biomarkers, such as carcinoembryonic antigens and tumor tissue images, can only detect cancer in the late stages and are not useful for the treatment of patients with cancer. The cure and survival rates of patients with cancer are relatively low, so early detection and timely treatment are necessary to improve them.

The emergence of next-generation sequencing technology has greatly accelerated cancer research. The use of gene expression data to identify cancer-related genes and biomarkers has accelerated progress toward individualized treatment (Dancik, 2015). Some studies used gene expression data to distinguish between normal and tumor samples (Nannini et al., 2009). Other studies used gene expression data to detect different states of cancer development (van’t Veer et al., 2002; Klahan et al., 2016). However, because gene expression datasets often contain few samples and considerable noise, using only gene expression data limits the discovery of new candidate cancer genes.

In general, gene expression can be regulated by heterogeneous multi-level regulatory factors such as copy number, DNA methylation, transcription factors, and miRNAs (Cancer Genome Atlas Research N, 2012; 2013). High-throughput sequencing can be performed to accurately obtain various biological data at various stages of organism development. These data are collectively referred to as multi-omics data (Reuter et al., 2015) and include multiple types of datasets, such as genomics, epigenomics, transcriptomics, proteomics, metabolomics, and microbiomics data. Using various omics techniques, we are able to understand diseases from a variety of perspectives. Many studies have used DNA methylation, microRNA (miRNA), protein-protein interaction network (PPIN), or other data to identify cancer-related biomarkers (Zhao et al., 2017; Capper et al., 2018; Liu et al., 2017; Zhou et al., 2016; Wu et al., 2014). However, most methods do not effectively integrate multi-omics data to identify cancer-related genes and biomarkers. Although the use of single-omics data to identify cancer-related genes has yielded many valuable results, a single data source does not provide complete information about a gene, and the results are significantly affected by noise.
