Biological Background

DOI: 10.4018/978-1-60960-557-5.ch001


A Little Bit Of Biology

As this book treats microarray gene expression based cancer classification as an application, let us briefly consider biological aspects of this task.

Let us first recall the “central dogma” of biology as described in (Cohen, 2007). Deoxyribonucleic acid, or DNA, is a nucleic acid that stores the genetic information needed for the development and functioning of all living beings. DNA is used to construct proteins in the following way: a section of DNA called a gene is transcribed into a molecule called messenger RNA (ribonucleic acid), or mRNA, which is then translated into a protein by a ribosome. Proteins carry out most functions of cells, such as regulation of translation and transcription, and DNA replication. DNA and mRNA molecules are sequences of four different nucleotides. Proteins are sequences of twenty different amino acids.

After the protein is constructed, the gene is said to be expressed. Transcription and translation can be considered as two transformations, the first applied to DNA and the second applied to mRNA. The entire process can be expressed as follows: DNA $\rightarrow$ mRNA $\rightarrow$ protein. As you can see, genes, being essential parts of DNA, play an important role in this process. Gene expression is often viewed as the process of protein synthesis (though proteins are not the only possible products of gene expression). By monitoring gene expression, one can get an indirect estimate of protein abundance, which is important for determining biological function.
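The two transformations can be illustrated with a minimal Python sketch. It is not taken from the book: the function names `transcribe` and `translate` are hypothetical, the input is assumed to be the coding (sense) strand of a gene so that transcription reduces to replacing T with U, and translation is assumed to start at the first nucleotide and use the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# ('*' marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand):
    """DNA -> mRNA. Assumes the coding strand is given, so T simply becomes U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """mRNA -> protein, reading codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the protein
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTTTTAA")
protein = translate(mrna)
```

Here the toy sequence `ATGGCCTTTTAA` becomes the mRNA `AUGGCCUUUUAA` and then the three-residue protein `MAF`. Real transcription and translation involve many steps (for example, locating the start codon) that this sketch deliberately ignores.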

The expressed genes within mammalian cells can be divided into housekeeping and tissue-specific ones (Weinberg, 2007). Housekeeping genes are responsible for maintaining viability of all cell types in the body; they carry out biological functions common to all cell types. On the other hand, the tissue-specific genes produce proteins that are specifically associated with a given tissue.

Microarray technology makes it possible to measure the expression levels of many genes at once. Thanks to this technology, we are nowadays flooded with such measurements, although they are typically made for only a few samples (the reason for this will be discussed in the next chapter).

A DNA microarray is an array of thousands of locations, each containing DNA for a different gene (Cohen, 2007), (Li, Tseng, & Wong, 2003). This array, or slide (glass or plastic), contains a large library of thousands of single-stranded cDNA (complementary DNA, i.e. DNA complementary to mRNA) clones (probes) corresponding to different genes, i.e. each spot on the slide corresponds to a specific gene. A typical use of microarrays is to extract two mRNA samples from two cell cultures or tissues (e.g., normal and cancerous), separately reverse transcribe them into cDNA, and, using fluorescence labeling, dye the cDNA in these samples red (for the sample extracted from cancerous tissue) and green (for the sample extracted from healthy tissue) (Cohen, 2007). Both samples are then spread across the microarray and left to hybridize: each labeled cDNA tries to bind to its complementary cDNA on the microarray to form a double-stranded molecule. Hybridization thus acts as a detector of the presence of a certain gene. The slide is then scanned to obtain numerical intensities of each dye. The result of scanning is an image. Finally, image processing is used to determine the color at each location of the array. Four outcomes are possible: if a gene is expressed in both samples, the color will be yellow; if it is expressed in neither sample, the color will be black; if it is expressed in only one sample, the color will be either red or green. The intensity of a color indicates the level of expression, i.e. the number of mRNA molecules transcribed. Given two differently dyed cDNA samples, the goal is to compare the intensity values $R$ and $G$ of the red and green channels at each spot of the microarray. The most popular statistic is the intensity log-ratio $\log(R/G)$ (Kohane, Kho, & Butte, 2003), (Speed, 2003), (Drăghici, 2003).
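
As a rough illustration of the comparison just described, the following Python sketch computes the log-ratio for one spot and assigns it one of the four colors. Writing R and G for the red and green channel intensities, it assumes a base-2 logarithm (the base is not stated here, although base 2 is the usual convention for microarray data). The function names and the `threshold` value that separates "expressed" from "not expressed" are hypothetical and chosen only for the example.

```python
import math


def log_ratio(red, green):
    """Intensity log-ratio of the red and green channels (base 2 assumed)."""
    return math.log2(red / green)


def spot_color(red, green, threshold=100.0):
    """Classify a spot by which channels exceed a hypothetical threshold."""
    red_on = red >= threshold
    green_on = green >= threshold
    if red_on and green_on:
        return "yellow"  # expressed in both samples
    if red_on:
        return "red"     # expressed only in the red-labeled (cancerous) sample
    if green_on:
        return "green"   # expressed only in the green-labeled (healthy) sample
    return "black"       # expressed in neither sample


# A spot four times brighter in red than in green has log-ratio 2;
# swapping the channels gives -2, so the statistic is symmetric around 0.
up = log_ratio(800.0, 200.0)
down = log_ratio(200.0, 800.0)
```

With base 2, a log-ratio of 0 means equal intensity in both channels, and each unit corresponds to a doubling (positive) or halving (negative) of the red intensity relative to the green one. In practice, raw intensities are usually background-corrected and normalized before such ratios are taken, which this sketch omits.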
