Genetic Code and Stochastic Matrices

Sergey Petoukhov (Russian Academy of Sciences, Russia) and Matthew He (Nova Southeastern University, USA)
DOI: 10.4018/978-1-60566-124-7.ch005


In this chapter, we first use the Gray code representation of the genetic code, C = 00, U = 10, G = 11, and A = 01 (C pairs with G, A pairs with U), to generate a sequence of genetic code-based matrices. From these code-based matrices, we use the Hamming distance to generate a sequence of numerical matrices. We then investigate the properties of the numerical matrices and show that they are doubly stochastic and symmetric. We determine the frequency distributions of the Hamming distances, the building blocks of the matrices, and their decompositions and iterations. We present an explicit decomposition formula for the genetic code-based matrix in terms of permutation matrices. Furthermore, we establish a relation between the genetic code and a stochastic matrix based on the hydrogen bonds of DNA. Using fundamental properties of stochastic matrices, we determine explicitly the decomposition formula of the genetic code-based biperiodic table. By iterating the stochastic matrix, we demonstrate the symmetrical relations between the entries of the matrix and DNA molar concentration accumulation. The evolution matrices based on the genetic code are derived by using hydrogen bonds-based symmetric stochastic (2x2)-matrices as primary building blocks. The fractal structure of the genetic code and of the stochastic matrices is illustrated in the process of matrix decomposition, iteration and expansion, corresponding to the fractal structure of the biperiodic table introduced by Petoukhov (2001a, 2001b, 2005).

Introduction and Background

The universal genetic code may be viewed as the mapping of nucleic acids into polypeptides that is employed in every organism, organelle and virus with some minor variations. A mathematical view of the genetic code is a map

$g: C \to A$, (1)

where $C = \{x_1 x_2 x_3 : x_i \in R = \{A, C, G, U\}\}$ denotes the set of codons and $A = \{\mathrm{Ala}, \mathrm{Arg}, \mathrm{Asp}, \ldots, \mathrm{Val}, \mathrm{UAA}, \mathrm{UAG}, \mathrm{UGA}\}$ denotes the set of amino acids and termination codons. Genetic determinism, which presents the belief that we are controlled by our genes and that no other factor is significant, is now all-pervasive. This viewpoint is emphasized by the statement: “life is a partnership between genes and mathematics” (Stewart, 1999, p. xi).
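As a small illustration of the domain of the map in expression (1), the sketch below enumerates the codon set as all triplets over {A, C, G, U}. The names `NUCLEOTIDES`, `codons`, `STOP_CODONS` and `sense_codons` are chosen here for illustration and do not come from the chapter; the three termination codons are the ones listed in the text.

```python
from itertools import product

# The nucleotide alphabet R = {A, C, G, U} from expression (1).
NUCLEOTIDES = ("A", "C", "G", "U")

# The codon set C: every ordered triplet x1 x2 x3 over the alphabet.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

# Termination codons as listed in the target set A of the map g.
STOP_CODONS = {"UAA", "UAG", "UGA"}

# Codons that are mapped to an amino acid rather than to a termination signal.
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]

print(len(codons))        # 4**3 = 64 codons in total
print(len(sense_codons))  # 64 - 3 = 61 codons left for amino acids
```

Since the target set contains far fewer elements than the 64 codons, the map g is necessarily many-to-one.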

We recall some basic definitions. A square matrix $P = (p_{ij})$ is a stochastic matrix if all of its entries are nonnegative and the sum of the elements in each row (or in each column) is unity or some other fixed constant. If the sums of the elements in each row and in each column are all unity, or all equal to the same constant, the matrix is called doubly stochastic. The term “stochastic matrix” goes back at least to Romanovsky (1931). Stochastic matrices play a large role in the theory of discrete Markov chains. Stochastic matrices and doubly stochastic matrices have many remarkable properties. For example, the Birkhoff–von Neumann Theorem says that every doubly stochastic matrix is a convex combination of permutation matrices of the same order, and that the permutation matrices are the extreme points of the set of doubly stochastic matrices. The properties of stochastic matrices are mainly spectral theoretic and are motivated by Markov chains. Doubly stochastic matrices have additional combinatorial structure.
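The definition above can be checked mechanically. The following is a minimal sketch, not taken from the chapter: `is_doubly_stochastic` and `convex_combination` are hypothetical helper names, the weights 2/3 and 1/3 are arbitrary, and the sums are normalized to unity. It illustrates only the easy direction of the Birkhoff–von Neumann Theorem, namely that a convex combination of permutation matrices is doubly stochastic.

```python
from fractions import Fraction

def is_doubly_stochastic(matrix):
    """True if the matrix is square, nonnegative, and every row and column sums to 1."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    if any(entry < 0 for row in matrix for entry in row):
        return False
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(matrix[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok

def convex_combination(weights, matrices):
    """Entrywise weighted sum of equally sized square matrices."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]

# The two permutation matrices of order 2.
identity = [[1, 0], [0, 1]]
swap = [[0, 1], [1, 0]]

# Arbitrary illustrative weights: nonnegative and summing to 1.
weights = [Fraction(2, 3), Fraction(1, 3)]
mixed = convex_combination(weights, [identity, swap])

print(mixed)
print(is_doubly_stochastic(mixed))
```

Exact fractions are used instead of floating-point numbers so that the row and column sums can be compared with 1 without rounding tolerance.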

The so-called Gray code is one of the best-known codes in the theory of signal processing. A Gray code was used in a telegraph demonstrated by the French engineer É. Baudot in 1878. The codes were first patented by F. Gray in 1953. The Gray code is a binary code in which consecutive decimal numbers are represented by binary expressions that differ in the state of one, and only one, bit. Gray codes have been extensively studied in other contexts; for example, they have been used in converting analog information to digital form. Here we review briefly how to construct a Gray code for each positive integer n. One way to construct a Gray code for n bits is to take a Gray code for (n-1) bits with each code word prefixed by 0 (for the first half of the code) and to append the same (n-1)-bit Gray code in reversed order with each code word prefixed by 1 (for the second half). This is called a “binary-reflected Gray code”. Figure 1 is an example of creating a 3-bit Gray code from a 2-bit Gray code.

Figure 1. Creating a 3-bit Gray code from a 2-bit Gray code
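The reflect-and-prefix construction described above can be sketched in a few lines. The function name `reflected_gray_code` is chosen here for illustration; the code words are kept as bit strings so that the prefixes are visible.

```python
def reflected_gray_code(n):
    """Return the n-bit binary-reflected Gray code as a list of bit strings."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    codes = ["0", "1"]  # the 1-bit Gray code
    for _ in range(n - 1):
        # First half: the previous code, each word prefixed by 0.
        # Second half: the previous code reversed, each word prefixed by 1.
        codes = ["0" + c for c in codes] + ["1" + c for c in reversed(codes)]
    return codes

print(reflected_gray_code(2))  # ['00', '01', '11', '10']
print(reflected_gray_code(3))  # ['000', '001', '011', '010', '110', '111', '101', '100']
```

Each pair of neighboring code words in the output differs in exactly one bit, which is the defining property of a Gray code.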

A Gray code representation of the genetic code was proposed by Swanson (1984). A representation of the genetic code as a six-dimensional Boolean hypercube was proposed by Jiménez-Montaño, Mora-Basáñez, and Pöschel (1994). In Štambuk (2000), universal metric properties of the genetic code were defined by means of the nucleotide base representation on the square with vertices U or T = 00, C = 01, G = 10 and A = 11. It was shown that this notation defines the Cantor set and Smale horseshoe map representation of the genetic code. The “Biperiodic table of the genetic code” $[C\ A;\ U\ G]^{(3)}$ (Figure 3 in Chapter 1), which has demonstrated an important symmetrical structure and has led to many discoveries, was introduced in Petoukhov (2001a, 2001b, 2005). This chapter describes stochastic characteristics of the biperiodic table on the basis of the original investigations and considerations in He (2001, 2003a, 2003b) and He, Petoukhov, and Ricci (2004).
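To make the link between a Gray code assignment and a stochastic matrix concrete, the sketch below takes the assignment stated in the chapter abstract (C = 00, U = 10, G = 11, A = 01) and tabulates the pairwise Hamming distances between the four nucleotides. This pairwise table is an assumption made here for illustration and is not claimed to be the authors' matrix construction; the names `GRAY`, `BASES`, `hamming`, `distances` and `normalized` are likewise hypothetical.

```python
from fractions import Fraction

# Gray code assignment taken from the chapter abstract.
GRAY = {"C": "00", "U": "10", "G": "11", "A": "01"}
BASES = ["C", "U", "G", "A"]  # illustrative ordering of rows and columns

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Pairwise Hamming distances between the nucleotide code words.
distances = [[hamming(GRAY[r], GRAY[c]) for c in BASES] for r in BASES]

# Every row (and column) sums to the same constant, so dividing by it
# gives a matrix whose rows and columns each sum to 1.
row_sum = sum(distances[0])
normalized = [[Fraction(d, row_sum) for d in row] for row in distances]

for base, row in zip(BASES, distances):
    print(base, row)
```

In this toy table the complementary pairs (C with G, A with U) are exactly the pairs at Hamming distance 2, and the table is symmetric with constant row and column sums, which matches the kind of property the chapter sets out to establish for its own matrices.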
