Bioinformatics is a new, rapidly expanding field that uses computational approaches to answer biological questions (Baxevanis, 2005). These questions are answered by analyzing and mining biological data. Bioinformatics, or computational biology, is a multidisciplinary research and development environment in which a variety of techniques from computer science, applied mathematics, linguistics, physics, and statistics are used. The terms bioinformatics and computational biology are often used interchangeably (Baldi, 1998; Pevzner, 2000). This new area of research is driven by the wealth of data from high-throughput genome projects, such as the human genome sequencing project (International Human Genome Sequencing Consortium, 2001; Venter, 2001). As of early 2006, 180 organisms have been sequenced, with sequencing capacity constantly increasing. Three major DNA databases collaborate and mirror over 100 billion base pairs in Europe (EMBL), Japan (DDBJ) and the USA (GenBank). The advent of high-throughput methods for monitoring gene expression, such as microarrays (Schena, 1995), has made it possible to detect the expression levels of thousands of genes simultaneously. Such data can be used to establish gene function (functional genomics) (DeRisi, 1997). Recent advances in mass spectrometry and proteomics have made these fields high-throughput as well. Bioinformatics is an essential part of drug discovery, pharmacology, biotechnology, genetic engineering and a wide variety of other biological research areas. In the context of these proceedings, we emphasize that machine learning approaches, such as neural networks, hidden Markov models, or kernel machines, have emerged as effective mathematical methods for analyzing (i.e., classifying, ranking, predicting, estimating and finding regularities in) biological datasets (Baldi, 1998).
The field of bioinformatics has presented challenging problems to the machine learning community, and the algorithms developed in response have resulted in new biological hypotheses. In summary, given the huge amount of information available, a mutually beneficial knowledge feedback has developed between theoretical disciplines and the life sciences. As further reading, we recommend the excellent “Bioinformatics: A Machine Learning Approach” (Baldi, 1998), which gives thorough insight into topics, methods and common problems in bioinformatics. The next section introduces the most important subfields of bioinformatics and computational biology. We then discuss current issues in bioinformatics and what we see as future trends.
Bioinformatics is a wide field covering a broad range of research topics; it can broadly be defined as the management and analysis of data generated by biological research. To understand bioinformatics, it is essential to have at least a basic understanding of biology. The central dogma of molecular biology states that DNA (a string of As, Cs, Gs and Ts) encodes genes, which are transcribed into RNA (comprising As, Cs, Gs and Us), which is then generally translated into proteins (strings of amino acids, also denoted by single-letter codes). The physical structure of these amino acids determines the protein's structure, which determines its function. A range of textbooks containing exhaustive information is available from the NCBI’s website (http://www.ncbi.nlm.nih.gov/).
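The flow from DNA to RNA to protein described above can be illustrated with a minimal Python sketch. It is a simplification and not a method taken from the works cited here: it assumes the input is the coding strand of a gene, already aligned to the reading frame, and it uses the standard genetic code. The function names `transcribe` and `translate` and the example sequence are hypothetical, chosen only for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the RNA string for a DNA coding strand (T is replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate RNA into single-letter amino acid codes, codon by codon.

    Reading starts at position 0 and stops at the first stop codon.
    Any trailing bases that do not form a full codon are ignored.
    """
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    dna = "ATGGCCTGA"          # hypothetical example sequence
    rna = transcribe(dna)      # "AUGGCCUGA"
    print(rna, translate(rna))
```

Real genes add complications that this sketch ignores, such as the choice of strand and reading frame, introns, and organism-specific variants of the genetic code.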