Peptides fulfill many tasks in controlling and regulating cellular functions and are key molecules in systems biology. Science and industry therefore have a strong demand for fast methods of finding innovative peptide structures. In this chapter we combine a computer-guided search for novel peptides in sequence space with their experimental biological validation. The computer-based search uses an evolutionary algorithm with artificial neural networks as the fitness function and a mutation operator called the PepHarvester. Optimization runs for 100 iterations. This system, called DARWINIZER, is applied to the de novo design of neutralizing peptides against autoantibodies from DCM (dilatative cardiomyopathy) patients. A second approach optimizes peptide sequences by an ant colony optimization process. This biologically oriented system identified several novel weak-binding T-cell epitopes.
I highlight novel techniques for molecule design, especially peptide design, and for molecular feature extraction, which can be applied when three-dimensional molecular structures are not available. A necessary prerequisite for any rational attempt to identify or even design molecules with a desired property or activity is an accurate model of the underlying sequence-(structure)-activity relationship (SAR). Such SAR models serve as guidelines in the search for novel and optimized compounds in evolutionary design cycles, which have become possible due to advances in both compound generation and screening technology. The quality of the model determines the success rate of this multi-dimensional design process: rational molecular design can succeed only if a relevant SAR model is used (Wrede, Schneider 1994; Schneider, Soo 2003; Wrede, Filter 2006).
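The evolutionary design cycle outlined above can be illustrated with a minimal Python sketch. It is not the DARWINIZER implementation: `toy_fitness` is a hypothetical stand-in for a trained SAR model (for example a neural network), the mutation operator is a plain random point mutation, and the names `evolve`, `mutate`, and the `offspring` parameter are assumptions made for illustration. Only the default of 100 iterations is taken from the text.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_fitness(peptide, target="ACDEFGHIK"):
    """Hypothetical stand-in for a SAR model: fraction of positions matching a target."""
    return sum(a == b for a, b in zip(peptide, target)) / len(target)


def mutate(peptide, rng):
    """Replace one randomly chosen position with a randomly chosen amino acid."""
    pos = rng.randrange(len(peptide))
    return peptide[:pos] + rng.choice(AMINO_ACIDS) + peptide[pos + 1:]


def evolve(seed, fitness, iterations=100, offspring=50, rng=None):
    """Run a simple mutate-and-select cycle; the parent is kept unless a child is at least as fit."""
    rng = rng or random.Random(0)
    parent, best = seed, fitness(seed)
    for _ in range(iterations):
        children = [mutate(parent, rng) for _ in range(offspring)]
        candidate = max(children, key=fitness)
        score = fitness(candidate)
        if score >= best:  # elitist selection: fitness never decreases
            parent, best = candidate, score
    return parent, best


if __name__ == "__main__":
    peptide, score = evolve("AAAAAAAAA", toy_fitness)
    print(peptide, round(score, 2))
```

In a real design cycle the fitness function would be replaced by the trained SAR model, and the best sequences would go on to experimental validation rather than being accepted on the model score alone.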
Key Terms in this Chapter
Ant Colony Optimization: Stochastic optimisation procedure imitating the foraging behaviour of ants. The method allows the path through sequence space to be visualized.
Pattern Recognition: Process of classifying patterns according to common features. Feature extraction is therefore an ultimate prerequisite for the process of pattern recognition.
Autoimmune Disease: Disease caused by adaptive immune responses to self antigens.
Focused Libraries: Enriched peptide or small-molecule libraries. On average, a focused library contains significantly more active molecules than a randomly picked subset of the same size.
DCM Dilatative Cardiomyopathy: Severe heart disease; here, the autoimmune form in which autoantibodies directed against the β-adrenergic receptor lead to permanent stimulation of the heart rate.
Sammon Mapping: Non-linear mapping procedure that approximates local geometric relationships in a low-dimensional space (Böhm, Schneider, 2000).
ELISA Enzyme Linked Immunosorbent Assay: Serological assay in which an antigen is detected by an enzyme-linked antibody that converts a colourless substrate into a coloured product.
Feature Extraction: Process of reducing data by measuring certain properties or features. These features are used in a classifier.
Amino Acid Distance Matrix: Matrix of the Euclidean distances between all 20 amino acids, calculated from their physicochemical properties and their genetic coding.
PepHarvester: Algorithm to generate a focused library starting from a single seed peptide.
SAR: Structure (Sequence) Activity Relationship, used here in the sense of amino acid sequence-activity relationship. This relationship is approximated by a stochastic procedure such as an artificial neural network.
DARWINIZER: Computer-based simulation of a molecular evolution cycle for the optimisation of peptide sequences. It combines an artificial fitness function (e.g. a trained artificial neural network), which selects the best mutated offspring sequences, with a mutation operator that works like the PepHarvester algorithm.
MHC I: Major Histocompatibility Complex. General name for membrane-bound, highly polymorphic glycoproteins that present peptide antigens to T-cells. They are also known as histocompatibility antigens (Janeway et al., 2001).
De Novo Design Cycle: Building novel molecules with a given function starting from a model. A model can be the specific knowledge about receptor ligand interaction.
PCA: Principal Component Analysis. Technique seeking a projection that best represents the data. The new coordinates are linear combinations of the original descriptor axes and are often treated as factors (principal components) (Schneider, Baringhaus, 2008).
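The terms Amino Acid Distance Matrix, PepHarvester, and Focused Libraries can be connected in a small Python sketch. It is a sketch under stated assumptions, not the published algorithm: the descriptors are limited to the Kyte-Doolittle hydropathy index and a simplified side-chain charge (the genetic-coding component mentioned above is omitted), and the Gaussian weighting with width `sigma` as well as the function names `distance_matrix` and `focused_library` are hypothetical choices for illustration.

```python
import math
import random

# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified side-chain charge near neutral pH (assumption: all others are 0).
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def descriptor(aa):
    """Two-dimensional property vector used for the illustrative distance."""
    return (HYDROPATHY[aa], CHARGE.get(aa, 0.0))


def distance_matrix():
    """Euclidean distances between all 20 amino acids in descriptor space."""
    aas = sorted(HYDROPATHY)
    return {a: {b: math.dist(descriptor(a), descriptor(b)) for b in aas} for a in aas}


def focused_library(seed, dist, size=20, sigma=1.0, rng=None, max_attempts=10000):
    """Collect single-point mutants of `seed`, favouring physicochemically similar residues."""
    rng = rng or random.Random(0)
    library = set()
    for _ in range(max_attempts):
        if len(library) >= size:
            break
        pos = rng.randrange(len(seed))
        old = seed[pos]
        candidates = [a for a in dist if a != old]
        # Gaussian weighting: small distances give high substitution probability.
        weights = [math.exp(-dist[old][a] ** 2 / (2 * sigma ** 2)) for a in candidates]
        new = rng.choices(candidates, weights=weights, k=1)[0]
        library.add(seed[:pos] + new + seed[pos + 1:])
    return sorted(library)


if __name__ == "__main__":
    dist = distance_matrix()
    for peptide in focused_library("ARWHDGLK", dist, size=5):
        print(peptide)
```

A smaller `sigma` keeps the library closer to the seed peptide, while a larger value approaches uniform random substitution and therefore a less focused library.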