PASS2: A Database of Structure-Based Sequence Alignments of Protein Structural Domain Superfamilies


Karuppiah Kanagarajadurai, Singaravelu Kalaimathy, Paramasivam Nagarajan, Ramanathan Sowdhamini
Copyright: © 2011 |Pages: 14
DOI: 10.4018/jkdb.2011100104

Abstract

A detailed comparison of protein domains that belong to families and superfamilies shows that structure is better conserved than sequence during evolutionary divergence. Sequence alignments guided by structural features permit better sampling of protein sequence space and effective construction of libraries for fold recognition. Such alignments are also useful evolutionary models for defining structure-function relationships in protein superfamilies. The PASS2 database, maintained by the authors, presents alignments of proteins that are related at the superfamily level and characterised by low sequence similarity. The number of new superfamilies increased to 47% compared with the previous PASS2 version, which underscores the importance of updating the database. In the current release, the authors align protein superfamilies using a structural alignment protocol. They also introduce two alignment assessment methods, one based on the average structural deviation of domains and the other on the extent of conserved secondary structures. In addition, they integrate new structural and sequence features at the superfamily level into the database: conserved and unconserved blocks in proteins, the spatial distribution of sequences obtained by principal component analysis, and a statistical view of each superfamily. The authors suggest that highly structurally deviant superfamily members could be removed as outliers, so that such extremely distant relationships do not obscure the alignment. They report a nearly automated, updated version of the superfamily alignment database, consisting of 1776 superfamilies and 9536 protein domains, that is in direct correspondence with the SCOP (1.73) database.

Materials And Methods

Protein Structural Domain Dataset

Information about protein structural domains and their boundaries was obtained from the SCOP 1.73 release (Andreeva et al., 2008), and the corresponding structural coordinates for domains sharing ≤40% sequence identity within each superfamily were downloaded from the ASTRAL compendium (Chandonia et al., 2004). The current structural database was constructed as in the previous PASS2 version (Bhaduri, Pugalenthi, & Sowdhamini, 2004), with some modifications such as the inclusion of alignment assessment (Figure 1). In this update, we have categorized superfamilies according to their number of structural entries as single-member superfamilies (SMS), two-member superfamilies (TMS) and multi-member superfamilies (MMS).
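The SMS/TMS/MMS categorization can be illustrated with a minimal sketch. It is not the PASS2 implementation: the function names, the input mapping and the identifiers in the example are hypothetical, and the only rule taken from the text is that the category follows from the number of structural entries in a superfamily.

```python
from collections import Counter


def categorize_superfamily(n_members: int) -> str:
    """Return the category label for a superfamily with n_members entries."""
    if n_members < 1:
        raise ValueError("a superfamily needs at least one structural entry")
    if n_members == 1:
        return "SMS"  # single-member superfamily
    if n_members == 2:
        return "TMS"  # two-member superfamily
    return "MMS"      # multi-member superfamily


def categorize_all(domain_to_superfamily: dict[str, str]) -> dict[str, str]:
    """Map each superfamily label to its category.

    domain_to_superfamily is a hypothetical input: one key per domain
    (after the <=40% identity filtering), valued by its superfamily label.
    """
    counts = Counter(domain_to_superfamily.values())
    return {sf: categorize_superfamily(n) for sf, n in counts.items()}


# Placeholder identifiers, not real SCOP entries.
example = {
    "dom1": "sf_A",
    "dom2": "sf_B", "dom3": "sf_B",
    "dom4": "sf_C", "dom5": "sf_C", "dom6": "sf_C",
}
categories = categorize_all(example)
```

Counting members after the identity filter, rather than before, matches the order in which the text describes the dataset; that ordering is an assumption of this sketch.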

Figure 1.

Flowchart for rigorous structure-based sequence alignment of distantly related proteins


The flowchart describes three phases of the algorithm: the initial alignment phase, the final alignment phase and the alignment assessment phase. In the initial alignment phase, the initial alignment of two-member superfamilies is built with programs such as MINRMS, ClustalW or MALIGN, and that of multi-member superfamilies with programs such as STAMP. In the final alignment phase, the final alignment is derived from the equivalences of the initial alignment using the JOY and COMPARER packages. In the alignment assessment phase, the final alignment is assessed for the extent of structural deviations and secondary structural equivalences. Various features were provided, and the new features have been marked in pink.
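The three phases described in the caption can be summarised as a small planning table. The sketch below only records which programs the caption names for each phase; it does not run any of them, and the names `Phase` and `plan_phases` are hypothetical. The caption does not say how single-member superfamilies are handled, so returning no phases for SMS is an assumption made here because a single entry has nothing to align against.

```python
from dataclasses import dataclass

# Candidate programs per category, as named in the Figure 1 caption.
INITIAL_ALIGNERS = {
    "TMS": ("MINRMS", "ClustalW", "MALIGN"),
    "MMS": ("STAMP",),
}
FINAL_ALIGNERS = ("JOY", "COMPARER")
ASSESSMENTS = ("structural deviations", "secondary structural equivalences")


@dataclass(frozen=True)
class Phase:
    name: str
    items: tuple[str, ...]  # programs or assessment criteria for this phase


def plan_phases(category: str) -> list[Phase]:
    """Return the ordered phases for a superfamily category."""
    if category == "SMS":
        return []  # assumption: nothing to align for a single member
    if category not in INITIAL_ALIGNERS:
        raise ValueError(f"unknown category: {category!r}")
    return [
        Phase("initial alignment", INITIAL_ALIGNERS[category]),
        Phase("final alignment", FINAL_ALIGNERS),
        Phase("alignment assessment", ASSESSMENTS),
    ]


for phase in plan_phases("MMS"):
    print(f"{phase.name}: {', '.join(phase.items)}")
```

Keeping the plan as data makes the difference between categories explicit: only the initial alignment phase varies, while the final alignment and assessment phases are shared.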
