Mapping Affymetrix Microarray Probes to the Rat Genome via a Persistent Index

Susan Fairley, John D. McClure, Neil Hanlon, Rob Irving, Martin W. McBride, Anna F. Dominiczak, Ela Hunt
Copyright: © 2010 | Pages: 18
DOI: 10.4018/jkdb.2010100204

A probe mapping technique based on a novel implementation of a persistent q-gram index was developed. It is guaranteed to find all matches that meet a chosen match definition. The supported definitions are exact matching of the central 19 bases of a 25-base probe, matching of the central 19 bases with at most one or at most three mismatches, and exact matching of any 16 bases. In comparison with BLAST and BLAT, the new methods were either significantly faster or identified matches that those heuristics missed. The 16 bp method was used to map the 342,410 perfect match probes from the Affymetrix GeneChip Rat Genome 230 2.0 Array to the genome. Compared with the mapping from Ensembl, the new mapping included over seven million novel matches, providing additional evidence for researchers who wish to investigate further the sources of signals measured in microarray experiments. The results demonstrate the practicality of the index, which could support other q-gram based algorithms.
Article Preview


This paper addresses two things. The first is a novel implementation of a persistent q-gram index using a relational database management system (RDBMS). This implementation allows a greater value of q than has previously been reported. It also demonstrates the use of the RDBMS as a means of accessing the information stored on disk; disk access is the traditional bottleneck in the use of persistent indices.
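To make the idea of a q-gram index held in an RDBMS concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes SQLite (through Python's standard sqlite3 module) as a stand-in database, and the table name qgram, the column names, and the functions build_qgram_index and lookup are all hypothetical names chosen for illustration.

```python
import sqlite3


def build_qgram_index(conn, seq_name, sequence, q):
    """Store every overlapping q-gram of `sequence` with its start position.

    Hypothetical schema for illustration only; the paper's schema is not
    given in this preview.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS qgram ("
        "gram TEXT NOT NULL, seq_name TEXT NOT NULL, pos INTEGER NOT NULL)"
    )
    # One row per q-gram; positions are 0-based offsets into the sequence.
    rows = (
        (sequence[i:i + q], seq_name, i)
        for i in range(len(sequence) - q + 1)
    )
    conn.executemany("INSERT INTO qgram VALUES (?, ?, ?)", rows)
    # The database index on `gram` is what makes lookups cheap on disk.
    conn.execute("CREATE INDEX IF NOT EXISTS qgram_gram ON qgram (gram)")
    conn.commit()


def lookup(conn, gram):
    """Return all (seq_name, pos) pairs at which `gram` occurs exactly."""
    cur = conn.execute(
        "SELECT seq_name, pos FROM qgram WHERE gram = ? "
        "ORDER BY seq_name, pos",
        (gram,),
    )
    return cur.fetchall()
```

Passing a file path instead of ":memory:" to sqlite3.connect would keep the table on disk between runs, which is the sense in which such an index is persistent. A genome-scale index would need a far more compact encoding than one text row per q-gram; the sketch only shows the access pattern.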

Second, the index is applied to the task of mapping Affymetrix microarray probes, which demonstrates the practical use of the index implementation and provides a data set useful to those working with the RAE 230 microarray. The advantage of our mapping approach is that it provides computational guarantees about the returned matches, making it possible to be certain that all matches meeting the chosen definition have been returned. Further, it is demonstrated that this can be achieved in reasonable time. The problems with the mappings provided by Affymetrix and the benefits of re-mapping are described below.
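One standard way to obtain this kind of guarantee for mismatch-tolerant matching is the pigeonhole argument: if a 19-base core matches with at most k mismatches, then at least one of k + 1 disjoint pieces of the core must match exactly. The sketch below illustrates that general technique under stated assumptions and is not presented as the authors' algorithm. The function names are hypothetical, and Python's str.find stands in for the exact-match lookup that an index would provide.

```python
def central_core(probe, core_len=19):
    """Return the central `core_len` bases of a probe (bases 3..21 of 25)."""
    start = (len(probe) - core_len) // 2
    return probe[start:start + core_len]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_core_matches(genome, probe, max_mismatches, core_len=19):
    """Find every genome position where the probe core matches with at most
    `max_mismatches` mismatches, using a pigeonhole filter plus verification.
    """
    core = central_core(probe, core_len)
    n_pieces = max_mismatches + 1
    base, extra = divmod(len(core), n_pieces)
    hits = set()
    piece_start = 0
    for i in range(n_pieces):
        piece_len = base + (1 if i < extra else 0)
        piece = core[piece_start:piece_start + piece_len]
        # Exact occurrences of this piece (str.find replaces an index lookup).
        pos = genome.find(piece)
        while pos != -1:
            cand = pos - piece_start  # implied start of the whole core
            if 0 <= cand <= len(genome) - len(core):
                # Verify the full core so no false positive is reported.
                window = genome[cand:cand + len(core)]
                if hamming(window, core) <= max_mismatches:
                    hits.add(cand)
            pos = genome.find(piece, pos + 1)
        piece_start += piece_len
    return sorted(hits)
```

Because every true match must contain at least one exactly matching piece, the filter cannot miss a match, and the verification step removes candidates that exceed the mismatch limit. This is what distinguishes a guaranteed method from a heuristic that may skip some hits.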
