Introduction
Biomolecular computing and programming have highlighted the importance of a deep theoretical understanding of the thermodynamics of hybridization for a variety of applications, such as self-assembly (Qian & Winfree, 2011; Seeman, 2006), natural language processing (Garzon et al., 2009; Neel et al., 2006; Bobba et al., 2006), DNA-based memories (Neel & Garzon, 2008) and, more recently, biological phylogenies based purely on whole-genomic DNA (Garzon & Wong, 2011). The primary and critical tool in all these applications is an appropriate ensemble of DNA molecules that encode inputs to computational problems or serve as building blocks for nanostructures, chosen to guarantee that the desired reactions take place as intended despite the tendency of DNA molecules to form other structures due to the uncertainty and variability inherent in hybridization affinity. This Codeword Design problem has seen some progress in the last decade in at least two subareas. The first is searching for and/or building such DNA code sets (Garzon et al., 2009; Deaton et al., 2006; Tulpan et al., 2005; Chen et al., 2006), where the size of feasible computational problems or self-assembled nanostructures is usually directly related to the largest ensemble of DNA molecules that satisfies a given set of crosshybridization and noncrosshybridization constraints. The second, and perhaps more important, area deals with developing an appropriate theoretical framework to understand and analyze this type of problem, organize knowledge about the subject in a systematic manner, and explore the power and limitations of biomolecules at large.