Selecting Probes for DNA Arrays
A problem that arises frequently in the design of DNA chips is the selection of target-specific probes. Given a set of genomic sequences, the task is to find at least one short subsequence for each target sequence in the set. That subsequence will then be attached to the chip, and it must be chosen such that it hybridizes only to the intended target. The problem is related to primer design for PCR experiments, where minimizing cross-hybridization errors is an important issue as well, although somewhat different constraints apply there.

The problem becomes more complicated if probes are to be chosen such that they hybridize properly to every member of a group (e.g. a complete family of organisms), but not to any organism from a different group.

Traditionally, this task has been tackled manually by visually checking the output of sequence alignments. Several algorithms have been published to automate the process, most of which restrict themselves to pure string comparisons. Manual inspection of alignments is very time consuming, and, worse yet, the results of string algorithms may be unreliable because they take no thermodynamic considerations into account.
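
As an illustration of the pure string-comparison approach mentioned above, the following minimal Python sketch lists, for each target, the substrings of length k that occur in no other target. The function name unique_kmers, the toy sequences and the use of exact matching are assumptions made for this example only. It is not the algorithm presented here: it ignores near matches, reverse complements and thermodynamics, which is why candidates found this way may still cross-hybridize.

```python
def unique_kmers(targets, k):
    """Map each target name to the sorted k-mers found in no other target.

    targets: dict of name -> sequence string (hypothetical input format).
    Only exact string matches are considered.
    """
    # Collect the set of k-mers of every target sequence.
    kmers = {
        name: {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for name, seq in targets.items()
    }
    result = {}
    for name, own in kmers.items():
        # Union of the k-mers of all other targets.
        others = set()
        for other_name, other_kmers in kmers.items():
            if other_name != name:
                others |= other_kmers
        result[name] = sorted(own - others)
    return result


if __name__ == "__main__":
    toy = {"a": "ACGTAC", "b": "CGTACG"}  # made-up sequences
    print(unique_kmers(toy, 4))
```

A candidate that passes this filter is unique only as a string. Whether it binds stably to its own target, and weakly enough to all others, still has to be judged with a thermodynamic model.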

We present an efficient algorithm for the array probe selection problem. Melting temperatures and free energies for each possible sequence-probe interaction are calculated using an extended nearest-neighbor model that allows for mismatches. Model parameters can easily be replaced when new values are published, and parameter sets for either DNA or RNA assays can be used. Hairpins, bulge loops and other secondary structures can also be considered.
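
To give a feel for the basic nearest-neighbor calculation, the Python sketch below sums enthalpy and entropy contributions over adjacent base pairs of a perfectly matched DNA duplex and derives a melting temperature and a free energy from them. It is a simplified illustration only: it does not handle mismatches, salt correction, self-complementary probes or secondary structure. The parameter values are the unified DNA set as reported by SantaLucia (1998) and should be checked against the original table before any real use. The function names, the strand concentration of 1 µM and the example probes are assumptions, and nothing here is taken from the implementation described above.

```python
import math

R = 1.987  # gas constant in cal/(mol K)

# (dH in kcal/mol, dS in cal/(mol K)) per 5'->3' dinucleotide step.
# Assumed values: unified DNA parameters (SantaLucia 1998); verify before use.
NN = {
    "AA": (-7.9, -22.2), "TT": (-7.9, -22.2),
    "AT": (-7.2, -20.4),
    "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "TG": (-8.5, -22.7),
    "GT": (-8.4, -22.4), "AC": (-8.4, -22.4),
    "CT": (-7.8, -21.0), "AG": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "TC": (-8.2, -22.2),
    "CG": (-10.6, -27.2),
    "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9), "CC": (-8.0, -19.9),
}
INIT_GC = (0.1, -2.8)  # initiation term for a terminal G/C pair
INIT_AT = (2.3, 4.1)   # initiation term for a terminal A/T pair


def duplex_thermo(probe):
    """Return (dH, dS) for the probe paired with its perfect complement."""
    probe = probe.upper()
    dh = ds = 0.0
    # One initiation term per duplex end.
    for end in (probe[0], probe[-1]):
        h, s = INIT_AT if end in "AT" else INIT_GC
        dh += h
        ds += s
    # One stacking term per pair of adjacent bases.
    for i in range(len(probe) - 1):
        h, s = NN[probe[i:i + 2]]
        dh += h
        ds += s
    return dh, ds


def melting_temp(probe, conc=1e-6):
    """Melting temperature in deg C for a non-self-complementary duplex.

    conc is the total strand concentration in mol/l (assumed 1 uM).
    """
    dh, ds = duplex_thermo(probe)
    return 1000.0 * dh / (ds + R * math.log(conc / 4.0)) - 273.15


def free_energy(probe, temp_c=37.0):
    """Free energy dG = dH - T*dS in kcal/mol at temp_c deg C."""
    dh, ds = duplex_thermo(probe)
    return dh - (temp_c + 273.15) * ds / 1000.0


if __name__ == "__main__":
    for p in ("GGCCGGACCGGCAGGCCGTC", "AATTAAATTTAAATATTAAT"):
        print(p, round(melting_temp(p), 1), round(free_energy(p), 1))
```

Keeping the parameters in a plain lookup table mirrors the point made above: switching to a newer parameter set, or to RNA values, only requires replacing the table.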

The algorithm can easily be modified to select family-specific probes.

Data:
The following files contain sample data for the HIV-1 subtypes referred to in the published paper.
HIV1.FASTA: FASTA-format file containing the HIV-1 subtypes used.
HIV1.PRB: Oligo probes selected by the algorithm.

Contact:
Lars Kaderali
kaderali@zpr.uni-koeln.de

Download
GCB 2000 Poster (PS GZIP)
GCB 2000 Poster Abstract (PS)
Diploma Thesis (Fulltext, PS GZIP)
Paper (PDF), published in Bioinformatics 2002; 18(10):1340-9, (c) Oxford University Press.
ISMB 2001 Poster (EPS)