
|
This page contains the source code and databases needed to perform word-count annotations of large DNA sequences, such as the human genome, using an exact pattern-matching algorithm developed by John Healy. The term “Mer-Engine” refers collectively to the search algorithm, its implementation, and the data structures that embody the search space. The paper “Annotating Large Genomes With Exact Word Matches”, which details the algorithm and some of its applications, appears in the October 2003 issue of Genome Research. This site is intended as a companion to the paper and as a resource for anyone interested in applying the Mer-Engine to their own work. The Source Code area of the site provides a C++ implementation of the search algorithm for download. In addition, Perl code written by Elizabeth Thomas will be provided for displaying word-count annotations graphically as a histogram. The Mer Databases section supplies pre-processed Mer-Engine databases for several genomes. |