Human-Assisted Global Alignment (HAGA)
HAGA is a computer program written in Java that helps biologists and other genomic researchers with the complicated process of genome alignment. Most global alignment algorithms are akin to edit-distance algorithms: they score each nucleotide pairing and choose the alignment with the best total score. HAGA lets the user intervene in the alignment by hand-labeling specific groups of nucleotides. HAGA observes these labels and learns from them to perform better alignments in the future.
This program was a class project at MIT for "Introduction to Computational Biology" in the fall of 2007. The authors are Shanon Iyo and Robert Toscano.
The architecture is divided into three modular components. The user interface allows a human to view alignments and define labels on the sequences. The labeler is responsible for learning the labels specified by the user and applying them to unlabeled sequence regions. Our labeler implementation uses an HMM. The aligner performs global sequence alignment on two labeled sequences. Our system uses the Needleman-Wunsch algorithm, modified to favor aligning bases that have the same label. HAGA is designed so that any of these components may be interchanged, as long as the replacement follows the same interface. For example, human label definitions could be replaced or augmented with an mRNA recognizer, or the HMM labeler implementation could be replaced with a neural network.
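To illustrate what "modified to favor bases with the same label" can mean, here is a minimal sketch of Needleman-Wunsch in which a pair of bases earns a bonus when their labels agree. The class name, the one-character-per-base label encoding, and all score values are assumptions made for this example; they are not taken from the HAGA source.

```java
/** Hypothetical sketch of Needleman-Wunsch with a bonus for equal labels. */
final class LabelAwareAligner {
    // Illustrative scores only; these are assumptions, not HAGA's parameters.
    static final int MATCH = 1, MISMATCH = -1, GAP = -2, SAME_LABEL_BONUS = 1;

    /** Score for pairing base a (label la) with base b (label lb). */
    static int pairScore(char a, char la, char b, char lb) {
        int s = (a == b) ? MATCH : MISMATCH;
        return (la == lb) ? s + SAME_LABEL_BONUS : s;
    }

    /** Fills the table; f[i][j] is the best score for x[0..i) against y[0..j). */
    static int[][] fill(String x, String lx, String y, String ly) {
        if (x.length() != lx.length() || y.length() != ly.length()) {
            throw new IllegalArgumentException("each base needs exactly one label");
        }
        int n = x.length(), m = y.length();
        int[][] f = new int[n + 1][m + 1];
        for (int i = 1; i <= n; i++) f[i][0] = i * GAP;  // x prefix against gaps
        for (int j = 1; j <= m; j++) f[0][j] = j * GAP;  // y prefix against gaps
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int diag = f[i - 1][j - 1] + pairScore(
                        x.charAt(i - 1), lx.charAt(i - 1), y.charAt(j - 1), ly.charAt(j - 1));
                int gapped = Math.max(f[i - 1][j], f[i][j - 1]) + GAP;
                f[i][j] = Math.max(diag, gapped);
            }
        }
        return f;
    }

    /** Optimal global alignment score. */
    static int score(String x, String lx, String y, String ly) {
        return fill(x, lx, y, ly)[x.length()][y.length()];
    }

    /** Returns the two gapped sequences of one optimal global alignment. */
    static String[] align(String x, String lx, String y, String ly) {
        int[][] f = fill(x, lx, y, ly);
        StringBuilder ax = new StringBuilder(), ay = new StringBuilder();
        int i = x.length(), j = y.length();
        // Trace back from the bottom-right cell, preferring the diagonal move.
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && f[i][j] == f[i - 1][j - 1] + pairScore(
                    x.charAt(i - 1), lx.charAt(i - 1), y.charAt(j - 1), ly.charAt(j - 1))) {
                ax.append(x.charAt(--i));
                ay.append(y.charAt(--j));
            } else if (i > 0 && f[i][j] == f[i - 1][j] + GAP) {
                ax.append(x.charAt(--i));
                ay.append('-');
            } else {
                ax.append('-');
                ay.append(y.charAt(--j));
            }
        }
        return new String[] {ax.reverse().toString(), ay.reverse().toString()};
    }
}
```

With these example scores, `LabelAwareAligner.align("AA", "EI", "A", "I")` returns `AA` and `-A`: both placements of the single `A` match on the base, so the label bonus decides which one is chosen.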
The alignment process follows the steps below:
  1. User (or other pattern recognizer) applies labels to sequences
  2. Labeler is trained on defined labels
  3. Labeler applies labels to unlabeled sequence regions
  4. Aligner performs alignment on the labeled sequences
  5. User interface is updated to display the new alignment
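The steps above can be sketched as a small pipeline over interchangeable components. Every name here (`LabeledSequence`, `Labeler`, `Aligner`, `AlignmentView`, `AlignmentPipeline`) is hypothetical, as is the use of `?` to mark an unlabeled base. The stand-in labeler simply assigns each unlabeled base the label most often seen for that base in training; it is a placeholder for illustration, not the HMM that HAGA uses.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** A sequence with one label character per base (hypothetical encoding). */
final class LabeledSequence {
    static final char UNLABELED = '?';
    final String bases;
    final String labels;

    LabeledSequence(String bases, String labels) {
        if (bases.length() != labels.length()) {
            throw new IllegalArgumentException("each base needs exactly one label");
        }
        this.bases = bases;
        this.labels = labels;
    }
}

interface Labeler {
    void train(List<LabeledSequence> examples);   // step 2
    LabeledSequence label(LabeledSequence seq);    // step 3
}

interface Aligner {
    String[] align(LabeledSequence a, LabeledSequence b);  // step 4
}

interface AlignmentView {
    void show(String[] alignment);                 // step 5
}

/** Placeholder labeler (not an HMM): most frequent label per base character. */
final class MostFrequentLabelLabeler implements Labeler {
    private final Map<Character, Map<Character, Integer>> counts = new HashMap<>();

    public void train(List<LabeledSequence> examples) {
        for (LabeledSequence s : examples) {
            for (int i = 0; i < s.bases.length(); i++) {
                char label = s.labels.charAt(i);
                if (label == LabeledSequence.UNLABELED) continue;
                counts.computeIfAbsent(s.bases.charAt(i), k -> new HashMap<>())
                      .merge(label, 1, Integer::sum);
            }
        }
    }

    public LabeledSequence label(LabeledSequence s) {
        StringBuilder out = new StringBuilder(s.labels);
        for (int i = 0; i < out.length(); i++) {
            if (out.charAt(i) != LabeledSequence.UNLABELED) continue;  // keep user labels
            Map<Character, Integer> seen = counts.get(s.bases.charAt(i));
            if (seen == null) continue;  // base never seen with a label: leave as is
            char best = LabeledSequence.UNLABELED;
            int bestCount = 0;
            for (Map.Entry<Character, Integer> e : seen.entrySet()) {
                if (e.getValue() > bestCount) {
                    best = e.getKey();
                    bestCount = e.getValue();
                }
            }
            out.setCharAt(i, best);
        }
        return new LabeledSequence(s.bases, out.toString());
    }
}

/** Runs steps 2 to 5 on two sequences the user has already labeled (step 1). */
final class AlignmentPipeline {
    private final Labeler labeler;
    private final Aligner aligner;
    private final AlignmentView view;

    AlignmentPipeline(Labeler labeler, Aligner aligner, AlignmentView view) {
        this.labeler = labeler;
        this.aligner = aligner;
        this.view = view;
    }

    String[] run(LabeledSequence a, LabeledSequence b) {
        labeler.train(Arrays.asList(a, b));
        String[] alignment = aligner.align(labeler.label(a), labeler.label(b));
        view.show(alignment);
        return alignment;
    }
}
```

Because the pipeline depends only on the three interfaces, any component can be swapped without touching the others, which is the property the architecture description calls for.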
a description of what we proposed to build for our class project (PDF)
a complete description of HAGA: its design and implementation, as well as some evaluation results from aligning real genomic data (PDF)
a concise presentation of HAGA (OpenOffice)
an executable HAGA program (JAR)
the source code of HAGA (GitHub)
An annotated image of the HAGA user interface, showing its different components.
An example of the HAGA user interface showing a sequence before global alignment.
An example of the HAGA user interface showing a sequence after global alignment with HAGA.
