ABBA
ABBA: Assembly Boosted By Amino acid sequences
Overview
ABBA (Assembly Boosted By Amino acid sequences) is a comparative gene assembler that uses amino acid sequences from predicted proteins to help build a better assembly. See the journal paper listed under References.
For additional information on short read assembly, see the University of Maryland CBCB web sites.
Download
NOTE: ABBA performs the protein-guided assembly but does not find the reference proteins itself. You will need to identify the proteins that run off the ends of contigs separately, then pass those proteins to ABBA to fill in the gaps.
There are two ways to find the reference proteins:
- Do a draft annotation of the genome using an annotation pipeline. ABBA will not annotate your assembly.
- Align the draft assembly contigs to a close relative and find where the contig ends intersect protein coding regions.
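The second approach above can be sketched in a few lines of Python. This is only an illustration and not part of ABBA or AMOS: it assumes you have already aligned your contigs to the relative with an aligner of your choice and have the relative's protein-coding gene coordinates in hand. The `Interval` type, the function name `genes_spanning_contig_ends`, and all coordinates below are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A named span on the reference genome (hypothetical helper type)."""
    name: str
    start: int  # 0-based, inclusive, in reference coordinates
    end: int    # exclusive


def genes_spanning_contig_ends(
    contigs: List[Interval], genes: List[Interval]
) -> List[Tuple[str, str, str]]:
    """Return (gene, contig, end) for every gene that a contig end falls inside.

    A gene "runs off" a contig end when the contig boundary lies strictly
    between the gene's start and end on the reference.
    """
    hits = []
    for contig in contigs:
        for gene in genes:
            # Left boundary of the aligned contig lies inside the gene.
            if gene.start < contig.start < gene.end:
                hits.append((gene.name, contig.name, "left"))
            # Right boundary of the aligned contig lies inside the gene.
            if gene.start < contig.end < gene.end:
                hits.append((gene.name, contig.name, "right"))
    return hits


# Made-up alignment positions of two contigs on the relative's genome.
contigs = [Interval("c1", 0, 1000), Interval("c2", 1200, 2500)]
# Made-up protein-coding regions annotated on the relative.
genes = [
    Interval("geneA", 100, 400),    # fully inside c1, not reported
    Interval("geneB", 900, 1300),   # bridges the gap between c1 and c2
    Interval("geneC", 2400, 2700),  # runs off the right end of c2
]

candidates = genes_spanning_contig_ends(contigs, genes)
for gene, contig, end in candidates:
    print(f"{gene} crosses the {end} end of {contig}")
```

The proteins encoded by the reported genes (here geneB and geneC) are the candidates you would then pass to ABBA. For large genomes the nested loop could be replaced by a sorted sweep or an interval tree, but the simple version is enough to show the idea.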
ABBA is built on top of the AMOS framework but has its own distribution. The AMOS framework is included in the ABBA tarball and will be installed along with ABBA if you don't already have AMOS installed. The tarball is available here: ftp://ftp.cbcb.umd.edu/pub/data/dsommer/abba.tgz
References
Gene-Boosted Assembly of a Novel Bacterial Genome from Very Short Reads.
Salzberg SL, Sommer DD, Puiu D, Lee VT (2008). PLoS Computational Biology 4(9): e1000186. doi:10.1371/journal.pcbi.1000186
Acknowledgements
The development of ABBA was supported by the National Institutes of Health under grants R01-LM06845 and R01-LM007938 to SLS.