Bambus2

From AMOS WIKI
Revision as of 19:16, 7 April 2014 by Decaos (Talk | contribs)

Sergey Koren and Mihai Pop

Scaffolding is the task of ordering and orienting contigs by incorporating additional information about their relative placement along the genome. The original Bambus package was the first general-purpose scaffolder made available as an open source package. We are happy to announce the arrival of Bambus 2.0, the second generation of the Bambus scaffolder, also available as an open source package. While most other scaffolders are closely tied to a specific assembly program, Bambus accepts the output of most current assemblers and gives the user great flexibility in choosing the scaffolding parameters. In particular, Bambus can accept contig linking data other than mate pairs. Such sources of information include alignments to a reference genome (Bambus can directly use the output of MUMmer), physical mapping data, and information about gene synteny.

Getting data into Bambus 2 requires that you convert your assembly to AMOS format. Here is my recipe:

toAmos \
 -s my.fa \
 -c my.contig \
 -m my.mates \
 -o my.afg

The .fa must list the contigs that appear in the GDE-like contig file (annoying but true). The sequences in the .fa do not need to be accurate; they only need to make the format valid. The .contig and .mates files are in the formats Bambus expects.
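If you have no real sequences to hand, a placeholder .fa can be generated from the contig file. The following is only a sketch: it assumes each contig record in my.contig starts with a header line of the form `##name ...` (as in TIGR-style contig files), and the function name `placeholder_fasta` and the single-base `N` sequence are hypothetical choices. Whether toAmos accepts a one-base placeholder has not been verified here, so check the result before relying on it.

```shell
#!/bin/sh
# placeholder_fasta CONTIG_FILE
# Hypothetical helper: prints one FASTA record per contig header line.
# Assumption: contig headers start with "##" followed directly by the name.
placeholder_fasta() {
    awk '/^##/ {
        name = substr($1, 3)   # strip the leading "##" from the first field
        print ">" name         # FASTA header carrying the contig name
        print "N"              # dummy sequence, not the real consensus
    }' "$1"
}

# Example use (file names as in the recipe above):
#   placeholder_fasta my.contig > my.fa
```

Read lines, which start with a single `#`, are skipped, so only contig names end up in the output.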

The resulting .afg is then 'banked' with:

bank-transact -c \
 -b my.bnk \
 -m my.afg
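The two steps above can be chained in a small wrapper. This is a minimal sketch, not part of AMOS: the function name `prep_bank` and the `DRY_RUN` variable are made up for illustration, and only the toAmos and bank-transact options already shown in the recipe are used. It assumes all three inputs share one prefix (my.fa, my.contig, my.mates).

```shell
#!/bin/sh
# prep_bank PREFIX
# Hypothetical wrapper: converts PREFIX.fa/.contig/.mates to PREFIX.afg,
# then loads PREFIX.afg into a new bank PREFIX.bnk.
# Set DRY_RUN=1 to print the commands instead of running them.
prep_bank() {
    prefix=$1

    # Stop early if any expected input file is missing.
    for ext in fa contig mates; do
        [ -f "$prefix.$ext" ] || {
            echo "missing input: $prefix.$ext" >&2
            return 1
        }
    done

    # ${DRY_RUN:+echo} expands to "echo" only when DRY_RUN is non-empty.
    ${DRY_RUN:+echo} toAmos -s "$prefix.fa" -c "$prefix.contig" \
        -m "$prefix.mates" -o "$prefix.afg" || return 1
    ${DRY_RUN:+echo} bank-transact -c -b "$prefix.bnk" -m "$prefix.afg"
}

# Example use:
#   DRY_RUN=1 prep_bank my    # show the commands
#   prep_bank my              # run them
```

Running it first with DRY_RUN=1 lets you confirm the file names before anything is written.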


Bambus2 is composed of a series of scripts. For instructions on how to use them, see: Bambus 2.0/quick start guide.

The easiest way to run Bambus2 is through metAMOS, available on GitHub.

Alternatively, there is a Python script that runs Bambus2 in a single command:

goBambus2

which prints this usage message:

run: goBambus2 <input reads or contigs or amos bank name> <output prefix> [options]
eg.: goBambus2 example.contigs myoutput --all --contigs
This script is designed to run the Bambus pipeline and takes either reads or contigs plus XML Trace Archive data as input and outputs scaffolds
For further info please contact the Bambus 2 authors: Sergey Koren and Mihai Pop

For example, you could run:

goBambus2 brucella.seq myScaff --all --reads 

More information is available at http://www.cbcb.umd.edu/software/bambus/