Bambus2
Scaffolding is the task of ordering and orienting contigs by incorporating additional information about their relative placement along the genome. The original Bambus package was the first general-purpose scaffolder made available as an open source package. We are happy to announce the arrival of Bambus 2.0, the second-generation Bambus scaffolder, also available as an open source package. While most other scaffolders are closely tied to a specific assembly program, Bambus accepts the output from most current assemblers and gives the user great flexibility in choosing the scaffolding parameters. In particular, Bambus can accept contig linking data other than that specified by mate-pairs. Such sources of information include alignment to a reference genome (Bambus can directly use the output of MUMmer), physical mapping data, and information about gene synteny.
Getting data into Bambus 2 requires that you convert your assembly to AMOS format. Here is my recipe:
toAmos \
    -s my.fa \
    -c my.contig \
    -m my.mates \
    -o my.afg
You need the .fa to list the contigs within the GDE-like contig file (annoying but true). The sequences in the .fa do not need to be accurate; you just need something that makes the format valid. The .contig and .mates files are as expected for Bambus.
The resulting .afg is then 'banked' with:
bank-transact -c \
    -b my.bnk \
    -m my.afg
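The two steps above can be chained in a small shell function. This is only a sketch under stated assumptions: the function name `amos_import`, the shared file prefix, and the `DRY_RUN` variable are hypothetical conveniences, not part of AMOS. The only toAmos and bank-transact flags used are the ones shown in the recipe.

```shell
# Hypothetical helper: run the conversion and banking steps for input files
# that share one prefix (prefix.fa, prefix.contig, prefix.mates).
# Set DRY_RUN to a non-empty value to print the commands instead of running them.
amos_import() {
    prefix="$1"
    if [ -z "$prefix" ]; then
        echo "usage: amos_import <prefix>" >&2
        return 1
    fi

    # In dry-run mode every command is prefixed with "echo".
    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi

    # Step 1: convert the assembly files to an AMOS message file (.afg).
    $run toAmos -s "$prefix.fa" -c "$prefix.contig" -m "$prefix.mates" \
        -o "$prefix.afg" || return 1

    # Step 2: load the message file into a bank, with the same flags as the recipe.
    $run bank-transact -c -b "$prefix.bnk" -m "$prefix.afg" || return 1
}
```

Running `DRY_RUN=1 amos_import my` prints the two commands for inspection; `amos_import my` runs them and stops at the first failure.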
Bambus2 is composed of a series of scripts. For instructions on how to use them, see the Bambus 2.0 quick start guide.
The easiest way to run Bambus2 is through metAMOS, available on GitHub.
Alternatively, there is a Python script to facilitate running Bambus2 in one quick command:
goBambus2
which returns:
run: goBambus2 <input reads or contigs or amos bank name> <output prefix> [options]
eg.: goBambus2 example.contigs myoutput --all --contigs
This script is designed to run the Bambus pipeline and takes either reads or contigs plus XML Trace Archive data as input and outputs scaffolds
For further info please contact the Bambus 2 authors: Sergey Koren and Mihai Pop
For example, you could run:
goBambus2 brucella.seq myScaff --all --reads
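The two invocations shown so far differ only in the input file, the output prefix, and whether `--reads` or `--contigs` is passed. A minimal sketch of a wrapper that makes this choice explicit follows; the function name `run_gobambus2`, its third argument, and the `DRY_RUN` variable are hypothetical, and only the options that appear in the usage message and examples above are used.

```shell
# Hypothetical wrapper around goBambus2.
# Arguments: <input> <output prefix> <reads|contigs>
# Set DRY_RUN to a non-empty value to print the command instead of running it.
run_gobambus2() {
    input="$1"
    prefix="$2"
    kind="$3"

    # Accept only the two input kinds that appear in the examples.
    case "$kind" in
        reads|contigs) ;;
        *)
            echo "usage: run_gobambus2 <input> <output prefix> reads|contigs" >&2
            return 1
            ;;
    esac

    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run="echo"
    fi

    # Same shape as the examples: input, output prefix, --all, then the input kind.
    $run goBambus2 "$input" "$prefix" --all "--$kind"
}
```

For instance, `run_gobambus2 brucella.seq myScaff reads` issues the same command as the example above.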
More information is available at http://www.cbcb.umd.edu/software/bambus/