Minimus2

From AMOS WIKI

Latest revision as of 18:46, 3 December 2009

minimus2 is a modified version of the minimus pipeline designed for merging one or two sequence sets (S1, S2). It uses a nucmer-based overlap detector, which is much faster than the Smith-Waterman hash-overlap program used by minimus.

Usage:

 minimus2 prefix  \
   -D REFCOUNT=n  \  # Number of sequences in the first set (Def 0)
   -D OVERLAP=n   \  # Minimum overlap (Default 40bp)
   -D CONSERR=f   \  # Maximum consensus error (0..1) (Def 0.06)
   -D MINID=n     \  # Minimum overlap %id for align. (Def 94)
   -D MAXTRIM=n      # Maximum sequence trimming length (Def 20bp)

prefix is the base name of an AFG format file.

REFCOUNT should be set to the number of sequences in the first set in order to align one set against the other (S1:S2). By default REFCOUNT=0 and an all-vs-all alignment is run (S1+S2:S1+S2, the same as minimus).

Example: Suppose we have two sets, S1 and S2, with 917 sequences in S1 and 1668 in S2:

 grep -c "^>" S1.seq S2.seq
   S1.seq:917
   S2.seq:1668

The sets should be merged and converted to AMOS format:

 cat S1.seq S2.seq > S1-S2.seq
 toAmos -s S1-S2.seq -o S1-S2.afg

Then minimus2 should be run on the merged set:

 minimus2 S1-S2 -D REFCOUNT=917
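
The three steps above can be wrapped so that REFCOUNT is taken from the first set automatically instead of being typed by hand. The following is only a sketch: the function names count_fasta_records and merge_with_minimus2 are hypothetical helpers, not part of AMOS, and it assumes that toAmos and minimus2 are on the PATH and that both inputs are FASTA files.

```shell
#!/bin/sh
# Count FASTA records (header lines starting with ">") in one file.
# Note: grep -c prints 0 but exits non-zero when nothing matches.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Hypothetical helper: merge two FASTA sets (first set first), convert the
# result to AMOS format, and run minimus2 with REFCOUNT set to the size of
# the first set. Stops if the first set is empty or any step fails.
merge_with_minimus2() {
    s1=$1
    s2=$2
    prefix=$3
    refcount=$(count_fasta_records "$s1") || return 1
    cat "$s1" "$s2" > "$prefix.seq" || return 1
    toAmos -s "$prefix.seq" -o "$prefix.afg" || return 1
    minimus2 "$prefix" -D REFCOUNT="$refcount"
}
```

For the example above, the call would be merge_with_minimus2 S1.seq S2.seq S1-S2, which passes REFCOUNT=917.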

Input:

 S1-S2.afg : AMOS message file that contains RED/FRG messages for all the reads in the two datasets.

Output:

 S1-S2.fasta :          contig sequences
 S1-S2.singletons.seq : singleton sequences
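
A quick way to check a finished run is to count the records in the two output files, in the same way the input sets were counted. This sketch assumes both outputs are FASTA-formatted, with headers starting with ">"; summarize_outputs is a hypothetical helper name, not an AMOS command.

```shell
#!/bin/sh
# Print the number of FASTA records in each output file for a given prefix.
# Assumption: both files use ">" header lines.
summarize_outputs() {
    prefix=$1
    for f in "$prefix.fasta" "$prefix.singletons.seq"; do
        if [ -f "$f" ]; then
            # grep -c exits non-zero on zero matches but still prints 0
            n=$(grep -c '^>' "$f" || true)
            printf '%s: %s\n' "$f" "$n"
        else
            printf '%s: missing\n' "$f"
        fi
    done
}
```

For the example above, the call would be summarize_outputs S1-S2.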

Note: This pipeline was introduced to the AMOS package in release 2.0.8. If you have an older version of the AMOS package installed, it is highly recommended that you upgrade to the latest version.

Alternatively, the new file can be downloaded and installed manually from the following location: minimus2