Minimo

From AMOS WIKI
Revision as of 06:23, 9 December 2010 by Floflooo (Talk | contribs)

Overview

Minimo is largely based on Minimus and, like it, favours assembly quality over speed, so it is best used on moderately sized data sets. Minimo follows the Overlap-Layout-Consensus paradigm, just like Minimus.

The main advantage of Minimo over Minimus is that it takes simple FASTA files as input and generates contigs formatted in ACE and FASTA. In addition, two parameters can be used to tune the assembly stringency: the minimum overlap length and the minimum overlap identity.

Generally, decreasing the minimum overlap identity results in a less fragmented but likely less faithful assembly, as sequencing errors or small variations between closely related species (in the case of metagenomic data) might cause chimeric contigs. Similarly, decreasing the minimum overlap length might produce less fragmented, less faithful assemblies. However, increasing the minimum overlap length may sometimes also produce better assemblies by resolving small repeated regions.
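
One way to explore this trade-off is to assemble the same reads at several identity thresholds and compare the resulting contigs. The sketch below is a dry run that only prints the commands. The function name minimo_sweep, the thresholds, and the sweep_ident output prefix are illustrative assumptions, not part of Minimo; the -D options are those listed in the usage message.

```shell
#!/bin/sh
# Dry run: print one Minimo command per minimum identity threshold.
# minimo_sweep and the "sweep_ident" prefix are hypothetical names.
minimo_sweep() {
    reads=$1
    shift
    for ident in "$@"; do
        echo "Minimo $reads -D MIN_IDENT=$ident -D FASTA_EXP=1 -D OUT_PREFIX=sweep_ident$ident"
    done
}

# Example thresholds (assumed values); nothing is executed here.
minimo_sweep my_reads.fa 90 95 98
```

Piping the printed commands to sh would run the assemblies one after another.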

Documentation

Documentation on how to run Minimo is obtained by typing:

  Minimo -h

The usage message is:

 Usage:
    Minimo FASTA_IN [options]
 Options:
    -D QUAL_IN=<file>   Input quality score file
    -D GOOD_QUAL=<n>    Quality score to set for bases within the clear
                        range if no quality file was given (default: 30)
    -D BAD_QUAL=<n>     Quality score to set for bases outside clear range
                        if no quality file was given (default: 10). If your
                        sequences are trimmed, try the same value as GOOD_QUAL.
    -D MIN_LEN=<n>      Minimum contig overlap length (at least 20 bp,
                        default: 35)
    -D MIN_IDENT=<d>    Minimum contig overlap identity percentage (between 0
                        and 100 %, default: 98)
    -D ALN_WIGGLE=<d>   Alignment wiggle value (from 2 for short reads to 15 for
                        long reads, default: 2)
    -D FASTA_EXP=<n>    Export results in FASTA format (0:no 1:yes, default: 0)
    -D ACE_EXP=<n>      Export results in ACE format (0:no 1:yes, default: 0)
    -D OUT_PREFIX=<s>   Prefix to use for the output file path and name
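
The ranges stated in the usage message (MIN_LEN of at least 20 bp, MIN_IDENT between 0 and 100 %) can be checked before launching a long assembly. The following is a minimal sketch of such a check. The helper check_minimo_params is a hypothetical name, and the sketch assumes MIN_LEN is an integer and MIN_IDENT may be a decimal; Minimo may apply its own validation.

```shell
#!/bin/sh
# Hypothetical pre-flight check for MIN_LEN and MIN_IDENT.
# Returns 0 if both values fall in the documented ranges, 1 otherwise.
check_minimo_params() {
    min_len=$1
    min_ident=$2
    awk -v len="$min_len" -v ident="$min_ident" 'BEGIN {
        # MIN_LEN: non-negative integer, at least 20
        if (len !~ /^[0-9]+$/ || len + 0 < 20) exit 1
        # MIN_IDENT: non-negative number, at most 100
        if (ident !~ /^[0-9]+(\.[0-9]+)?$/ || ident + 0 > 100) exit 1
        exit 0
    }'
}

# Print the command only if the values pass the check (dry run).
if check_minimo_params 80 90; then
    echo "Minimo my_reads.fa -D MIN_LEN=80 -D MIN_IDENT=90"
fi
```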

Basic usage

To run Minimo, you will need a set of sequences in a FASTA file. Assuming you have a set of reads in FASTA format called my_reads.fa, you can run Minimo with the following command:

 Minimo my_reads.fa
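
Since Minimo is meant for moderately sized data, it can help to check how many reads the input holds before starting. The sketch below counts FASTA header lines. The function name count_fasta_records is a hypothetical helper, and it assumes every record starts with a ">" header line.

```shell
#!/bin/sh
# Count the records in a FASTA file by counting header lines (those starting with ">").
# count_fasta_records is a hypothetical helper, not part of Minimo or AMOS.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example: count_fasta_records my_reads.fa
```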

To export the contigs to a FASTA file or an ACE file (e.g. for downstream processing), use the FASTA_EXP and ACE_EXP options:

 Minimo my_reads.fa -D FASTA_EXP=1 -D ACE_EXP=1

If you need to require a specific minimum overlap length or identity between the reads of a contig, set MIN_LEN and MIN_IDENT, for example:

 Minimo my_reads.fa -D MIN_LEN=80 -D MIN_IDENT=90