Minimus

From AMOS WIKI
Revision as of 20:49, 3 June 2009 by Mcschatz (Talk | contribs)

Overview

minimus is an assembly pipeline designed specifically for small data sets, such as the set of reads covering a specific gene. The code will also work for larger assemblies (we have used it to assemble bacterial genomes); however, due to its stringency, the resulting assembly will be highly fragmented. For large and/or complex assemblies, the execution of minimus should be followed by additional processing steps, such as scaffolding.

minimus follows the Overlap-Layout-Consensus paradigm and consists of three main modules:

  • overlapper - computes the overlaps between the reads using a modified version of the Smith-Waterman local alignment algorithm
  • tigger - uses the read overlaps to generate the layouts of reads representing individual contigs
  • make-consensus - refines the layouts produced by the tigger to generate accurate multiple alignments of the reads
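
As a rough illustration of the overlap step only, the sketch below finds the longest exact suffix-prefix match between two reads. It is a toy stand-in, not the minimus overlapper: the real module uses a modified Smith-Waterman alignment, whereas this version accepts exact matches only, and the function name overlap_len is hypothetical.

```shell
#!/bin/sh
# Toy overlap detection: length of the longest suffix of read A that is
# also a prefix of read B (exact matches only).
# overlap_len is a hypothetical name used for illustration; it is not
# part of AMOS.
overlap_len() {
    a=$1
    b=$2
    suffix=$a
    # Try progressively shorter suffixes of A until one is a prefix of B.
    while [ -n "$suffix" ]; do
        case "$b" in
            "$suffix"*)
                echo "${#suffix}"
                return 0
                ;;
        esac
        suffix=${suffix#?}   # drop the first character
    done
    echo 0
}

overlap_len ACGTAC TACGG   # prints 3 (shared "TAC")
```

A layout step would then chain together reads whose overlaps are long enough, and a consensus step would derive one sequence from each chain.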

minimus uses AMOS messages as both its input and its output (see documentation). Two utilities are provided to process these files:

  • tarchive2amos - a versatile converter from trace archive .seq, .qual, and .xml information into AMOS formatted data
  • amos2ace - a converter from AMOS formatted data to the .ACE assembly format

In addition, the AMOS::AmosLib Perl module is provided for users who prefer to write their own conversion utilities. Please see the documentation included with the distribution for more information.

minimus is part of the AMOS package - a collaborative effort to develop a modular open-source framework for assembly development.

minimus2 is a modified version of the minimus pipeline designed for merging two sequence sets. Instead of hash-overlap, it uses a nucmer-based overlap detector, which is much faster.


System requirements

minimus consists of a collection of C/C++ modules running under Unix. The code has been tested under gcc 2.9x and 3.x on Linux RedHat 7.3, Mac OSX, and OSF1 V5.1. We expect the code will compile with minimal changes on any other Unix-based operating system.


Obtaining minimus

This software is OSI Certified Open Source Software.

minimus can be downloaded as part of the AMOS package.


Documentation

Compiling minimus involves the following steps:

1. Unpack the AMOS distribution:

gzip -dc amos-<version>.tar.gz | tar -xf -
(replace <version> with the current version of the distribution)


2. Change directories to the top level of the distribution:

cd amos-<version>


3. Run the configuration script, which identifies the specific configuration of your system:

./configure --prefix=/usr/local/
If --prefix is omitted, the code will by default be installed in the directory from which you ran the ./configure command.

4. Compile the code:

make all

5. Install the binaries in the appropriate locations:

make install 
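
The five steps above can be combined into a single shell function. This is a minimal sketch, not part of the AMOS distribution: the function name build_amos and its two arguments are hypothetical, and it assumes that the tarball amos-<version>.tar.gz is in the current directory.

```shell
#!/bin/sh
# Sketch: unpack, configure, compile, and install AMOS (which includes
# minimus) in one pass. build_amos and its arguments are illustrative
# names, not part of the distribution.
#   $1 - version string used in the tarball name (required)
#   $2 - installation prefix (assumed default here: /usr/local)
build_amos() {
    version=$1
    prefix=${2:-/usr/local}
    tarball="amos-${version}.tar.gz"

    if [ ! -f "$tarball" ]; then
        echo "error: $tarball not found in $(pwd)" >&2
        return 1
    fi

    # Run in a subshell so the caller's working directory is unchanged.
    # The && chain stops at the first step that fails.
    (
        gzip -dc "$tarball" | tar -xf - &&
        cd "amos-${version}" &&
        ./configure --prefix="$prefix" &&
        make all &&
        make install
    )
}
```

Calling build_amos with the version of the downloaded tarball and /usr/local as the prefix runs the same commands as listed above; installing into a system directory may require administrator privileges.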

Documentation on running minimus is included with the distribution in the /docs subdirectory.


Examples

Examples of a flu assembly and a zebrafish gene assembly can be found in the test/minimus directory created when the AMOS distribution is unpacked. Documentation on the examples is included with the distribution in /docs/minimus.README.


Acknowledgements

The development of minimus was supported by the National Institutes of Health under grants R01-LM06845 and R01-LM007938 to SLS and by Department of Homeland Security cooperative agreement W81XWH-05-2-0051.