AMOS::AmosLib: a Perl module for processing AMOS message files

Overview

Programs exchange data with the AMOS package through AMOS message files. This representation is described in detail here, and was inspired by the interchange format developed at Celera Genomics for use in Celera Assembler.

To help those who want to read these files directly, the AMOS::AmosLib Perl module provides a set of routines that abstract the message structure.

AMOS::AmosLib is fully documented through perldoc.

System requirements

AMOS::AmosLib is written in Perl and requires Perl 5.6.0 or newer. It was tested on several Unix systems, including Linux RedHat 7.3, OSF5.1, Sun Solaris, and Linux SuSE 9.1, and should run on most Unix systems.

Obtaining AMOS::AmosLib

AMOS::AmosLib can be downloaded as a part of the AMOS package.

This software is OSI Certified Open Source Software.

Documentation

SYNOPSIS
use AMOS::AmosLib;

AMOS/Celera Assembler message processing

my $rec = getRecord(\*STDIN);
Reads from the given filehandle (here STDIN) the text between the outermost { and }.

For example, if the input is:

{A
{B
}
}


getRecord returns the whole record: {A{B}}
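
The behaviour described above can be illustrated with a minimal standalone sketch. It is not the AmosLib implementation: read_balanced_record is a hypothetical helper, and it assumes that every opening and closing brace sits at the start of its own line, as in the example input.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of AmosLib: read lines from a filehandle
# until the braces opened by the first "{" are balanced again.
# Assumption: "{" and "}" always appear at the start of a line.
sub read_balanced_record {
    my ($fh) = @_;
    my $depth  = 0;
    my $record = '';
    while (my $line = <$fh>) {
        next if $depth == 0 && $line !~ /^\{/;   # skip text outside any record
        $depth++ if $line =~ /^\{/;              # opens a record or sub-record
        $depth-- if $line =~ /^\}/;              # closes a record or sub-record
        $record .= $line;
        return $record if $depth == 0;           # outermost record is complete
    }
    return;   # end of input before a complete record was found
}

# Demonstrate on an in-memory filehandle holding two records.
my $input = "{A\n{B\n}\n}\n{C\n}\n";
open(my $fh, '<', \$input) or die "cannot open in-memory handle: $!";
my $first  = read_balanced_record($fh);   # the nested record {A ... {B ... } }
my $second = read_balanced_record($fh);   # the record {C ... }
```

Calling the helper in a loop until it returns nothing walks through every top-level record in the input, which is the usual way a record reader of this kind is used.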

my($id, $fields, $recs) = parseRecord($rec);
Parses a record and returns a triplet consisting of:
  - record type
  - hash of fields and values
  - array of sub-records

my($id) = getCAId($CAid);
Obtains the ID from a "paired" ID, that is, converts (10, 1000) into
10. If the ID is not a pair in parentheses, it returns the input
unchanged. Thus, getCAId('(10, 1000)') returns 10 while
getCAId("abba") returns "abba".
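
As a rough illustration of this rule, the sketch below uses a regular expression to pull out the first element of a parenthesized pair. The function name first_of_pair is hypothetical and the real getCAId may be implemented differently; only the two input/output examples come from the description above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for illustration; the real getCAId may differ.
sub first_of_pair {
    my ($id) = @_;
    # Match "(first, second)" and capture the first element.
    if ($id =~ /^\(\s*([^,\s]+)\s*,\s*[^)]*\)$/) {
        return $1;
    }
    return $id;   # not a parenthesized pair: return the input unchanged
}

my $from_pair  = first_of_pair('(10, 1000)');   # first element of the pair
my $from_plain = first_of_pair('abba');         # input returned as-is
```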

Fasta file creation

printFastaSequence($file, $header, $seq);
Prints a sequence in Fasta format. Inputs are:
  $file   - output file opened for writing
  $header - Fasta header (without >)
  $seq    - sequence to be written
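
For readers unfamiliar with the Fasta layout, here is a minimal standalone sketch of writing one record. write_fasta_record is a hypothetical helper, not the AmosLib routine, and the line width (60 by default, 4 in the demonstration) is an assumption; the width used by printFastaSequence is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of AmosLib: write one Fasta record,
# wrapping the sequence at $width characters per line.
# Assumption: 60 characters per line when no width is given.
sub write_fasta_record {
    my ($fh, $header, $seq, $width) = @_;
    $width ||= 60;
    print {$fh} ">$header\n";                    # header line starts with ">"
    for (my $i = 0; $i < length($seq); $i += $width) {
        print {$fh} substr($seq, $i, $width), "\n";
    }
}

# Write to an in-memory filehandle so the result can be inspected.
my $out = '';
open(my $fh, '>', \$out) or die "cannot open in-memory handle: $!";
write_fasta_record($fh, 'read1 example', 'ACGTACGTAC', 4);
close($fh);
```

As with the module's routine, the header is passed without the leading ">", which the writer adds itself.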

printFastaQual($file, $header, $qual);
Prints quality values in Fasta format. Inputs are:
  $file   - output file
  $header - Fasta header (without >)
  $qual   - string of quality values

Sequence processing

my($rev) = reverseComplement($seq);
Reverse complements a sequence.
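
The operation itself can be sketched in a few lines of plain Perl. This is an illustration, not the module's code: revcomp is a hypothetical name, and the sketch only maps A, C, G and T (upper and lower case), leaving any other character such as N unchanged. How reverseComplement treats ambiguity codes is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of AmosLib: reverse the string, then swap
# each base for its complement. Characters other than ACGT/acgt are kept.
sub revcomp {
    my ($seq) = @_;
    my $rev = scalar reverse $seq;     # reverse the order of the characters
    $rev =~ tr/ACGTacgt/TGCAtgca/;     # complement each base in place
    return $rev;
}

my $rc = revcomp('ACCGT');
```

Applying the function twice gives back the original sequence, which is a convenient sanity check.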

TIGR .contig format generation

printContigRecord($file, $id, $len, $nseq, $sequence, $how);
Prints a contig in the specified format. Inputs are:
  $file     - output file (opened for writing)
  $id       - contig ID
  $len      - contig length
  $nseq     - number of sequences in the contig (same as the number of
              sequence records that will follow the contig)
  $sequence - consensus sequence for the contig
  $how      - what type of output is required:
                contig - TIGR .contig format
                asm    - TIGR .asm format
                fasta  - multi-fasta format

printSequenceRecord($file, $name, $seq, $offset, $rc, $seqleft,
$seqright, $asml, $asmr, $type);

Prints the record for a sequence aligned to a contig. Inputs are:
  $file               - output file opened for writing
  $name               - sequence name
  $seq                - actual sequence
  $offset             - offset in consensus
  $rc                 - "RC" if sequence is reverse complemented, "" otherwise
  $seqleft, $seqright - alignment range within sequence
  $asml, $asmr        - alignment range within consensus
  $type               - type of output:
                          contig - output is in TIGR .contig format
                          asm    - output is in TIGR .asm format

Contact Information

Please direct your questions and suggestions to:

Acknowledgements

The development of AMOS::AmosLib was supported by the National Science Foundation under grant KDI-9980088 and by the National Institutes of Health under grant R01-LM06845.