AMOS::AmosLib: a Perl module
for processing AMOS message files
Overview
The interface to the AMOS package is provided through AMOS message
files. This representation is described in detail here,
and was inspired by the interchange format developed at Celera Genomics for use in Celera Assembler.
To help those who want to read these files directly,
the AMOS::AmosLib Perl module provides a set of routines that
abstract the message structure.
AMOS::AmosLib is fully documented through perldoc.
System requirements
AMOS::AmosLib is written in Perl and requires Perl 5.6.0 or
newer. It was tested on several Unix systems, including Linux RedHat
7.3, OSF5.1, Sun Solaris, and Linux SuSE 9.1, and should run on most
Unix systems.
Obtaining AMOS::AmosLib
AMOS::AmosLib can be downloaded as a part of
the AMOS package.
This software is OSI Certified
Open Source Software.
Documentation
SYNOPSIS
use AmosLib;
AMOS/Celera Assembler message processing
my $rec = getRecord(\*STDIN);
Reads from the given filehandle (here STDIN) the text between the outermost { and }.
For example, if the input is:
{A
{B
}
}
getRecord returns the whole record: {A{B}}
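The brace matching described above can be sketched in plain Perl. This is not the module's code: read_record_sketch is a hypothetical helper, and it assumes that each { and } starts its own line and that line breaks are kept in the returned text.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AMOS::AmosLib): read one top-level
# record from a filehandle by tracking brace depth line by line.
sub read_record_sketch {
    my ($fh) = @_;
    my $depth  = 0;
    my $record = '';
    while (my $line = <$fh>) {
        $depth++ if $line =~ /^\{/;        # a line starting with { opens a record
        $record .= $line if $depth > 0;    # keep text only while inside a record
        if ($depth > 0 && $line =~ /^\}/) {
            $depth--;
            return $record if $depth == 0; # the outermost } closes the record
        }
    }
    return;    # no complete record was found
}

# Read two records from an in-memory filehandle instead of STDIN.
my $input = "{A\n{B\n}\n}\n{C\n}\n";
open(my $fh, '<', \$input) or die "cannot open in-memory handle: $!";
my $first  = read_record_sketch($fh);
my $second = read_record_sketch($fh);
close($fh);
print $first;
```

Calling the helper repeatedly on the same filehandle yields one top-level record per call, which is the usual way to loop over a message file.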
my($id, $fields, $recs) = parseRecord($rec);
Parses a record and returns a triplet consisting of:
    - record type
    - hash of fields and values
    - array of sub-records
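To illustrate how the returned triplet might be consumed, the sketch below builds a stand-in triplet by hand rather than calling parseRecord. It assumes that $fields is a hash reference and that $recs is an array reference holding sub-records as text; the record type and field names are made up for the example.

```perl
use strict;
use warnings;

# Hand-built stand-in for a parseRecord result (all values are hypothetical).
my ($id, $fields, $recs) = (
    'A',
    { name => 'example', len => '4' },
    [ "{B\n}\n" ],
);

# Collect a readable summary: record type, sorted fields, sub-record count.
my @summary = ("type=$id");
for my $key (sort keys %$fields) {
    push @summary, "$key=$fields->{$key}";
}
push @summary, 'subrecords=' . scalar(@$recs);

print join(' ', @summary), "\n";
```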
my($id) = getCAId($CAid);
Obtains the ID from a "paired" ID, that is, converts (10, 1000) into
10. If the ID is not a pair in parentheses, it returns the input unchanged.
Thus, getCAId('(10, 1000)') returns 10 while getCAId("abba") returns
"abba".
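The same conversion can be sketched with a regular expression. The helper ca_id_sketch is a hypothetical name, and the pattern is an assumption about what counts as a "paired" ID, not the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AMOS::AmosLib): return the first
# element of a "(first, second)" pair, or the input if it is not a pair.
sub ca_id_sketch {
    my ($caid) = @_;
    if ($caid =~ /^\(\s*([^,\s]+)\s*,\s*[^)]*\)$/) {
        return $1;    # first element of the pair
    }
    return $caid;     # not a parenthesized pair: pass through unchanged
}

print ca_id_sketch('(10, 1000)'), "\n";    # prints 10
print ca_id_sketch('abba'), "\n";          # prints abba
```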
Fasta file creation
printFastaSequence($file, $header, $seq);
Prints a sequence in Fasta format. Inputs are:
    $file - output file opened for writing
    $header - Fasta header (without >)
    $seq - sequence to be written
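For readers unfamiliar with the layout, here is a minimal sketch of writing one Fasta record in plain Perl. The helper print_fasta_sketch is hypothetical, and the 60-column line width is an assumption rather than a documented property of printFastaSequence.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AMOS::AmosLib): write a header line
# followed by the sequence wrapped at a fixed width.
sub print_fasta_sketch {
    my ($fh, $header, $seq, $width) = @_;
    $width ||= 60;                      # assumed default line width
    print {$fh} ">$header\n";           # the > is added here, not by the caller
    for (my $i = 0; $i < length($seq); $i += $width) {
        print {$fh} substr($seq, $i, $width), "\n";
    }
}

# Write to an in-memory filehandle so the result can be inspected.
my $out = '';
open(my $fh, '>', \$out) or die "cannot open in-memory handle: $!";
print_fasta_sketch($fh, 'read1 example', 'ACGT' x 20, 60);
close($fh);
print $out;
```

An 80-base sequence is written as one 60-base line followed by one 20-base line.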
printFastaQual($file, $header, $qual);
Prints quality values in Fasta format. Inputs are:
    $file - output file
    $header - Fasta header (without >)
    $qual - string of quality values
Sequence processing
my($rev) = reverseComplement($seq);
Reverse complements a sequence.
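As a point of reference, reverse complementing can be sketched in two steps: reverse the string, then swap each base for its complement. The helper revcomp_sketch is hypothetical and handles only A, C, G, and T in either case; whether reverseComplement also maps ambiguity codes is not stated here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AMOS::AmosLib).
sub revcomp_sketch {
    my ($seq) = @_;
    my $rev = reverse $seq;             # scalar context: reverses the string
    $rev =~ tr/ACGTacgt/TGCAtgca/;      # complement each base, keeping case
    return $rev;
}

print revcomp_sketch('AACGT'), "\n";    # prints ACGTT
```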
TIGR .contig format generation
printContigRecord($file, $id, $len, $nseq, $sequence, $how);
Prints a contig in the specified format. Inputs are:
    $file - output file (opened for writing)
    $id - contig ID
    $len - contig length
    $nseq - number of sequences in the contig (same as the number of
            sequence records that will follow the contig)
    $sequence - consensus sequence for the contig
    $how - what type of output is required:
        contig - TIGR .contig format
        asm - TIGR .asm format
        fasta - multi-fasta format
printSequenceRecord($file, $name, $seq, $offset, $rc, $seqleft,
$seqright, $asml, $asmr, $type);
Prints the record for a sequence aligned to a contig. Inputs are:
    $file - output file opened for writing
    $name - sequence name
    $seq - actual sequence
    $offset - offset in consensus
    $rc - "RC" if sequence is reverse complemented, "" otherwise
    $seqleft, $seqright - alignment range within sequence
    $asml, $asmr - alignment range within consensus
    $type - type of output:
        contig - output is in TIGR .contig format
        asm - output is in TIGR .asm format
Contact Information
Please direct your questions and suggestions to:
Acknowledgements
The development of AMOS::AmosLib was supported by the National Science Foundation under grant KDI-9980088
and by the National Institutes of Health
under grant R01-LM06845.