#include <Contig_AMOS.hh>
Public Member Functions | |
Contig_t () | |
Constructs an empty Contig_t object. | |
Contig_t (const Contig_t &source) | |
Copy constructor. | |
~Contig_t () | |
Destroys a Contig_t object. | |
virtual void | clear () |
Clears all object data, reinitializes the object. | |
Pos_t | gap2ungap (Pos_t gap) const |
Translates a gapped position to an ungapped position. | |
Pos_t | ungap2gap (Pos_t ungap) const |
Translates an ungapped position to a gapped position. | |
virtual NCode_t | getNCode () const |
Get the AMOS NCode type identifier. | |
const std::vector< Tile_t > & | getReadTiling () const |
Get the tiling of underlying reads. | |
std::vector< Tile_t > & | getReadTiling () |
Size_t | getSpan () const |
Get the span of the read layout. | |
Size_t | getUngappedLength () const |
Get the ungapped consensus length. | |
std::string | getUngappedQualString () const |
Get the ungapped quality score string. | |
std::string | getUngappedQualString (Range_t range) const |
Get an ungapped quality score substring. | |
std::string | getUngappedSeqString () const |
Get the ungapped sequence base string. | |
std::string | getUngappedSeqString (Range_t range) const |
Get an ungapped sequence base substring. | |
virtual void | readMessage (const Message_t &msg) |
Reads in data from a Message object. | |
bool | readUMD (std::istream &in) |
Read a UMD contig message from an input stream. | |
void | setReadTiling (const std::vector< Tile_t > &reads) |
Set the tiling of underlying reads. | |
void | setReadTiling (const Layout_t &layout) |
Set the tiling of underlying reads. | |
virtual void | writeMessage (Message_t &msg) const |
Writes data to a Message object. | |
void | writeUMD (std::ostream &out) const |
Write a UMD contig message to an output stream. | |
std::pair< char, char > | getBase (Pos_t index) const |
Get a single base and its quality score. | |
Size_t | getLength () const |
Get the length of the sequence. | |
std::string | getQualString () const |
Get the quality score string. | |
std::string | getQualString (Range_t range) const |
Get a quality score substring. | |
std::string | getSeqString () const |
Get the sequence base string. | |
std::string | getSeqString (Range_t range) const |
Get a sequence base substring. | |
bool | isCompressed () const |
Checks if the sequence data is compressed. | |
void | setBase (char seqchar, char qualchar, Pos_t index) |
Set a sequence base and its quality score. | |
void | setSequence (const char *seq, const char *qual) |
Set the entire sequence. | |
void | setSequence (const std::string &seq, const std::string &qual) |
Set the entire sequence. | |
const std::string & | getComment () const |
Get the comment string. | |
const Status_t | getStatus () const |
Get the status value. | |
void | setComment (const std::string &comment) |
Set the comment string. | |
void | setStatus (Status_t status) |
Set the status value. | |
const std::string & | getEID () const |
Get the external ID. | |
ID_t | getIID () const |
Get the internal ID. | |
bool | isRemoved () const |
Check if the object is waiting to be removed from the bank. | |
bool | isModified () const |
Check if the object has been modified. | |
bool | isFlagA () const |
Check the value of flag A. | |
bool | isFlagB () const |
Check the value of flag B. | |
void | setEID (const std::string &eid) |
Set the external ID. | |
void | setFlagA (bool flag) |
Set flag A. | |
void | setFlagB (bool flag) |
Set flag B. | |
void | setIID (ID_t iid) |
Set the internal ID. | |
Static Public Attributes | |
const NCode_t | NCODE = M_CONTIG |
The NCode type identifier for this object. | |
Protected Member Functions | |
virtual void | readRecord (std::istream &fix, std::istream &var) |
Read selected class members from a biserial record. | |
virtual void | writeRecord (std::ostream &fix, std::ostream &var) const |
Write selected class members to a biserial record. | |
Static Protected Member Functions | |
uint8_t | compress (char seqchar, char qualchar) |
Compresses a sequence char and quality char into a single byte. | |
std::pair< char, char > | uncompress (uint8_t byte) |
Uncompresses a byte into a sequence and quality char. | |
Protected Attributes | |
uint8_t * | seq_m |
compressed seq and qual data or uncompressed seq | |
uint8_t * | qual_m |
uncompressed qual data | |
Size_t | length_m |
length of the sequence and quality data | |
ID_t | iid_m |
internal ID (integer AMOS identifier) | |
std::string | eid_m |
external ID (anything you want sans newlines) | |
BankFlags_t | flags_m |
bank flags, derived classes may use "nibble" | |
Static Protected Attributes | |
const uint8_t | COMPRESS_BIT = 0x1 |
compressed sequence flag | |
const uint8_t | ADENINE_BITS = 0x0 |
'A' bit | |
const uint8_t | CYTOSINE_BITS = 0x40 |
'C' bit | |
const uint8_t | GUANINE_BITS = 0x80 |
'G' bit | |
const uint8_t | THYMINE_BITS = 0xC0 |
'T' bit | |
const uint8_t | SEQ_BITS = 0xC0 |
sequence bit mask | |
const uint8_t | QUAL_BITS = 0x3F |
quality bit mask |
A Contig_t consists of a consensus sequence with quality scores and a tiling of the underlying reads that produce the consensus. The consensus sequence is stored as a 'gapped consensus': the gaps are stored as gap characters within the consensus sequence rather than as a separate position list. The ungapped version of the consensus can be generated with the getUngapped... methods. Gap characters should be '-', but '*' is also accepted. The compress and uncompress methods inherited from Sequence_t are made private because they would corrupt the gap characters.
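The relationship between the gapped and ungapped consensus can be illustrated with a minimal standalone sketch. It does not use the AMOS classes; `stripGaps` is a hypothetical helper, and it assumes (following the getUngappedSeqString description on this page) that every non-alphabetic character counts as a gap.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not part of AMOS: returns the consensus with every
// non-alphabetic character (for example '-' or '*') removed.
std::string stripGaps(const std::string& gapped) {
    std::string ungapped;
    ungapped.reserve(gapped.size());
    for (unsigned char c : gapped) {
        if (std::isalpha(c)) {
            ungapped.push_back(static_cast<char>(c));
        }
    }
    return ungapped;
}
```

The size of the returned string corresponds to what getUngappedLength() reports, which is why that method needs a linear scan while getLength() does not.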
Definition at line 35 of file Contig_AMOS.hh.
|
Constructs an empty Contig_t object. Definition at line 75 of file Contig_AMOS.hh. |
|
Copy constructor. Definition at line 84 of file Contig_AMOS.hh. |
|
Destroys a Contig_t object. Definition at line 93 of file Contig_AMOS.hh. |
|
Clears all object data, reinitializes the object. All data will be cleared, but object compression status will remain unchanged. Use the compress/uncompress members to change this info. Reimplemented from AMOS::Sequence_t. Definition at line 100 of file Contig_AMOS.hh. References AMOS::Sequence_t::clear(). Referenced by readMessage(), readUMD(), and writeMessage(). |
|
Compresses a sequence char and quality char into a single byte.
References AMOS::Sequence_t::ADENINE_BITS, AMOS::Char2Qual(), AMOS::Sequence_t::CYTOSINE_BITS, AMOS::Sequence_t::GUANINE_BITS, AMOS::Sequence_t::SEQ_BITS, and AMOS::Sequence_t::THYMINE_BITS. Referenced by AMOS::Sequence_t::clear(). |
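As a rough illustration of the byte layout implied by the constants on this page (two high bits for the base, six low bits for the quality), the following standalone sketch packs and unpacks one base. The function names are hypothetical, the quality offset of '0' is an assumption standing in for AMOS::Char2Qual() and AMOS::Qual2Char(), and the handling of characters other than A, C, G, and T is also assumed.

```cpp
#include <cstdint>
#include <utility>

// Bit layout taken from the constants listed on this page.
constexpr std::uint8_t kAdenineBits  = 0x00;
constexpr std::uint8_t kCytosineBits = 0x40;
constexpr std::uint8_t kGuanineBits  = 0x80;
constexpr std::uint8_t kThymineBits  = 0xC0;
constexpr std::uint8_t kSeqBits      = 0xC0;  // mask for the base
constexpr std::uint8_t kQualBits     = 0x3F;  // mask for the quality

// ASSUMPTION: quality characters are offset from '0'. The real mapping is
// whatever AMOS::Char2Qual() and AMOS::Qual2Char() implement.
constexpr char kMinQualChar = '0';

// Hypothetical stand-in for the compress(char, char) helper.
std::uint8_t packBase(char seqchar, char qualchar) {
    std::uint8_t bits = kAdenineBits;
    switch (seqchar) {
        case 'C': bits = kCytosineBits; break;
        case 'G': bits = kGuanineBits;  break;
        case 'T': bits = kThymineBits;  break;
        default:  break;  // 'A' and unrecognized characters use the A bits (assumption)
    }
    const int qual = (qualchar - kMinQualChar) & kQualBits;
    return static_cast<std::uint8_t>(bits | qual);
}

// Hypothetical stand-in for the uncompress(uint8_t) helper.
std::pair<char, char> unpackByte(std::uint8_t byte) {
    char seqchar = 'A';
    switch (byte & kSeqBits) {
        case kCytosineBits: seqchar = 'C'; break;
        case kGuanineBits:  seqchar = 'G'; break;
        case kThymineBits:  seqchar = 'T'; break;
        default:            break;  // kAdenineBits
    }
    const char qualchar = static_cast<char>((byte & kQualBits) + kMinQualChar);
    return {seqchar, qualchar};
}
```

Because only two bits are available for the base, a gap character cannot survive a round trip through this encoding, which is consistent with the note above that compression would corrupt the gap characters of a contig.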
|
Translates a gapped position to an ungapped position. This method requires linear time. If the gapped position points to a gap, the returned ungapped position points to the position immediately following the gap.
References AMOS::Pos_t. |
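The translation described above can be sketched on a plain string. This is an illustration only: `gappedToUngapped` is a hypothetical function, positions are assumed to be 0-based, and any non-alphabetic character is treated as a gap.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for gap2ungap, ASSUMING 0-based positions.
// Counts the non-gap characters that precede gappedPos. When gappedPos itself
// is a gap, the result is the ungapped index of the next real base.
std::size_t gappedToUngapped(const std::string& gapped, std::size_t gappedPos) {
    std::size_t ungappedPos = 0;
    for (std::size_t i = 0; i < gappedPos && i < gapped.size(); ++i) {
        if (std::isalpha(static_cast<unsigned char>(gapped[i]))) {
            ++ungappedPos;
        }
    }
    return ungappedPos;
}
```

For the gapped string "AC--GT", gapped positions 2, 3, and 4 all map to ungapped position 2 in this sketch, so both gap columns resolve to the 'G' that follows them.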
|
Get a single base and its quality score. Retrieves and uncompresses the sequence base for the requested index.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::isCompressed(), AMOS::Sequence_t::length_m, AMOS::Pos_t, AMOS::Sequence_t::qual_m, AMOS::Sequence_t::seq_m, and AMOS::Sequence_t::uncompress(). Referenced by AMOS::Sequence_t::getQualString(), AMOS::Sequence_t::getSeqString(), getUngappedQualString(), getUngappedSeqString(), and AMOS::Sequence_t::writeMessage(). |
|
Get the comment string.
Referenced by AMOS::operator<<(). |
|
Get the external ID.
References AMOS::IBankable_t::eid_m. Referenced by writeUMD(). |
|
Get the internal ID.
References AMOS::ID_t, and AMOS::IBankable_t::iid_m. Referenced by AMOS::Index_t::buildContigFeature(), AMOS::Index_t::buildContigScaffold(), AMOS::Index_t::buildReadContig(), AMOS::Index_t::buildReadLibrary(), AMOS::Index_t::buildScaffoldFeature(), and AMOS::operator<<(). |
|
Get the length of the sequence.
References AMOS::Sequence_t::length_m, and AMOS::Size_t. Referenced by getSpan(), getUngappedQualString(), getUngappedSeqString(), and AMOS::Read_t::writeMessage(). |
|
Get the AMOS NCode type identifier.
Reimplemented from AMOS::Sequence_t. Definition at line 135 of file Contig_AMOS.hh. References AMOS::NCode_t. |
|
Get a quality score substring. Returns a subrange of quality scores [begin, end) or (end, begin]. The reversed range will pull the reverse string of quality scores.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::getBase(), AMOS::Sequence_t::length_m, AMOS::NULL_CHAR, AMOS::Pos_t, and AMOS::Reverse(). |
|
Get the quality score string.
References AMOS::Sequence_t::length_m. Referenced by AMOS::operator<<(). |
|
Definition at line 153 of file Contig_AMOS.hh. |
|
Get the tiling of underlying reads.
Referenced by readUMD(). |
|
Get a sequence base substring. Returns a subrange of sequence bases [begin, end) or (end, begin]. The reversed range will pull the reverse complement string of sequence bases.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::getBase(), AMOS::Sequence_t::length_m, AMOS::NULL_CHAR, AMOS::Pos_t, and AMOS::ReverseComplement(). |
|
Get the sequence base string.
References AMOS::Sequence_t::length_m. Referenced by AMOS::operator<<(). |
|
Get the span of the read layout. Returns the difference between min(offset) and max(offset+len). Since the layout is not necessarily sorted, this method requires linear time.
References AMOS::Sequence_t::getLength(), AMOS::Pos_t, and AMOS::Size_t. |
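The span computation described above amounts to a single pass over the tiling. The sketch below uses a hypothetical `TileSketch` struct with only an offset and a length rather than the real Tile_t, and it assumes an empty layout has a span of zero.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified tile: where the read starts in the contig and how
// many consensus columns it covers. The real Tile_t carries more information.
struct TileSketch {
    std::int64_t offset;
    std::int64_t length;
};

// Returns max(offset + length) - min(offset) over all tiles.
// The layout is not assumed to be sorted, so every tile is visited once.
std::int64_t layoutSpan(const std::vector<TileSketch>& tiles) {
    if (tiles.empty()) {
        return 0;  // ASSUMPTION: an empty layout has zero span
    }
    std::int64_t lo = tiles.front().offset;
    std::int64_t hi = tiles.front().offset + tiles.front().length;
    for (const TileSketch& tile : tiles) {
        lo = std::min(lo, tile.offset);
        hi = std::max(hi, tile.offset + tile.length);
    }
    return hi - lo;
}
```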
|
Get the status value.
References AMOS::Status_t. |
|
Get the ungapped consensus length. Unlike getLength(), which runs in constant time, this method requires linear time.
References AMOS::Pos_t, and AMOS::Size_t. |
|
Get an ungapped quality score substring. Returns a subrange of quality scores [begin, end) or (end, begin] with all the gap scores removed. The range bounds are relative to the gapped consensus sequence, and a reversed range will pull the reverse string of quality scores.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::getBase(), AMOS::Sequence_t::getLength(), AMOS::Pos_t, and AMOS::Reverse(). |
|
Get the ungapped quality score string. Returns the quality string with all the gap scores removed.
References AMOS::Sequence_t::getLength(). |
|
Get an ungapped sequence base substring. Returns a subrange of ungapped sequence bases [begin, end) or (end, begin] with all the gaps (non-alphas) removed. The range bounds are relative to the gapped consensus sequence, and a reversed range will pull the reverse complement string of sequence bases.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::getBase(), AMOS::Sequence_t::getLength(), AMOS::Pos_t, and AMOS::ReverseComplement(). |
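A minimal sketch of this behaviour on a plain string follows. `ungappedSubstring` and `complementBase` are hypothetical names. The sketch assumes 0-based bounds and, as a simplification, that a reversed range (begin greater than end) covers the same columns as the forward half-open interval between the two bounds; the exact endpoint convention of the real Range_t may differ.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Complement for the four unambiguous bases; other letters pass through
// unchanged in this sketch.
char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return base;
    }
}

// Hypothetical stand-in for getUngappedSeqString(Range_t).
// Bounds are relative to the GAPPED string; gaps (non-alphas) are dropped.
std::string ungappedSubstring(const std::string& gapped,
                              std::size_t begin, std::size_t end) {
    const bool reversed = begin > end;
    const std::size_t lo = std::min(begin, end);
    const std::size_t hi = std::max(begin, end);
    if (hi > gapped.size()) {
        throw std::out_of_range("range exceeds the gapped consensus");
    }
    std::string result;
    for (std::size_t i = lo; i < hi; ++i) {
        if (std::isalpha(static_cast<unsigned char>(gapped[i]))) {
            result.push_back(gapped[i]);
        }
    }
    if (reversed) {
        // Reverse complement: reverse the order, then complement each base.
        std::reverse(result.begin(), result.end());
        std::transform(result.begin(), result.end(), result.begin(), complementBase);
    }
    return result;
}
```

The quality-string variant would follow the same pattern but only reverse the result, since quality scores have no complement.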
|
Get the ungapped sequence base string.
References AMOS::Sequence_t::getLength(). |
|
Checks if the sequence data is compressed. Returns true if the Sequence is currently operating in compressed mode, or false if under normal operation.
References AMOS::Sequence_t::COMPRESS_BIT. Referenced by AMOS::Sequence_t::compress(), AMOS::Sequence_t::getBase(), AMOS::Sequence_t::operator=(), AMOS::Sequence_t::readRecord(), AMOS::Sequence_t::setBase(), AMOS::Sequence_t::setSequence(), AMOS::Sequence_t::uncompress(), and AMOS::Sequence_t::writeRecord(). |
|
Check the value of flag A.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::writeMessage(), and AMOS::Overlap_t::writeMessage(). |
|
Check the value of flag B.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::writeMessage(), and AMOS::Overlap_t::writeMessage(). |
|
Check if the object has been modified.
References AMOS::IBankable_t::flags_m. |
|
Check if the object is waiting to be removed from the bank.
References AMOS::IBankable_t::flags_m. |
|
Reads in data from a Message object. Reads the data contained in a Message object and stores it in the Messagable object. Does not complain if the incoming message is of the wrong type; it only extracts the fields it recognizes. All previous data in the Messagable object will be cleared or overwritten.
Reimplemented from AMOS::Sequence_t. Definition at line 150 of file Contig_AMOS.cc. References clear(), AMOS::M_TILE, and AMOS::Sequence_t::readMessage(). |
|
Read selected class members from a biserial record. Reads the fixed and variable length streams from a biserial record and initializes the class members to the values stored within. Used in translating a biserial IBankable object, and needed to retrieve objects from a bank.
Reimplemented from AMOS::Sequence_t. Definition at line 177 of file Contig_AMOS.cc. References AMOS::Pos_t, AMOS::readLE(), AMOS::Sequence_t::readRecord(), and AMOS::Size_t. |
|
Read a UMD contig message from an input stream. Reads a University of Maryland style contig message and populates the appropriate fields with the values read from the stream. Will throw an exception if a message is found, but is not properly formatted. All fields not included in the message will be reinitialized. Contig EID will be set, and each tile will be stored with Read IID, read offset, and read range. Read range will be flipped to represent reverse complement.
References AMOS_THROW_ARGUMENT, clear(), getReadTiling(), and AMOS::IBankable_t::setEID(). |
|
Set a sequence base and its quality score. Any characters may be used for seqchar and qualchar unless dealing with a compressed sequence. If compressed, the sequence should contain only A, C, G, T, and N, and quality scores should lie in the range [MIN_QUALITY,MAX_QUALITY]. If either of these conditions is not met, the information will be lost when the data is compressed (see postconditions below).
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::compress(), AMOS::Sequence_t::isCompressed(), AMOS::Sequence_t::length_m, AMOS::Pos_t, AMOS::Sequence_t::qual_m, and AMOS::Sequence_t::seq_m. Referenced by AMOS::Sequence_t::setSequence(). |
|
Set the comment string.
|
|
Set the external ID. Will only use the characters up to but not including the first newline.
References AMOS::IBankable_t::eid_m, and AMOS::NL_CHAR. Referenced by readUMD(). |
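The truncation rule can be shown with a one-line standalone helper. `truncateAtNewline` is a hypothetical name used only for illustration; it is not the AMOS implementation.

```cpp
#include <string>

// Keeps the characters up to, but not including, the first newline.
// If there is no newline, find() returns npos and the whole string is kept.
std::string truncateAtNewline(const std::string& eid) {
    return eid.substr(0, eid.find('\n'));
}
```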
|
Set flag A. Has no effect on the actual object in memory other than setting a flag. This is one of two user accessible flags to be used as needed, the other is flag B.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::readMessage(), and AMOS::Overlap_t::readMessage(). |
|
Set flag B. Has no effect on the actual object in memory other than setting a flag. This is one of two user accessible flags to be used as needed, the other is flag A.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::readMessage(), and AMOS::Overlap_t::readMessage(). |
|
Set the internal ID.
References AMOS::ID_t, and AMOS::IBankable_t::iid_m. |
|
Set the tiling of underlying reads.
|
|
Set the tiling of underlying reads.
|
|
Set the entire sequence. Combines and compresses the sequence and quality data contained in the two STL strings. If the current Sequence object is compressed, please refer to the postconditions for the setBase(char,char,Pos_t) operation. All newline characters will be discarded, but the newlines must be in the same location in both the sequence and quality files.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::isCompressed(), AMOS::Sequence_t::length_m, AMOS::NL_CHAR, AMOS::Pos_t, AMOS::Sequence_t::qual_m, AMOS::SafeRealloc(), AMOS::Sequence_t::seq_m, AMOS::Sequence_t::setBase(), and AMOS::Size_t. |
|
Set the entire sequence. Combines and compresses the sequence and quality data contained in the two C strings. If current Sequence object is compressed, please refer to the postconditions for the setBase(char,char,Pos_t) operation. All newline characters will be discarded, but the newlines must be in the same location in both the sequence and quality files.
References AMOS_THROW_ARGUMENT, AMOS::Sequence_t::isCompressed(), AMOS::Sequence_t::length_m, AMOS::NL_CHAR, AMOS::Pos_t, AMOS::Sequence_t::qual_m, AMOS::SafeRealloc(), AMOS::Sequence_t::seq_m, AMOS::Sequence_t::setBase(), and AMOS::Size_t. Referenced by AMOS::Sequence_t::readMessage(). |
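The newline requirement for both setSequence overloads can be sketched as a paired scan over the two inputs. `stripPairedNewlines` is a hypothetical helper; treating a length or newline mismatch as an error thrown via std::invalid_argument is an assumption, since this page only shows that AMOS_THROW_ARGUMENT is referenced, not the exact conditions under which it fires.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Removes newlines from a sequence string and its quality string in lockstep.
// ASSUMPTION: inputs of different length, or with newlines at different
// positions, are rejected.
std::pair<std::string, std::string> stripPairedNewlines(const std::string& seq,
                                                        const std::string& qual) {
    if (seq.size() != qual.size()) {
        throw std::invalid_argument("sequence and quality lengths differ");
    }
    std::string cleanSeq;
    std::string cleanQual;
    for (std::size_t i = 0; i < seq.size(); ++i) {
        const bool seqNewline = (seq[i] == '\n');
        const bool qualNewline = (qual[i] == '\n');
        if (seqNewline != qualNewline) {
            throw std::invalid_argument("newline positions differ");
        }
        if (!seqNewline) {
            cleanSeq.push_back(seq[i]);
            cleanQual.push_back(qual[i]);
        }
    }
    return {cleanSeq, cleanQual};
}
```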
|
Set the status value.
References AMOS::Status_t. Referenced by AMOS::Universal_t::readMessage(). |
|
Uncompresses a byte into a sequence and quality char.
References AMOS::Sequence_t::ADENINE_BITS, AMOS::Sequence_t::CYTOSINE_BITS, AMOS::Sequence_t::GUANINE_BITS, AMOS::Qual2Char(), AMOS::Sequence_t::QUAL_BITS, AMOS::Sequence_t::SEQ_BITS, and AMOS::Sequence_t::THYMINE_BITS. |
|
Translates an ungapped position to a gapped position. This method requires linear time.
References AMOS::Pos_t, and AMOS::Size_t. |
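The reverse translation can be sketched the same way, again on a plain string rather than a Contig_t. `ungappedToGapped` is a hypothetical function that assumes 0-based positions, treats non-alphabetic characters as gaps, and throws when the requested ungapped position does not exist; how the real method handles out-of-range input is not stated on this page.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for ungap2gap, ASSUMING 0-based positions.
// Walks the gapped string and returns the index of the base whose ungapped
// index equals ungappedPos.
std::size_t ungappedToGapped(const std::string& gapped, std::size_t ungappedPos) {
    std::size_t seen = 0;  // number of real bases passed so far
    for (std::size_t i = 0; i < gapped.size(); ++i) {
        if (std::isalpha(static_cast<unsigned char>(gapped[i]))) {
            if (seen == ungappedPos) {
                return i;
            }
            ++seen;
        }
    }
    throw std::out_of_range("ungapped position is past the end of the consensus");
}
```

For "AC--GT", ungapped position 2 maps to gapped position 4 in this sketch, skipping the two gap columns.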
|
Writes data to a Message object. Writes the data contained in a Messagable object to a Message object. All previous data in the Message will be cleared or overwritten.
Reimplemented from AMOS::Sequence_t. Definition at line 250 of file Contig_AMOS.cc. References clear(), AMOS::Pos_t, and AMOS::Sequence_t::writeMessage(). |
|
Write selected class members to a biserial record. Writes the fixed and variable length streams to a biserial record. Used in generating a biserial IBankable object, and needed to commit objects to a bank. Should not write the flags, EID, or IID of the object because the bank will handle the storage of these fields on its own.
Reimplemented from AMOS::Sequence_t. Definition at line 276 of file Contig_AMOS.cc. References AMOS::Pos_t, AMOS::Size_t, AMOS::writeLE(), and AMOS::Sequence_t::writeRecord(). |
|
Write a UMD contig message to an output stream. Writes a University of Maryland style contig message to the output stream. Will throw an exception if there was an error trying to write to the output stream.
References AMOS_THROW_IO, and AMOS::IBankable_t::getEID(). |
|
'A' bit Definition at line 47 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::compress(), and AMOS::Sequence_t::uncompress(). |
|
compressed sequence flag Definition at line 46 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::clear(), AMOS::Sequence_t::compress(), AMOS::Sequence_t::isCompressed(), and AMOS::Sequence_t::uncompress(). |
|
'C' bit Definition at line 48 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::compress(), and AMOS::Sequence_t::uncompress(). |
|
external ID (anything you want sans newlines) Definition at line 66 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::getEID(), and AMOS::IBankable_t::setEID(). |
|
bank flags, derived classes may use "nibble" Definition at line 68 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::isFlagA(), AMOS::IBankable_t::isFlagB(), AMOS::IBankable_t::isModified(), AMOS::IBankable_t::isRemoved(), AMOS::IBankable_t::setFlagA(), and AMOS::IBankable_t::setFlagB(). |
|
'G' bit Definition at line 49 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::compress(), and AMOS::Sequence_t::uncompress(). |
|
internal ID (integer AMOS identifier) Definition at line 64 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::getIID(), AMOS::IBankable_t::IBankable_t(), and AMOS::IBankable_t::setIID(). |
|
length of the sequence and quality data Definition at line 43 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::clear(), AMOS::Sequence_t::compress(), AMOS::Sequence_t::getBase(), AMOS::Sequence_t::getLength(), AMOS::Sequence_t::getQualString(), AMOS::Sequence_t::getSeqString(), AMOS::Sequence_t::operator=(), AMOS::Sequence_t::readRecord(), AMOS::Sequence_t::Sequence_t(), AMOS::Sequence_t::setBase(), AMOS::Sequence_t::setSequence(), AMOS::Sequence_t::uncompress(), AMOS::Sequence_t::writeMessage(), and AMOS::Sequence_t::writeRecord(). |
|
The NCode type identifier for this object.
Reimplemented from AMOS::Sequence_t. Definition at line 19 of file Contig_AMOS.cc. |
|
quality bit mask Definition at line 52 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::uncompress(). |
|
uncompressed qual data Definition at line 42 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::clear(), AMOS::Sequence_t::compress(), AMOS::Sequence_t::getBase(), AMOS::Sequence_t::operator=(), AMOS::Sequence_t::readRecord(), AMOS::Sequence_t::Sequence_t(), AMOS::Sequence_t::setBase(), AMOS::Sequence_t::setSequence(), AMOS::Sequence_t::uncompress(), AMOS::Sequence_t::writeRecord(), and AMOS::Sequence_t::~Sequence_t(). |
|
sequence bit mask Definition at line 51 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::compress(), and AMOS::Sequence_t::uncompress(). |
|
compressed seq and qual data or uncompressed seq Definition at line 41 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::clear(), AMOS::Sequence_t::compress(), AMOS::Sequence_t::getBase(), AMOS::Sequence_t::operator=(), AMOS::Sequence_t::readRecord(), AMOS::Sequence_t::Sequence_t(), AMOS::Sequence_t::setBase(), AMOS::Sequence_t::setSequence(), AMOS::Sequence_t::uncompress(), AMOS::Sequence_t::writeRecord(), and AMOS::Sequence_t::~Sequence_t(). |
|
'T' bit Definition at line 50 of file Sequence_AMOS.hh. Referenced by AMOS::Sequence_t::compress(), and AMOS::Sequence_t::uncompress(). |