#include <Sequence_AMOS.hh>
Public Member Functions
Sequence_t ()
    Constructs an empty Sequence_t object.
Sequence_t (const Sequence_t &source)
    Copy constructor.
virtual ~Sequence_t ()
    Destroys a Sequence_t object.
virtual void clear ()
    Clears all object data, reinitializes the object.
void compress ()
    Compress the internal representation of this sequence.
std::pair< char, char > getBase (Pos_t index) const
    Get a single base and its quality score.
Size_t getLength () const
    Get the length of the sequence.
virtual NCode_t getNCode () const
    Get the AMOS NCode type identifier.
std::string getQualString () const
    Get the quality score string.
std::string getQualString (Range_t range) const
    Get a quality score substring.
std::string getSeqString () const
    Get the sequence base string.
std::string getSeqString (Range_t range) const
    Get a sequence base substring.
bool isCompressed () const
    Checks if the sequence data is compressed.
virtual void readMessage (const Message_t &msg)
    Reads in data from a Message object.
void setBase (char seqchar, char qualchar, Pos_t index)
    Set a sequence base and its quality score.
void setSequence (const char *seq, const char *qual)
    Set the entire sequence.
void setSequence (const std::string &seq, const std::string &qual)
    Set the entire sequence.
void uncompress ()
    Uncompress the internal representation of this sequence.
Sequence_t & operator= (const Sequence_t &source)
    Assignment (copy) operator.
virtual void writeMessage (Message_t &msg) const
    Writes data to a Message object.
const std::string & getComment () const
    Get the comment string.
const Status_t getStatus () const
    Get the status value.
void setComment (const std::string &comment)
    Set the comment string.
void setStatus (Status_t status)
    Set the status value.
const std::string & getEID () const
    Get the external ID.
ID_t getIID () const
    Get the internal ID.
bool isRemoved () const
    Check if the object is waiting to be removed from the bank.
bool isModified () const
    Check if the object has been modified.
bool isFlagA () const
    Check the value of flag A.
bool isFlagB () const
    Check the value of flag B.
void setEID (const std::string &eid)
    Set the external ID.
void setFlagA (bool flag)
    Set flag A.
void setFlagB (bool flag)
    Set flag B.
void setIID (ID_t iid)
    Set the internal ID.

Static Public Attributes
const NCode_t NCODE = M_SEQUENCE
    The NCode type identifier for this object.

Protected Member Functions
virtual void readRecord (std::istream &fix, std::istream &var)
    Read selected class members from a biserial record.
virtual void writeRecord (std::ostream &fix, std::ostream &var) const
    Write selected class members to a biserial record.

Static Protected Member Functions
uint8_t compress (char seqchar, char qualchar)
    Compresses a sequence char and quality char into a single byte.
std::pair< char, char > uncompress (uint8_t byte)
    Uncompresses a byte into a sequence and quality char.

Protected Attributes
uint8_t * seq_m
    compressed seq and qual data or uncompressed seq
uint8_t * qual_m
    uncompressed qual data
Size_t length_m
    length of the sequence and quality data
ID_t iid_m
    internal ID (integer AMOS identifier)
std::string eid_m
    external ID (anything you want sans newlines)
BankFlags_t flags_m
    bank flags, derived classes may use "nibble"

Static Protected Attributes
const uint8_t COMPRESS_BIT = 0x1
    compressed sequence flag
const uint8_t ADENINE_BITS = 0x0
    'A' bit
const uint8_t CYTOSINE_BITS = 0x40
    'C' bit
const uint8_t GUANINE_BITS = 0x80
    'G' bit
const uint8_t THYMINE_BITS = 0xC0
    'T' bit
const uint8_t SEQ_BITS = 0xC0
    sequence bit mask
const uint8_t QUAL_BITS = 0x3F
    quality bit mask

Stores both sequence and quality score data in a space-efficient manner (when compressed). Can represent any type of sequence data, but must always be used with both sequence AND quality data; if only one of the two is required, use a simple character array instead. Can be used in uncompressed mode, where a base and its quality occupy 2 bytes, or in compressed mode, where a base and its quality are packed into a single byte. In uncompressed mode, any characters are valid for bases and quality scores. In compressed mode, acceptable sequence bases are A, C, G, T and N (case insensitive) and acceptable quality scores lie between MIN_QUALITY and MAX_QUALITY.
Definition at line 36 of file Sequence_AMOS.hh.
|
Constructs an empty Sequence_t object. Sets all members to 0 or NULL. Definition at line 134 of file Sequence_AMOS.hh. |
|
Copy constructor. Definition at line 144 of file Sequence_AMOS.hh. |
|
Destroys a Sequence_t object. Frees the memory used for storing the sequence and quality data. Definition at line 156 of file Sequence_AMOS.hh. |
|
Clears all object data, reinitializes the object. All data will be cleared, but object compression status will remain unchanged. Use the compress/uncompress members to change this info. Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 23 of file Sequence_AMOS.cc. References AMOS::Universal_t::clear(), compress(), COMPRESS_BIT, length_m, qual_m, and seq_m. Referenced by AMOS::Read_t::clear(), AMOS::Contig_t::clear(), readMessage(), and writeMessage(). |
|
Compress the internal representation of this sequence. After compression, this object will continue to compress incoming data until the uncompress method is called. Compression packs both a base and a quality score into a single byte, effectively halving the memory requirements for each object. The sequence should contain only A, C, G, T and N bases, and quality scores in the range [MIN_QUALITY, MAX_QUALITY]; if either condition is not met, the offending information will be lost when the data is compressed (see postconditions).
References COMPRESS_BIT, isCompressed(), length_m, AMOS::Pos_t, qual_m, and seq_m. Referenced by setBase(). |
|
Compresses a sequence char and quality char into a single byte.
References ADENINE_BITS, AMOS::Char2Qual(), CYTOSINE_BITS, GUANINE_BITS, SEQ_BITS, and THYMINE_BITS. Referenced by clear(). |
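The single-byte packing can be illustrated with the bit constants listed for this class: the top two bits select the base and the low six bits carry the quality. The following is a minimal standalone sketch, not the AMOS implementation. The helper name `packBase`, the constant `kMinQuality`, the rule that a quality character is stored as its offset from `'0'`, and the choice to map `N` or any unrepresentable input to a zero byte are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>

// Bit constants as listed in the Sequence_t member summary.
const uint8_t ADENINE_BITS  = 0x00;
const uint8_t CYTOSINE_BITS = 0x40;
const uint8_t GUANINE_BITS  = 0x80;
const uint8_t THYMINE_BITS  = 0xC0;
const uint8_t QUAL_BITS     = 0x3F;  // low six bits hold the quality

// ASSUMPTION: quality characters are stored as offsets from '0'.
const char kMinQuality = '0';

// Hypothetical helper: pack one base and one quality char into a byte.
uint8_t packBase(char seqchar, char qualchar)
{
  int qual = qualchar - kMinQuality;
  if (qual < 0 || qual > QUAL_BITS)
    return 0;  // ASSUMPTION: an out-of-range quality collapses to zero

  uint8_t q = static_cast<uint8_t>(qual);
  switch (std::toupper(static_cast<unsigned char>(seqchar)))
  {
    case 'A': return static_cast<uint8_t>(q | ADENINE_BITS);
    case 'C': return static_cast<uint8_t>(q | CYTOSINE_BITS);
    case 'G': return static_cast<uint8_t>(q | GUANINE_BITS);
    case 'T': return static_cast<uint8_t>(q | THYMINE_BITS);
    default:  return 0;  // ASSUMPTION: 'N' and other chars lose their quality
  }
}
```

Because only two bits are available for the base, a fifth symbol such as `N` cannot keep an independent quality value, which is one way to read the warning that out-of-range information is lost on compression.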
|
Get a single base and its quality score. Retrieves and uncompresses the sequence base for the requested index.
References AMOS_THROW_ARGUMENT, isCompressed(), length_m, AMOS::Pos_t, qual_m, seq_m, and uncompress(). Referenced by getQualString(), getSeqString(), AMOS::Contig_t::getUngappedQualString(), AMOS::Contig_t::getUngappedSeqString(), and writeMessage(). |
|
Get the comment string.
Referenced by AMOS::operator<<(). |
|
Get the external ID.
References AMOS::IBankable_t::eid_m. Referenced by AMOS::Contig_t::writeUMD(). |
|
Get the internal ID.
References AMOS::ID_t, and AMOS::IBankable_t::iid_m. Referenced by AMOS::Index_t::buildContigFeature(), AMOS::Index_t::buildContigScaffold(), AMOS::Index_t::buildReadContig(), AMOS::Index_t::buildReadLibrary(), AMOS::Index_t::buildScaffoldFeature(), and AMOS::operator<<(). |
|
Get the length of the sequence.
References length_m, and AMOS::Size_t. Referenced by AMOS::Contig_t::getSpan(), AMOS::Contig_t::getUngappedQualString(), AMOS::Contig_t::getUngappedSeqString(), and AMOS::Read_t::writeMessage(). |
|
Get the AMOS NCode type identifier.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 227 of file Sequence_AMOS.hh. References AMOS::NCode_t. |
|
Get a quality score substring. Returns a subrange of quality scores [begin, end) or (end, begin]. A reversed range returns the quality scores in reverse order.
References AMOS_THROW_ARGUMENT, getBase(), length_m, AMOS::NULL_CHAR, AMOS::Pos_t, and AMOS::Reverse(). |
|
Get the quality score string.
References length_m. Referenced by AMOS::operator<<(). |
|
Get a sequence base substring. Returns a subrange of sequence bases [begin, end) or (end, begin]. A reversed range returns the reverse complement of the sequence bases.
References AMOS_THROW_ARGUMENT, getBase(), length_m, AMOS::NULL_CHAR, AMOS::Pos_t, and AMOS::ReverseComplement(). |
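The forward and reversed range behaviour can be sketched on a plain `std::string`. This is an illustrative sketch only: `subSequence` and `complementBase` are hypothetical names, and it assumes the range is read as the half-open interval between the smaller and larger coordinate, with the result reverse-complemented when `begin` is greater than `end`. Throwing `std::out_of_range` stands in for whatever argument check the library performs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: complement an upper-case base, leave others unchanged.
char complementBase(char c)
{
  switch (c)
  {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return c;  // e.g. 'N' stays 'N'
  }
}

// Hypothetical helper: extract [begin, end), or the reverse complement
// of the same span when begin > end.
std::string subSequence(const std::string& seq, long begin, long end)
{
  long lo = std::min(begin, end);
  long hi = std::max(begin, end);
  if (lo < 0 || hi > static_cast<long>(seq.size()))
    throw std::out_of_range("range lies outside the sequence");

  std::string out = seq.substr(static_cast<std::size_t>(lo),
                               static_cast<std::size_t>(hi - lo));
  if (begin > end)
  {
    std::reverse(out.begin(), out.end());
    std::transform(out.begin(), out.end(), out.begin(), complementBase);
  }
  return out;
}
```

For a quality string the same span logic would apply, with only the reversal step and no complementing.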
|
Get the sequence base string.
References length_m. Referenced by AMOS::operator<<(). |
|
Get the status value.
References AMOS::Status_t. |
|
Checks if the sequence data is compressed. Returns true if the Sequence is currently operating in compressed mode, or false if under normal operation.
References COMPRESS_BIT. Referenced by compress(), getBase(), operator=(), readRecord(), setBase(), setSequence(), uncompress(), and writeRecord(). |
|
Check the value of flag A.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::writeMessage(), and AMOS::Overlap_t::writeMessage(). |
|
Check the value of flag B.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::writeMessage(), and AMOS::Overlap_t::writeMessage(). |
|
Check if the object has been modified.
References AMOS::IBankable_t::flags_m. |
|
Check if the object is waiting to be removed from the bank.
References AMOS::IBankable_t::flags_m. |
|
Assignment (copy) operator. Efficiently copies the compressed data from the other Sequence_t.
References isCompressed(), length_m, qual_m, AMOS::SafeRealloc(), and seq_m. |
|
Reads in data from a Message object. Reads the data contained in a Message object and stores it in the Messagable object. Does not complain if the incoming message is of the wrong type; it only extracts the fields it recognizes. All previous data in the Messagable object will be cleared or overwritten.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 104 of file Sequence_AMOS.cc. References AMOS_THROW_ARGUMENT, clear(), AMOS::F_QUALITY, AMOS::F_SEQUENCE, AMOS::Universal_t::readMessage(), and setSequence(). Referenced by AMOS::Read_t::readMessage(), and AMOS::Contig_t::readMessage(). |
|
Read selected class members from a biserial record. Reads the fixed and variable length streams from a biserial record and initializes the class members to the values stored within. Used in translating a biserial IBankable object, and needed to retrieve objects from a bank.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 127 of file Sequence_AMOS.cc. References isCompressed(), length_m, qual_m, AMOS::readLE(), AMOS::Universal_t::readRecord(), AMOS::SafeRealloc(), and seq_m. Referenced by AMOS::Read_t::readRecord(), and AMOS::Contig_t::readRecord(). |
|
Set a sequence base and its quality score. Any characters may be used for seqchar and qualchar unless dealing with a compressed sequence. If compressed, the sequence should contain only A, C, G, T and N bases, and quality scores in the range [MIN_QUALITY, MAX_QUALITY]; if either condition is not met, the offending information will be lost when the data is compressed (see postconditions).
References AMOS_THROW_ARGUMENT, compress(), isCompressed(), length_m, AMOS::Pos_t, qual_m, and seq_m. Referenced by setSequence(). |
|
Set the comment string.
|
Set the external ID. Will only use the characters up to but not including the first newline.
References AMOS::IBankable_t::eid_m, and AMOS::NL_CHAR. Referenced by AMOS::Contig_t::readUMD(). |
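The newline rule for external IDs amounts to keeping the prefix before the first newline. A minimal sketch of that rule, where `truncateAtNewline` is a hypothetical helper and not part of the library:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: keep the characters up to, but not including,
// the first newline. If there is no newline, find() returns npos and
// substr() returns the whole string.
std::string truncateAtNewline(const std::string& eid)
{
  return eid.substr(0, eid.find('\n'));
}
```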
|
Set flag A. Has no effect on the actual object in memory other than setting a flag. This is one of two user accessible flags to be used as needed, the other is flag B.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::readMessage(), and AMOS::Overlap_t::readMessage(). |
|
Set flag B. Has no effect on the actual object in memory other than setting a flag. This is one of two user accessible flags to be used as needed, the other is flag A.
References AMOS::IBankable_t::flags_m. Referenced by AMOS::Universal_t::readMessage(), and AMOS::Overlap_t::readMessage(). |
|
Set the internal ID.
References AMOS::ID_t, and AMOS::IBankable_t::iid_m. |
|
Set the entire sequence. Combines and compresses the sequence and quality data contained in the two STL strings. If the current Sequence object is compressed, please refer to the postconditions for the setBase(char,char,Pos_t) operation. All newline characters will be discarded, but the newlines must be in the same location in both the sequence and quality strings.
References AMOS_THROW_ARGUMENT, isCompressed(), length_m, AMOS::NL_CHAR, AMOS::Pos_t, qual_m, AMOS::SafeRealloc(), seq_m, setBase(), and AMOS::Size_t. |
|
Set the entire sequence. Combines and compresses the sequence and quality data contained in the two C strings. If the current Sequence object is compressed, please refer to the postconditions for the setBase(char,char,Pos_t) operation. All newline characters will be discarded, but the newlines must be in the same location in both the sequence and quality strings.
References AMOS_THROW_ARGUMENT, isCompressed(), length_m, AMOS::NL_CHAR, AMOS::Pos_t, qual_m, AMOS::SafeRealloc(), seq_m, setBase(), and AMOS::Size_t. Referenced by readMessage(). |
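The requirement that newlines sit at the same positions in both inputs can be sketched as a paired scan that drops aligned newlines and rejects misaligned ones. This is a sketch under assumptions: `stripPairedNewlines` is a hypothetical helper, and throwing `std::invalid_argument` on a length or alignment mismatch is an assumed stand-in for the library's own argument check.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: remove newlines that occur at the same index in
// both strings, returning the cleaned (sequence, quality) pair.
std::pair<std::string, std::string>
stripPairedNewlines(const std::string& seq, const std::string& qual)
{
  if (seq.size() != qual.size())
    throw std::invalid_argument("sequence and quality lengths differ");

  std::string s, q;
  for (std::size_t i = 0; i < seq.size(); ++i)
  {
    bool seqNewline  = (seq[i] == '\n');
    bool qualNewline = (qual[i] == '\n');
    if (seqNewline != qualNewline)
      throw std::invalid_argument("newlines are not aligned");
    if (!seqNewline)
    {
      s += seq[i];
      q += qual[i];
    }
  }
  return std::make_pair(s, q);
}
```

Requiring aligned newlines keeps base `i` paired with quality `i` after the newlines are removed.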
|
Set the status value.
References AMOS::Status_t. Referenced by AMOS::Universal_t::readMessage(). |
|
Uncompress the internal representation of this sequence. After uncompression, this object will not compress incoming data until the compress method is called once again. The uncompressed version uses two bytes to store a base and quality score, thus doubling the memory requirements over a compressed version.
References COMPRESS_BIT, isCompressed(), length_m, AMOS::Pos_t, qual_m, AMOS::SafeRealloc(), and seq_m. Referenced by getBase(). |
|
Uncompresses a byte into a sequence and quality char.
References ADENINE_BITS, CYTOSINE_BITS, GUANINE_BITS, AMOS::Qual2Char(), QUAL_BITS, SEQ_BITS, and THYMINE_BITS. |
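Unpacking reverses the byte layout given by the class constants: `SEQ_BITS` masks out the base and `QUAL_BITS` masks out the quality. The sketch below is standalone and illustrative, not the AMOS code. `unpackByte` and `kMinQuality` are hypothetical names, the quality is assumed to be an offset from `'0'`, and decoding a zero quality as `N` is an assumption about how the fifth symbol is represented.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Bit constants as listed in the Sequence_t member summary.
const uint8_t ADENINE_BITS  = 0x00;
const uint8_t CYTOSINE_BITS = 0x40;
const uint8_t GUANINE_BITS  = 0x80;
const uint8_t THYMINE_BITS  = 0xC0;
const uint8_t SEQ_BITS      = 0xC0;  // mask for the two base bits
const uint8_t QUAL_BITS     = 0x3F;  // mask for the six quality bits

// ASSUMPTION: quality characters are stored as offsets from '0'.
const char kMinQuality = '0';

// Hypothetical helper: split a packed byte into (base, quality char).
std::pair<char, char> unpackByte(uint8_t byte)
{
  uint8_t qual = static_cast<uint8_t>(byte & QUAL_BITS);

  char base = 'N';
  switch (byte & SEQ_BITS)
  {
    case ADENINE_BITS:  base = 'A'; break;
    case CYTOSINE_BITS: base = 'C'; break;
    case GUANINE_BITS:  base = 'G'; break;
    case THYMINE_BITS:  base = 'T'; break;
  }

  // ASSUMPTION: a zero quality marks a base that could not be stored.
  if (qual == 0)
    base = 'N';

  return std::make_pair(base, static_cast<char>(kMinQuality + qual));
}
```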
|
Writes data to a Message object. Writes the data contained in a Messagable object to a Message object. All previous data in the Message will be cleared or overwritten.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 234 of file Sequence_AMOS.cc. References CHARS_PER_LINE, clear(), AMOS::F_QUALITY, AMOS::F_SEQUENCE, getBase(), length_m, AMOS::NL_CHAR, AMOS::Pos_t, AMOS::Size_t, and AMOS::Universal_t::writeMessage(). Referenced by AMOS::Read_t::writeMessage(), and AMOS::Contig_t::writeMessage(). |
|
Write selected class members to a biserial record. Writes the fixed and variable length streams to a biserial record. Used in generating a biserial IBankable object, and needed to commit objects to a bank. Should not write the flags, EID, or IID of the object because the bank will handle the storage of these fields on its own.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 279 of file Sequence_AMOS.cc. References isCompressed(), length_m, qual_m, seq_m, AMOS::writeLE(), and AMOS::Universal_t::writeRecord(). Referenced by AMOS::Read_t::writeRecord(), and AMOS::Contig_t::writeRecord(). |
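The split between a fixed-length and a variable-length stream can be sketched with standard streams. This is only an illustration of the idea: `writeLE32`, `readLE32`, `writeRecordSketch` and `readRecordSketch` are hypothetical names, the 32-bit width of the length field is an assumption, and so is the choice to put the length in the fixed stream and the raw bytes in the variable stream. The signatures of AMOS::writeLE() and AMOS::readLE() are not shown in this page and are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: write a 32-bit value least significant byte first.
void writeLE32(std::ostream& out, uint32_t value)
{
  char bytes[4];
  for (int i = 0; i < 4; ++i)
    bytes[i] = static_cast<char>((value >> (8 * i)) & 0xFF);
  out.write(bytes, 4);
}

// Hypothetical helper: read a 32-bit little-endian value.
uint32_t readLE32(std::istream& in)
{
  char bytes[4] = {0, 0, 0, 0};
  in.read(bytes, 4);
  uint32_t value = 0;
  for (int i = 0; i < 4; ++i)
    value |= static_cast<uint32_t>(static_cast<unsigned char>(bytes[i])) << (8 * i);
  return value;
}

// ASSUMPTION: the length goes to the fixed stream, the data to the variable one.
void writeRecordSketch(std::ostream& fix, std::ostream& var, const std::string& data)
{
  writeLE32(fix, static_cast<uint32_t>(data.size()));
  var.write(data.data(), static_cast<std::streamsize>(data.size()));
}

// Reads back what writeRecordSketch wrote, using the stored length.
std::string readRecordSketch(std::istream& fix, std::istream& var)
{
  uint32_t length = readLE32(fix);
  std::string data(length, '\0');
  if (length > 0)
    var.read(&data[0], static_cast<std::streamsize>(length));
  return data;
}
```

Writing the length in a fixed byte order keeps the record readable regardless of the byte order of the machine that wrote it.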
|
'A' bit Definition at line 47 of file Sequence_AMOS.hh. Referenced by compress(), and uncompress(). |
|
compressed sequence flag Definition at line 46 of file Sequence_AMOS.hh. Referenced by clear(), compress(), isCompressed(), and uncompress(). |
|
'C' bit Definition at line 48 of file Sequence_AMOS.hh. Referenced by compress(), and uncompress(). |
|
external ID (anything you want sans newlines) Definition at line 66 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::getEID(), and AMOS::IBankable_t::setEID(). |
|
bank flags, derived classes may use "nibble" Definition at line 68 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::isFlagA(), AMOS::IBankable_t::isFlagB(), AMOS::IBankable_t::isModified(), AMOS::IBankable_t::isRemoved(), AMOS::IBankable_t::setFlagA(), and AMOS::IBankable_t::setFlagB(). |
|
'G' bit Definition at line 49 of file Sequence_AMOS.hh. Referenced by compress(), and uncompress(). |
|
internal ID (integer AMOS identifier) Definition at line 64 of file Bank_AMOS.hh. Referenced by AMOS::IBankable_t::clear(), AMOS::IBankable_t::getIID(), AMOS::IBankable_t::IBankable_t(), and AMOS::IBankable_t::setIID(). |
|
length of the sequence and quality data Definition at line 43 of file Sequence_AMOS.hh. Referenced by clear(), compress(), getBase(), getLength(), getQualString(), getSeqString(), operator=(), readRecord(), Sequence_t(), setBase(), setSequence(), uncompress(), writeMessage(), and writeRecord(). |
|
The NCode type identifier for this object.
Reimplemented from AMOS::Universal_t.
Reimplemented in AMOS::Contig_t, and AMOS::Read_t. Definition at line 19 of file Sequence_AMOS.cc. |
|
quality bit mask Definition at line 52 of file Sequence_AMOS.hh. Referenced by uncompress(). |
|
uncompressed qual data Definition at line 42 of file Sequence_AMOS.hh. Referenced by clear(), compress(), getBase(), operator=(), readRecord(), Sequence_t(), setBase(), setSequence(), uncompress(), writeRecord(), and ~Sequence_t(). |
|
sequence bit mask Definition at line 51 of file Sequence_AMOS.hh. Referenced by compress(), and uncompress(). |
|
compressed seq and qual data or uncompressed seq Definition at line 41 of file Sequence_AMOS.hh. Referenced by clear(), compress(), getBase(), operator=(), readRecord(), Sequence_t(), setBase(), setSequence(), uncompress(), writeRecord(), and ~Sequence_t(). |
|
'T' bit Definition at line 50 of file Sequence_AMOS.hh. Referenced by compress(), and uncompress(). |