Message Types

From AMOS WIKI
Revision as of 02:13, 2 November 2009 by Mcschatz (Talk | contribs)

AMOS 3-code message types

v1.3.0

NOTES:

  • See message_grammar.rtf for the message file format definition.
  • All fields are optional, but some programs might not like missing fields (e.g. a seq field without a qlt field). Empty fields are not allowed. If there is no data for a given field, omit it from the message.
  • Acceptable field data is represented by Perl regular expressions. All regular expressions will be contained in parens () or brackets []. If uncontained, interpret characters as literal.
  • Field or message references are contained in <>.
  • Strict field ordering is not required. The ordering of fields in this definition is arbitrary.
  • Message inheritance is noted in C++ style. Fields inherited from a parent message will be listed but not described.
  • Ranges are specified as a pair of positions [x,y) where x is inclusive and y is exclusive. Thus, the range [4,6) would represent the 2 symbols at positions 4 and 5. Sequence positions are also indexed by this gap coordinate system, which essentially translates to a 0-based indexing scheme, e.g. the range [2,5) for the list 0,1,2,3,4,5,6 would define the sublist 2,3,4. Reversed ranges are also allowed, for example (5,2] would define the sublist 4,3,2.
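
The range convention can be illustrated with a short Python sketch. The helper name `select_range` is hypothetical and not part of AMOS; it only reproduces the examples given in the note above, treating a reversed range as the same positions read backwards.

```python
def select_range(items, begin, end):
    """Return the items covered by the gap-coordinate range begin,end.

    A forward range [begin,end) maps directly onto Python slicing.
    A reversed range (begin > end) covers the same positions, read backwards.
    """
    if begin <= end:
        return items[begin:end]
    # Reversed range: take the forward slice [end,begin) and flip it.
    return items[end:begin][::-1]


symbols = [0, 1, 2, 3, 4, 5, 6]
print(select_range(symbols, 2, 5))       # [2, 3, 4]
print(select_range(symbols, 5, 2))       # [4, 3, 2]
print(len(select_range(symbols, 4, 6)))  # 2
```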


TYPES:

Universal_t : IBankable_t, IMessagable_t

{UNV
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
}
  • act - Action. [A]dd, [D]elete, [R]eplace message. If absent, default action will be addition.
  • iid – Internal (AMOS) ID. This integer ID must be unique among all objects of the same type. This is the ID used for all object links and thus is mandatory if other objects are to link to this one.
  • eid - External ID. This string ID must be unique among all objects of the same type. The ID may not contain any newlines, but may be any length.
  • com - Free-form comment field.
  • flg – Two generic boolean flags (A/B), default to zero if unspecified.
  • sts – Object status character.
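
As a rough illustration of how such a message might be read, here is a minimal Python sketch. It assumes a file layout inferred from the regular expressions above: the message opens with `{` plus the 3-letter code, each field sits on its own line as `tag:value`, a multi-line field such as com starts with a bare `tag:` and ends with a line holding only `.`, and `}` closes the message. message_grammar.rtf remains the authoritative definition. The function name `parse_message` is hypothetical, and nested messages are not handled.

```python
def parse_message(text):
    """Parse one flat (non-nested) message into (code, fields)."""
    lines = text.strip().split("\n")
    if not lines[0].startswith("{") or lines[-1].strip() != "}":
        raise ValueError("message must start with '{XXX' and end with '}'")
    code = lines[0][1:].strip()
    fields = {}
    i = 1
    while i < len(lines) - 1:
        tag, sep, value = lines[i].strip().partition(":")
        if not sep:
            raise ValueError(f"expected 'tag:value', got {lines[i]!r}")
        i += 1
        if value == "":
            # Empty fields are not allowed, so a bare 'tag:' is taken to open
            # a multi-line field that runs until a line holding only '.'.
            body = []
            while i < len(lines) - 1 and lines[i].strip() != ".":
                body.append(lines[i])
                i += 1
            if i >= len(lines) - 1:
                raise ValueError(f"unterminated multi-line field {tag!r}")
            i += 1  # skip the '.' terminator
            value = "\n".join(body)
        fields[tag] = value
    return code, fields


sample = """{UNV
iid:7
eid:example object
com:
first comment line
second comment line
.
flg:01
}"""

code, fields = parse_message(sample)
# A missing act field means the default action, addition.
print(code, fields["iid"], fields.get("act", "A"))  # UNV 7 A
```

All values are kept as strings here; converting iid to an integer or splitting flg into its two flags is left to the caller.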


Contig_t : Sequence_t

{CTG
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 seq:(\n(.*\n)*).
 qlt:(\n(.*\n)*).
 <TLE message>*
}
  • <TLE message> - Tiling of underlying reads.


ContigEdge_t : ContigLink_t, Edge_t

{CTE
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+)
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
 lnk:(\n(<iid>\n)*).
}
  • obj – Removed. All nodes are Contig_t.


ContigLink_t : Link_t

{CTL
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+)
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
}
  • obj – Removed. All nodes are Contig_t.


Distribution_t : IMessagable_t

{DST
 mea:(\d+)
 std:(\d+)
}
  • mea - Mean.
  • std - Standard deviation.


Edge_t : Link_t

{EDG
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+)
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
 lnk:(\n(<iid>\n)*).
}
  • lnk - List of bundled links, referenced by their IIDs.


Feature_t : Universal_t

{FEA
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 clr:(\d+,\d+)
 typ:[RUJCOP.]
 src:<iid>,<message type>
} 
  • clr – Range/position of the feature.
  • typ – Feature type. [R]epeat, [U]nitig, [J]oin, [C]overage, [O]RF, [P]olymorphism.
  • src - Source of the feature, e.g. a contig, referenced by its IID and type.


Fragment_t : Universal_t

{FRG
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 lib:<iid>
 rds:<iid>,<iid>
 sze:(\d+)
 typ:[XBITW]
 src:<iid>,<message type>
} 
  • lib - Parent library, referenced by its IID.
  • rds – The paired sequencing reads, referenced by their IIDs.
  • sze - Size of the fragment, if known.
  • typ - Type of fragment. [X]Other, [B]AC, [I]nsert, [T]ransposon, [W]alk.
  • src - Source of this piece of DNA, e.g. a BAC fragment, referenced by its IID and type.


Group_t : Universal_t

{GRP
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 mbr:(\n(<iid>\n)*).
 obj:<message type>
}
  • mbr - List of group members, referenced by IID.
  • obj - The object type of the members.


IDMap_t : IMessagable_t

{MAP
 sze:(\d+)
 map:(\n(<bid>\t<iid>\t<eid>\n)*).
 obj:<message type>
}

  • sze - Number of ID triples in the map.
  • map - List of ID triples, BID <-> IID <-> EID.
  • obj - The object type of the ID triples.
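
The map field can be unpacked along these lines. This is a sketch only: it assumes the body of the field has already been extracted as text, that BID and IID are integers (the definition above only states this for IID), and the name `parse_id_map` is hypothetical.

```python
def parse_id_map(map_field, expected_size=None):
    """Split a MAP 'map' body into (bid, iid, eid) triples.

    Each line is taken to be '<bid>\\t<iid>\\t<eid>'. If expected_size is
    given (the sze field), the number of triples is checked against it.
    """
    triples = []
    for line in map_field.splitlines():
        if not line:
            continue
        # Split on the first two tabs only; the rest of the line is the EID.
        bid, iid, eid = line.split("\t", 2)
        triples.append((int(bid), int(iid), eid))
    if expected_size is not None and len(triples) != expected_size:
        raise ValueError(f"sze is {expected_size} but map has {len(triples)} triples")
    return triples


body = "1\t10\tread_a\n2\t11\tread_b\n"
triples = parse_id_map(body, expected_size=2)
eid_by_iid = {iid: eid for _, iid, eid in triples}
print(eid_by_iid[11])  # read_b
```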


Index_t : Universal_t

{IDX
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 sze:(\d+)
 map:(\n(<iid>\t<iid>\n)*).
 obj:<message type>,<message type>
} 
  • sze - Number of ID pairs in the index.
  • map - List of ID pairs, IID -> IID.
  • obj - The object type of the ID pairs.


Kmer_t : Universal_t

{KMR
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 cnt:(\d+)
 seq:([ACGT]+)
 rds:(\n(<iid>\n)*).
}
  • cnt - Number of occurrences of this Kmer.
  • seq - Sequence of this Kmer.
  • rds - List of reads that contain this Kmer, referenced by their IIDs.


Layout_t : Universal_t

{LAY
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 <TLE message>* 
}
  • <TLE message> - Tiling of underlying reads.


Library_t : Universal_t

{LIB
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 <DST message>
}
  • <DST message> - Library size distribution stats.


Link_t : Universal_t

{LNK
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+) 
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
}
  • nds – The linked nodes, referenced by their IIDs.
  • obj – The object type of the nodes.
  • adj - Node adjacency. [N]ormal, [A]nti-normal, [I]nnie, [O]utie, which are EB, BE, EE and BB adjacencies respectively.
  • std - Standard deviation of the link size.
  • sze - Size of link.
  • typ - Type of link. [X]Other, [M]atepair, [O]verlap, [P]hysical, [A]lignment, [S]ynteny.
  • src - Source of the link, e.g. fragment information, referenced by its IID and type.
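
Because the acceptable field data is given as Perl regular expressions, the link fields can be checked mechanically. The sketch below uses Python's `re` module, which accepts these simple patterns unchanged. Two assumptions are made: an `<iid>` reference is matched with the iid pattern `\d+`, and a `<message type>` is a 3-letter uppercase code. The names `LINK_PATTERNS`, `ADJACENCY_ENDS` and `invalid_link_fields` are hypothetical.

```python
import re

# Patterns copied from the LNK definition; <iid> and <message type> are
# expanded under the assumptions stated in the text above.
LINK_PATTERNS = {
    "nds": r"\d+,\d+",
    "obj": r"[A-Z]{3}",
    "adj": r"[NAOI]",
    "std": r"\d+",
    "sze": r"-?\d+",
    "typ": r"[XMOPAS]",
    "src": r"\d+,[A-Z]{3}",
}

# Adjacency codes and the end pairs listed in the adj description.
ADJACENCY_ENDS = {"N": "EB", "A": "BE", "I": "EE", "O": "BB"}


def invalid_link_fields(fields):
    """Return the sorted tags whose values do not fully match their pattern."""
    return sorted(
        tag
        for tag, value in fields.items()
        if tag in LINK_PATTERNS and not re.fullmatch(LINK_PATTERNS[tag], value)
    )


good = {"nds": "1,2", "adj": "N", "std": "12", "sze": "-40", "typ": "M", "src": "3,FRG"}
bad = {"adj": "Z", "sze": "4.5", "typ": "M"}
print(invalid_link_fields(good))  # []
print(invalid_link_fields(bad))   # ['adj', 'sze']
print(ADJACENCY_ENDS[good["adj"]])  # EB
```

Fields inherited from Universal_t are ignored by this check; only the tags listed in the table are validated.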


Overlap_t : Universal_t

{OVL
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 rds:<iid>,<iid>
 adj:[NAIO]
 ahg:(-?\d+)
 bhg:(-?\d+)
 scr:(\d+)
 flg:([01]{3})
}
  • rds – The overlapping reads, referenced by their IIDs.
  • adj - Read adjacency. [N]ormal, [A]nti-normal, [I]nnie, [O]utie, which are EB, BE, EE and BB overlaps respectively.
  • ahg - Ahang. Length of the non-overlapping portion of the first read.
  • bhg - Bhang. Length of the non-overlapping portion of the second read.
  • scr – An unsigned integer overlap score.
  • flg – Universal_t flags plus one additional flag (A/B/C), default to zero if unspecified.


Read_t : Sequence_t

{RED
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 frg:<iid>
 typ:[XECBW]
 clr:(\d+,\d+)
 vcr:(\d+,\d+)
 qcr:(\d+,\d+)
 pos:(-?\d+)
 bcp:(\n(\d+\n)*).
}
  • frg - The parent fragment, referenced by its IID.
  • typ - Type of read. [X]Other, [E]nd, [C]ontig, [B]AC, [W]alk.
  • clr - The acting clear range.
  • vcr - Vector clear range.
  • qcr - Quality clear range.
  • pos - Approximate position on the parent fragment. Positive if counting from the left and oriented forward, negative if counting from the right and oriented in reverse. 0 if unknown.
  • bcp – Absolute base call positions.
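
A clear range can be applied to a read's sequence roughly as follows. This sketch assumes the seq field body is the base calls split over one or more lines, so line breaks are removed before slicing. It handles forward ranges only and rejects reversed ones rather than guess their meaning for reads. The name `clear_sequence` is hypothetical.

```python
def clear_sequence(seq_field, clr_field):
    """Return the bases inside a forward clear range such as '2,8'."""
    # Join the lines of the multi-line seq field into one string of bases.
    bases = "".join(seq_field.split())
    begin, end = (int(part) for part in clr_field.split(","))
    if begin > end:
        raise ValueError("reversed clear ranges are not handled in this sketch")
    # Gap coordinates [begin,end) coincide with Python slice indices.
    return bases[begin:end]


print(clear_sequence("ACGTAC\nGTTT\n", "2,8"))  # GTACGT
```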


Scaffold_t : Universal_t

{SCF
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 edg:(\n(<iid>\n)*).
 <TLE message>*
}
  • edg - List of contig edges, referenced by their IIDs.
  • <TLE message> - Tiling of the underlying contigs.


ScaffoldEdge_t : ScaffoldLink_t, Edge_t

{SCE
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+)
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
 lnk:(\n(<iid>\n)*).
}
  • obj – Removed. All nodes are Scaffold_t.


ScaffoldLink_t : Link_t

{SCL
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 nds:<iid>,<iid>
 obj:<message type>
 adj:[NAOI]
 std:(\d+)
 sze:(-?\d+)
 typ:[XMOPAS]
 src:<iid>,<message type>
}
  • obj – Removed. All nodes are Scaffold_t.


Sequence_t : Universal_t

{SEQ
 act:[ADR]
 iid:(\d+)
 eid:(.+)
 com:(\n(.*\n)*).
 flg:([01]{2})
 sts:[.]
 seq:(\n(.*\n)*).
 qlt:(\n(.*\n)*).
}
  • seq - Sequence base call information.
  • qlt - Sequence quality information.


Tile_t : IMessagable_t

{TLE
 src:<iid>
 off:(-?\d+)
 clr:(\d+,\d+)
 gap:(\n(-?\d+\n)*).
} 
  • src - Tiled sequence, referenced by its IID. Type of sequence is implied by how this record is nested, e.g. a TLE in a CTG represents a RED, while a TLE in a SCF represents a CTG.
  • off - Offset of the tile from the beginning of the reference.
  • clr - Usable range of the tile, relative to the tile’s coordinates.
  • gap - List of delta encoded gap positions.
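
To show how a TLE record nests inside its parent, here is a small Python sketch that writes a contig containing one tile. The layout is an assumption inferred from the definitions above (one field per line, multi-line fields closed by a line holding only `.`, nested messages written inside the parent before its closing brace); message_grammar.rtf is the authoritative source. The names `format_message` and `MULTILINE_TAGS` are hypothetical, and the IDs and sequence are made-up sample values.

```python
# Tags written in multi-line form by this sketch.
MULTILINE_TAGS = {"com", "seq", "qlt", "gap"}


def format_message(code, fields, children=()):
    """Render a message; children are already-rendered nested messages."""
    lines = ["{" + code]
    for tag, value in fields.items():
        if tag in MULTILINE_TAGS:
            # Multi-line field: bare tag, the data lines, then a '.' line.
            lines.append(f"{tag}:")
            lines.extend(value.split("\n"))
            lines.append(".")
        else:
            lines.append(f"{tag}:{value}")
    for child in children:
        lines.extend(child.split("\n"))
    lines.append("}")
    return "\n".join(lines)


# A tile with src:5 nested in a CTG refers to the read (RED) with IID 5.
tile = format_message("TLE", {"src": "5", "off": "0", "clr": "0,4"})
contig = format_message("CTG", {"iid": "1", "seq": "ACGT", "qlt": "DDDD"}, [tile])
print(contig)
```

Fields that have no data are simply left out of the `fields` mapping, since empty fields are not allowed.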