Sleipnir
An orthology is a collection of sets, each containing zero or more orthologous genes from any organism.
#include <orthology.h>
Public Member Functions
bool Open (std::istream &istm)
Loads a new orthology file from the given stream.
void Save (std::ostream &ostm) const
Saves the current orthology to the given stream.
size_t GetClusters () const
Return the number of clusters in the orthology.
size_t GetGenes (size_t iCluster) const
Return the number of genes in the requested cluster.
CGene & GetGene (size_t iCluster, size_t iGene) const
Return the gene at the given index within the given cluster.
size_t GetGenomes () const
Return the number of genomes (organisms) in the orthology.
CGenome * GetGenome (size_t iOrganism) const
Return the genome at the requested organism index.
const std::string & GetOrganism (size_t iOrganism) const
Return the human-readable string identifier of the requested organism index.
size_t GetOrganism (const CGene &Gene) const
Return the index of the organism whose genome contains the given gene, or -1 (converted to size_t) if none exists.
An orthology is a collection of sets referred to as orthologous clusters. Each such cluster is a set of genes; these genes can be drawn from any number of different organisms. Genes in an orthologous cluster are assumed (by some external agency) to perform similar functions in different organisms.
For example, for organisms O1, O2, and O3, an orthology might consist of clusters [O1:G1, O3:G2], [O1:G1, O1:G2], and [O1:G3, O2:G1, O3:G1]. That is, any cluster might contain any subset of genes from any subset of the available organisms, and any gene can appear in multiple clusters.
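The example clusters above can be modeled in memory as a list of lists of (organism, gene) pairs. The following is only a minimal sketch: `OrgGene`, `Cluster`, `MakeExampleOrthology`, and `CountClustersContaining` are hypothetical names chosen for illustration and are not part of Sleipnir, which uses its own CGene and CGenome classes.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types; Sleipnir itself uses CGene and CGenome.
using OrgGene = std::pair<std::string, std::string>;  // (organism ID, gene ID)
using Cluster = std::vector<OrgGene>;

// Builds the three example clusters described in the text.
std::vector<Cluster> MakeExampleOrthology() {
    return {
        {{"O1", "G1"}, {"O3", "G2"}},
        {{"O1", "G1"}, {"O1", "G2"}},
        {{"O1", "G3"}, {"O2", "G1"}, {"O3", "G1"}},
    };
}

// Counts how many clusters contain the given organism/gene pair.
std::size_t CountClustersContaining(const std::vector<Cluster>& clusters,
                                    const OrgGene& gene) {
    std::size_t count = 0;
    for (const Cluster& cluster : clusters) {
        for (const OrgGene& member : cluster) {
            if (member == gene) {
                ++count;
                break;  // count each cluster at most once
            }
        }
    }
    return count;
}
```

In this sketch, O1:G1 is found in two clusters, which illustrates that a gene can appear in more than one cluster.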
An orthology is stored in a tab-delimited text file in which each line represents a cluster and each tab-separated token indicates a gene and the organism it is drawn from. This file is of the form:
O1:G1	O3:G2
O1:G1	O1:G2
O1:G3	O2:G1	O3:G1
Here, there are three organisms O1, O2, and O3; O1 possesses three genes G1, G2, and G3, O2 has one gene G1, and O3 has two genes G1 and G2. The organism identifiers should be short, human-recognizable codes (similar to those used by KEGG), and each gene identifier should be a standard, unique primary name. A snippet of a real orthology file might resemble:
DME|FBgn0034427 DME|FBgn0033431 MMU|MGI:104873 CEL|R04B3.2 HSA|AGA MMU|MGI:87963 HSA|AGT HSA|ALAD SCE|YGL040C MMU|MGI:96853 DME|FBgn0036271 DME|FBgn0036208 DME|FBgn0020764 HSA|GCAT HSA|ALAS1 HSA|ALAS2 MMU|MGI:87989 CEL|T25B9.1 MMU|MGI:1349389 MMU|MGI:87990 SCE|YDR232W
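To make the file layout concrete, here is a minimal parsing sketch using only the standard library. It is not Sleipnir's implementation (Open relies on CMeta::Tokenize and CGenome::AddGene). The function name `ParseOrthology`, the type aliases, and the default `'|'` organism/gene separator (taken from the real snippet above) are assumptions made for illustration. The separator is a parameter so the schematic O1:G1 form can be read by passing `':'`.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types, not Sleipnir classes.
using OrgGene = std::pair<std::string, std::string>;  // (organism ID, gene ID)
using Cluster = std::vector<OrgGene>;

// Reads one cluster per line and one organism/gene token per tab-separated
// field. Each token is split at the FIRST separator, so gene IDs that contain
// ':' (such as MGI:104873) survive when the separator is '|'.
// Blank lines are skipped in this sketch. Returns false on a token that has
// no separator.
bool ParseOrthology(std::istream& in, std::vector<Cluster>& clusters,
                    char sep = '|') {
    clusters.clear();
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // tolerate Windows line endings
        }
        if (line.empty()) {
            continue;
        }
        Cluster cluster;
        std::istringstream tokens(line);
        std::string token;
        while (std::getline(tokens, token, '\t')) {
            if (token.empty()) {
                continue;  // ignore empty fields from doubled tabs
            }
            const std::size_t pos = token.find(sep);
            if (pos == std::string::npos) {
                return false;  // malformed token: no organism/gene separator
            }
            cluster.emplace_back(token.substr(0, pos), token.substr(pos + 1));
        }
        clusters.push_back(std::move(cluster));
    }
    return true;
}
```

A caller would typically pass a `std::ifstream` opened on the orthology file and check the boolean result before using the clusters.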
Definition at line 64 of file orthology.h.
size_t Sleipnir::COrthology::GetClusters () const [inline]
Return the number of clusters in the orthology.
Definition at line 79 of file orthology.h.
CGene& Sleipnir::COrthology::GetGene (size_t iCluster, size_t iGene) const [inline]
Return the gene at the given index within the given cluster.
iCluster | Cluster from which gene should be returned. |
iGene | Index of gene to retrieve from cluster. |
Definition at line 117 of file orthology.h.
size_t Sleipnir::COrthology::GetGenes (size_t iCluster) const [inline]
Return the number of genes in the requested cluster.
iCluster | Cluster whose gene count should be returned. |
Definition at line 96 of file orthology.h.
CGenome* Sleipnir::COrthology::GetGenome (size_t iOrganism) const [inline]
Return the genome at the requested organism index.
iOrganism | Index of genome to return. |
Definition at line 145 of file orthology.h.
size_t Sleipnir::COrthology::GetGenomes () const [inline]
Return the number of genomes (organisms) in the orthology.
Definition at line 128 of file orthology.h.
const std::string& Sleipnir::COrthology::GetOrganism (size_t iOrganism) const [inline]
Return the human-readable string identifier of the requested organism index.
iOrganism | Index of organism ID to return. |
Definition at line 162 of file orthology.h.
size_t Sleipnir::COrthology::GetOrganism (const CGene & Gene) const [inline]
Return the index of the organism whose genome contains the given gene, or -1 if none exists. Because the return type is size_t, the -1 is returned as the largest size_t value.
Gene | Gene whose organism index should be retrieved. |
Definition at line 181 of file orthology.h.
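Since the return type is the unsigned size_t, a "-1" result has to be compared against the converted value, not tested with `< 0`. The sketch below illustrates that contract with a hypothetical lookup over organism IDs. `kNoOrganism` and `FindOrganism` are illustrative names, not Sleipnir API, and the sketch makes no claim about how GetOrganism is implemented internally.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sentinel equal to what "-1" becomes when returned as a size_t.
const std::size_t kNoOrganism = static_cast<std::size_t>(-1);

// Hypothetical lookup with the same contract as the documented method:
// returns the index of the matching organism, or the sentinel if none exists.
std::size_t FindOrganism(const std::vector<std::string>& organisms,
                         const std::string& id) {
    for (std::size_t i = 0; i < organisms.size(); ++i) {
        if (organisms[i] == id) {
            return i;
        }
    }
    return kNoOrganism;
}

// Callers compare against the sentinel; a "< 0" test is never true for size_t.
bool HasOrganism(const std::vector<std::string>& organisms,
                 const std::string& id) {
    return FindOrganism(organisms, id) != kNoOrganism;
}
```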
bool Sleipnir::COrthology::Open (std::istream & istm)
Loads a new orthology file from the given stream.
istm | Stream from which orthology file is loaded. |
Definition at line 59 of file orthology.cpp.
References Sleipnir::CGenome::AddGene(), and Sleipnir::CMeta::Tokenize().
void Sleipnir::COrthology::Save (std::ostream & ostm) const
Saves the current orthology to the given stream.
ostm | Stream into which orthology is saved. |
Definition at line 114 of file orthology.cpp.