Sleipnir
Sleipnir::COrthology Class Reference

An orthology is a collection of sets, each containing zero or more orthologous genes from any organism.

#include <orthology.h>

Inheritance diagram for Sleipnir::COrthology (image omitted): Sleipnir::COrthologyImpl

Public Member Functions

bool Open (std::istream &istm)
 Loads a new orthology file from the given stream.
void Save (std::ostream &ostm) const
 Saves the current orthology to the given stream.
size_t GetClusters () const
 Return the number of clusters in the orthology.
size_t GetGenes (size_t iCluster) const
 Return the number of genes in the requested cluster.
CGene & GetGene (size_t iCluster, size_t iGene) const
 Return the gene at the given index within the given cluster.
size_t GetGenomes () const
 Return the number of genomes (organisms) in the orthology.
CGenome * GetGenome (size_t iOrganism) const
 Return the genome at the requested organism index.
const std::string & GetOrganism (size_t iOrganism) const
 Return the human-readable string identifier of the requested organism index.
size_t GetOrganism (const CGene &Gene) const
 Return the index of the organism whose genome contains the given gene, or -1 if none exists.

Detailed Description


An orthology is a collection of sets referred to as orthologous clusters. Each such cluster is a set of genes; these genes can be drawn from any number of different organisms. Genes in an orthologous cluster are assumed (by some external agency) to perform similar functions in different organisms.

For example, for organisms O1, O2, and O3, an orthology might consist of clusters [O1:G1, O3:G2], [O1:G1, O1:G2], and [O1:G3, O2:G1, O3:G1]. That is, any cluster might contain any subset of genes from any subset of the available organisms, and any gene can appear in multiple clusters.

An orthology is stored in a tab-delimited text file in which each line represents a cluster and each tab-separated token indicates a gene and the organism it is drawn from. This file is of the form:

 O1:G1  O3:G2
 O1:G1  O1:G2
 O1:G3  O2:G1   O3:G1

Here, there are three organisms O1, O2, and O3; O1 possesses three genes G1, G2, and G3, O2 has one gene G1, and O3 has two genes G1 and G2. Organism identifiers should be short and human-recognizable (similar to those used by KEGG), and gene identifiers should be standard, unique primary names. A snippet of a real orthology file might resemble:

 DME|FBgn0034427    DME|FBgn0033431 MMU|MGI:104873  CEL|R04B3.2 HSA|AGA
 MMU|MGI:87963  HSA|AGT
 HSA|ALAD   SCE|YGL040C MMU|MGI:96853   DME|FBgn0036271
 DME|FBgn0036208    DME|FBgn0020764 HSA|GCAT    HSA|ALAS1   HSA|ALAS2   MMU|MGI:87989   CEL|T25B9.1 MMU|MGI:1349389 MMU|MGI:87990   SCE|YDR232W
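
The file layout described above can be illustrated with a minimal, standard-library-only sketch. It is not the code behind COrthology::Open, whose implementation is not shown here. The names ParseOrthology, OrgGene, Cluster, and kOrgSep are hypothetical, and the '|' separator is an assumption taken from the real-file snippet rather than from the O1:G1 schematic.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types for illustration; not Sleipnir's CGene/CGenome.
using OrgGene = std::pair<std::string, std::string>;  // (organism ID, gene ID)
using Cluster = std::vector<OrgGene>;

// Assumption: organism and gene IDs are separated by '|', as in the
// real-file snippet above (gene IDs such as MGI:87963 may contain ':').
const char kOrgSep = '|';

// Reads one cluster per line, with tab-separated ORG|GENE tokens.
// Returns false as soon as a token lacks the separator.
bool ParseOrthology(std::istream& in, std::vector<Cluster>& clusters) {
    clusters.clear();
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r') line.pop_back();  // tolerate CRLF
        if (line.empty()) continue;                                 // skip blank lines
        Cluster cluster;
        std::istringstream tokens(line);
        std::string token;
        while (std::getline(tokens, token, '\t')) {
            if (token.empty()) continue;  // ignore empty fields
            const std::string::size_type pos = token.find(kOrgSep);
            if (pos == std::string::npos) return false;
            // Split on the first separator only.
            cluster.emplace_back(token.substr(0, pos), token.substr(pos + 1));
        }
        clusters.push_back(cluster);
    }
    return true;
}
```

Splitting on the first separator only keeps gene identifiers that themselves contain punctuation intact.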
Remarks:
A COrthology contains many CGenes sets, each containing CGene objects drawn from multiple underlying CGenome objects, one per organism of interest. The organism ID strings used in the orthology are retained and mapped to the underlying genomes.

Definition at line 64 of file orthology.h.


Member Function Documentation

size_t Sleipnir::COrthology::GetClusters ( ) const [inline]

Return the number of clusters in the orthology.

Returns:
Number of clusters in the orthology.
See also:
GetCluster

Definition at line 79 of file orthology.h.

CGene& Sleipnir::COrthology::GetGene ( size_t  iCluster,
size_t  iGene 
) const [inline]

Return the gene at the given index within the given cluster.

Parameters:
iCluster  Cluster from which gene should be returned.
iGene  Index of gene to retrieve from cluster.
Returns:
Gene at the given index within the given cluster.
Remarks:
For efficiency, no bounds checking is performed. iCluster must be smaller than GetClusters, and iGene must be smaller than GetGenes for that cluster.

Definition at line 117 of file orthology.h.
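
Because the accessors perform no bounds checking, callers may want to validate indices themselves. The following is a minimal sketch of one way to do that, written against any type that offers GetClusters, GetGenes, and GetGene with the shapes documented above. ToyOrthology is a hypothetical stand-in that stores plain strings (it is not Sleipnir's COrthology), and CheckedGetGene and CountGenes are illustrative helper names, not part of the library.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in exposing the three accessors documented above.
// It is not Sleipnir's COrthology and holds strings instead of CGene objects.
class ToyOrthology {
public:
    explicit ToyOrthology(std::vector<std::vector<std::string>> clusters)
        : m_clusters(std::move(clusters)) {}
    std::size_t GetClusters() const { return m_clusters.size(); }
    std::size_t GetGenes(std::size_t iCluster) const { return m_clusters[iCluster].size(); }
    const std::string& GetGene(std::size_t iCluster, std::size_t iGene) const {
        return m_clusters[iCluster][iGene];  // unchecked, like the documented accessor
    }
private:
    std::vector<std::vector<std::string>> m_clusters;
};

// Validates both indices, then forwards to the unchecked accessor.
template <class TOrthology>
auto CheckedGetGene(const TOrthology& orth, std::size_t iCluster, std::size_t iGene)
    -> decltype(orth.GetGene(iCluster, iGene)) {
    if (iCluster >= orth.GetClusters())
        throw std::out_of_range("cluster index out of range");
    if (iGene >= orth.GetGenes(iCluster))
        throw std::out_of_range("gene index out of range");
    return orth.GetGene(iCluster, iGene);
}

// Sums cluster sizes, staying inside the valid cluster range.
template <class TOrthology>
std::size_t CountGenes(const TOrthology& orth) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < orth.GetClusters(); ++i)
        total += orth.GetGenes(i);
    return total;
}
```

The cluster index is checked first because GetGenes itself is only valid for an in-range cluster.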

size_t Sleipnir::COrthology::GetGenes ( size_t  iCluster) const [inline]

Return the number of genes in the requested cluster.

Parameters:
iCluster  Cluster from which genes should be returned.
Returns:
Number of genes in the given cluster.
Remarks:
For efficiency, no bounds checking is performed. The given value must be smaller than GetClusters.

Definition at line 96 of file orthology.h.

CGenome* Sleipnir::COrthology::GetGenome ( size_t  iOrganism) const [inline]

Return the genome at the requested organism index.

Parameters:
iOrganism  Index of genome to return.
Returns:
Genome at the requested index.
Remarks:
For efficiency, no bounds checking is performed. The given value must be smaller than GetGenomes.

Definition at line 145 of file orthology.h.

size_t Sleipnir::COrthology::GetGenomes ( ) const [inline]

Return the number of genomes (organisms) in the orthology.

Returns:
Number of genomes in the orthology.

Definition at line 128 of file orthology.h.

const std::string& Sleipnir::COrthology::GetOrganism ( size_t  iOrganism) const [inline]

Return the human-readable string identifier of the requested organism index.

Parameters:
iOrganism  Index of organism ID to return.
Returns:
Organism ID at the requested index.
Remarks:
For efficiency, no bounds checking is performed. The given value must be smaller than GetGenomes.

Definition at line 162 of file orthology.h.

size_t Sleipnir::COrthology::GetOrganism ( const CGene &  Gene) const [inline]

Return the index of the organism whose genome contains the given gene, or -1 if none exists.

Parameters:
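Gene  Gene whose organism index should be retrieved.
Returns:
Organism index associated with the given gene; -1 if none exists. Because the return type is the unsigned size_t, -1 here means static_cast<size_t>(-1), the largest size_t value.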
Remarks:
Gene comparison is done by pointer, so a different gene with the same primary identifier will not match. Gene/genome associations are maintained in a map by the orthology, so this lookup is fast.

Definition at line 181 of file orthology.h.
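
The two points in the remarks above (comparison by pointer, and a -1 result in an unsigned return type) can be illustrated with a small sketch. It assumes a std::map keyed by gene address; the container Sleipnir actually uses is not shown here. ToyGene, GeneToOrganism, FindOrganism, and kNoOrganism are hypothetical names for illustration only.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical minimal gene type for illustration; not Sleipnir's CGene.
struct ToyGene {
    std::string name;
};

// Sentinel for "no organism": -1 converted to the unsigned size_t.
const std::size_t kNoOrganism = static_cast<std::size_t>(-1);

// Gene-to-organism association keyed by address, so a lookup compares
// pointers rather than gene names.
using GeneToOrganism = std::map<const ToyGene*, std::size_t>;

// Returns the organism index recorded for this exact gene object,
// or kNoOrganism if the object was never registered.
std::size_t FindOrganism(const GeneToOrganism& index, const ToyGene& gene) {
    const GeneToOrganism::const_iterator it = index.find(&gene);
    return it == index.end() ? kNoOrganism : it->second;
}
```

With this layout, two distinct gene objects that share a name are different keys, so callers should compare the result against the sentinel rather than test for a negative value.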

bool Sleipnir::COrthology::Open ( std::istream &  istm)

Loads a new orthology file from the given stream.

Parameters:
istm  Stream from which orthology file is loaded.
Returns:
True if the orthology was loaded successfully.
Remarks:
Constructs CGenome and CGene objects internally as necessary.
See also:
Save

Definition at line 59 of file orthology.cpp.

References Sleipnir::CGenome::AddGene(), and Sleipnir::CMeta::Tokenize().

void Sleipnir::COrthology::Save ( std::ostream &  ostm) const

Saves the current orthology to the given stream.

Parameters:
ostm  Stream into which orthology is saved.
See also:
Open

Definition at line 114 of file orthology.cpp.
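
As a rough picture of what writing the tab-delimited layout involves, here is a minimal sketch using only the standard library. It is not the implementation of COrthology::Save. WriteOrthology, ToText, OrgGene, and Cluster are hypothetical names, and the default '|' separator is an assumption based on the real-file snippet in the detailed description.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in types for illustration; not Sleipnir's CGene/CGenome.
using OrgGene = std::pair<std::string, std::string>;  // (organism ID, gene ID)
using Cluster = std::vector<OrgGene>;

// Writes one cluster per line, tokens separated by tabs.
// Assumption: organism and gene IDs are joined by '|'.
void WriteOrthology(std::ostream& out, const std::vector<Cluster>& clusters,
                    char sep = '|') {
    for (const Cluster& cluster : clusters) {
        for (std::size_t i = 0; i < cluster.size(); ++i) {
            if (i > 0) out << '\t';
            out << cluster[i].first << sep << cluster[i].second;
        }
        out << '\n';
    }
}

// Convenience wrapper that renders the clusters into a string.
std::string ToText(const std::vector<Cluster>& clusters) {
    std::ostringstream out;
    WriteOrthology(out, clusters);
    return out.str();
}
```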

