Sleipnir
Organizes a collection of unique genes representing a background or maximum gene set for some situation.
#include <genome.h>
Public Member Functions | |
| bool | Open (std::istream &istmFeatures) |
| Construct a new genome by loading the SGD features file. | |
| bool | Open (const std::vector< std::string > &vecstrGenes) |
| Constructs a new genome containing the given gene IDs. | |
| bool | Open (const char *szFile, std::vector< CGenes * > &vecpGenes) |
| bool | Open (std::istream &istmGenes, std::vector< CGenes * > &vecpGenes) |
| CGene & | AddGene (const std::string &strID) |
| Adds a new gene with the given primary ID to the genome. | |
| size_t | FindGene (const std::string &strGene) const |
| Return the index of a gene within the genome, or -1 if it does not exist. | |
| std::vector< std::string > | GetGeneNames () const |
| Return a vector of all primary gene IDs in the genome. | |
| size_t | CountGenes (const IOntology *pOntology) const |
| Returns the number of genes in the genome with annotations in the given ontology. | |
| bool | AddSynonym (CGene &Gene, const std::string &strName) |
| Explicitly add a gene synonym to the gene and to the genome's name map. | |
| CGene & | GetGene (size_t iGene) const |
| Return the gene at the requested index within the genome. | |
| size_t | GetGene (const std::string &strGene) const |
| Return the index of the gene with the given name, or -1 if one cannot be found. | |
| size_t | GetGenes () const |
| Return the number of genes in the genome. | |
Ideally, a genome represents a collection of all known genes for some organism, each with a single unique identifier and some number of non-overlapping synonyms. In practice, this doesn't happen: a genome often represents a background or comprehensive gene set for some situation (e.g. functional enrichment), or the total set of genes in some data file or analysis (e.g. a functional catalog). CGenome will do its best to disambiguate overlapping gene names, but it boils down to a simple one-to-one map, which will not deal with ambiguous synonyms accurately. For best results, guarantee that each gene has a unique primary identifier that does not overlap with any synonyms, and look up genes using only those identifiers.
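The limitation described above can be illustrated with a minimal sketch. `NameMap`, `Register`, `Lookup`, and `DemoAmbiguousSynonym` are hypothetical names invented for this example and are not part of Sleipnir; the gene names are illustrative, and the choice that the first claimant of a name wins is an assumption of the sketch, not a statement about CGenome's behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a genome's one-to-one name map; not Sleipnir code.
class NameMap {
public:
    // Registers a name for a gene index. Returns false and keeps the existing
    // entry if the name is already taken (first claimant wins in this sketch).
    bool Register(const std::string& strName, std::size_t iGene) {
        return m_mapNames.insert(std::make_pair(strName, iGene)).second;
    }

    // Returns the gene index for a name, or (size_t)-1 if the name is unknown.
    std::size_t Lookup(const std::string& strName) const {
        std::map<std::string, std::size_t>::const_iterator it = m_mapNames.find(strName);
        return (it == m_mapNames.end()) ? static_cast<std::size_t>(-1) : it->second;
    }

private:
    std::map<std::string, std::size_t> m_mapNames;
};

// Shows how a synonym shared by two genes can resolve to only one of them.
inline std::size_t DemoAmbiguousSynonym() {
    NameMap names;
    names.Register("YAL001C", 0);  // primary ID of gene 0
    names.Register("YBR002W", 1);  // primary ID of gene 1
    names.Register("ABC1", 0);     // made-up synonym of gene 0
    bool fAdded = names.Register("ABC1", 1);  // same synonym claimed by gene 1
    assert(!fAdded);               // the second claim is rejected
    (void)fAdded;
    return names.Lookup("ABC1");   // resolves to gene 0 only
}
```

Because each name maps to exactly one gene, a lookup by the shared synonym can never reach the second gene, which is why looking genes up by unique primary identifiers is the safer route.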
CGene & Sleipnir::CGenome::AddGene (const std::string &strID)
Adds a new gene with the given primary ID to the genome.
| strID | Gene ID to be added to the genome. |
Given a gene name, AddGene will first test to see if any gene in the genome has that ID or synonym; if so, a reference to the existing gene is returned. Otherwise, an empty gene with the given primary ID is created, and a reference to this new gene is returned.
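The add-or-return-existing behavior described above might look roughly like the following sketch. `ToyGene` and `ToyGenome` are hypothetical stand-ins written for this example (they are not CGene or CGenome), and the use of a `std::deque` plus a `std::map` is an assumption made to keep references stable, not a description of Sleipnir's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Hypothetical gene record holding only a primary ID.
struct ToyGene {
    std::string strID;
};

// Hypothetical genome: a gene list plus a map from every known name to an index.
class ToyGenome {
public:
    // Returns the existing gene if strID is a known ID or synonym;
    // otherwise creates an empty gene with that primary ID.
    ToyGene& AddGene(const std::string& strID) {
        auto it = m_mapNames.find(strID);
        if (it != m_mapNames.end())
            return m_genes[it->second];
        m_mapNames[strID] = m_genes.size();
        m_genes.push_back(ToyGene{strID});
        return m_genes.back();
    }

    // Adds a synonym for a gene already in the genome; fails if the name is taken.
    bool AddSynonym(const ToyGene& gene, const std::string& strName) {
        auto it = m_mapNames.find(gene.strID);
        if (it == m_mapNames.end())
            return false;
        return m_mapNames.insert(std::make_pair(strName, it->second)).second;
    }

    std::size_t GetGenes() const { return m_genes.size(); }

private:
    std::deque<ToyGene> m_genes;  // deque keeps references valid across push_back
    std::map<std::string, std::size_t> m_mapNames;
};

// Adding a gene by one of its synonyms yields the existing gene, not a new one.
inline bool DemoAddGene() {
    ToyGenome genome;
    ToyGene& first = genome.AddGene("YAL001C");
    genome.AddSynonym(first, "TFC3");
    ToyGene& again = genome.AddGene("TFC3");
    assert(&first == &again);
    return genome.GetGenes() == 1;
}
```

In this sketch a second call with a known synonym is a lookup rather than an insertion, so the gene count stays at one.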
Definition at line 368 of file genome.cpp.
References GetGene().
Referenced by Sleipnir::CGenes::AddGene(), Sleipnir::COrthology::Open(), Open(), Sleipnir::CGenes::Open(), and Sleipnir::CGenes::OpenWeighted().
bool Sleipnir::CGenome::AddSynonym (CGene &Gene, const std::string &strName)
Explicitly add a gene synonym to the gene and to the genome's name map.
| Gene | Gene to which synonym is to be added. |
| strName | Synonym to be added to the given gene. |
Definition at line 474 of file genome.cpp.
References Sleipnir::CGene::AddSynonym(), and Sleipnir::CGene::GetName().
Referenced by Open().
size_t Sleipnir::CGenome::CountGenes (const IOntology *pOntology) const
Returns the number of genes in the genome with annotations in the given ontology.
| pOntology | Ontology to be scanned for annotated genes. |
Definition at line 444 of file genome.cpp.
size_t Sleipnir::CGenome::FindGene (const std::string &strGene) const
Return the index of a gene within the genome, or -1 if it does not exist.
| strGene | Name of gene to be retrieved from the genome. |
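Search the genome's gene list for a gene with the given name, primary or synonymous, and return its index if found. Because the return type is the unsigned size_t, the documented -1 is returned as (size_t)-1, the largest representable size_t value; compare against that sentinel instead of testing for a negative result.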
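A linear search over primary names and synonyms, with an unsigned not-found sentinel, could be sketched as follows. The `ToyGene` struct, the free function `FindGene`, and the constant `c_iNotFound` are hypothetical and exist only for this example; they are not the Sleipnir implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical gene record: one primary name plus any number of synonyms.
struct ToyGene {
    std::string strName;
    std::vector<std::string> vecstrSynonyms;
};

// -1 converted to size_t: the value an unsigned index uses for "not found".
const std::size_t c_iNotFound = static_cast<std::size_t>(-1);

// Returns the index of the first gene whose primary name or synonym matches,
// or c_iNotFound if no gene matches.
inline std::size_t FindGene(const std::vector<ToyGene>& vecGenes, const std::string& strGene) {
    for (std::size_t i = 0; i < vecGenes.size(); ++i) {
        if (vecGenes[i].strName == strGene)
            return i;
        for (std::size_t j = 0; j < vecGenes[i].vecstrSynonyms.size(); ++j)
            if (vecGenes[i].vecstrSynonyms[j] == strGene)
                return i;
    }
    return c_iNotFound;
}

// Exercises a match by synonym, a match by primary name, and a miss.
inline bool DemoFindGene() {
    std::vector<ToyGene> vecGenes;
    vecGenes.push_back(ToyGene{"YAL001C", {"TFC3"}});
    vecGenes.push_back(ToyGene{"YBR002W", {}});
    assert(FindGene(vecGenes, "TFC3") == 0);
    assert(FindGene(vecGenes, "YBR002W") == 1);
    return FindGene(vecGenes, "missing") == c_iNotFound;
}
```

A check such as `FindGene(...) < 0` can never be true for an unsigned result, so callers should compare against the sentinel explicitly.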
Definition at line 401 of file genome.cpp.
References GetGene(), GetGenes(), Sleipnir::CGene::GetSynonym(), and Sleipnir::CGene::GetSynonyms().
Referenced by Sleipnir::CGenes::Open(), and Sleipnir::CGenes::OpenWeighted().
CGene& Sleipnir::CGenome::GetGene (size_t iGene) const [inline]
Return the gene at the requested index within the genome.
| iGene | Index of gene to retrieve. |
Definition at line 327 of file genome.h.
Referenced by AddGene(), FindGene(), Sleipnir::CGenes::Open(), Sleipnir::CGenes::OpenWeighted(), Sleipnir::CDat::SaveDOT(), and Sleipnir::CDat::SaveMATISSE().
size_t Sleipnir::CGenome::GetGene (const std::string &strGene) const [inline]
Return the index of the gene with the given name, or -1 if one cannot be found.
| strGene | Name of gene whose index should be retrieved. |
vector< string > Sleipnir::CGenome::GetGeneNames () const
Return a vector of all primary gene IDs in the genome.
Definition at line 421 of file genome.cpp.
Referenced by Sleipnir::CDat::Open().
size_t Sleipnir::CGenome::GetGenes () const [inline]
Return the number of genes in the genome.
Definition at line 358 of file genome.h.
Referenced by FindGene().
bool Sleipnir::CGenome::Open (std::istream &istmFeatures)
Construct a new genome by loading the SGD features file.
| istmFeatures | Stream containing the SGD features information. |
Loads a (presumably yeast) genome from a file formatted as per the SGD features file (SGD_features.tab). This includes gene IDs, synonyms, glosses, and RNA and dubious tags.
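As a rough illustration of what reading one line of such a file involves, the sketch below splits a tab-delimited row and collects a gene ID and its synonyms. It assumes the feature name is in column 4, the standard gene name in column 5, and `|`-separated aliases in column 6; which columns CGenome::Open actually reads is not stated here. `Split`, `FeatureRow`, and `ParseFeatureLine` are hypothetical helpers, and the sample values in the comments are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a string on a single-character separator, keeping empty fields.
inline std::vector<std::string> Split(const std::string& str, char cSep) {
    std::vector<std::string> vecstrRet;
    std::string::size_type iStart = 0, iEnd;
    while ((iEnd = str.find(cSep, iStart)) != std::string::npos) {
        vecstrRet.push_back(str.substr(iStart, iEnd - iStart));
        iStart = iEnd + 1;
    }
    vecstrRet.push_back(str.substr(iStart));
    return vecstrRet;
}

// Hypothetical parsed row: a primary ID and its synonyms.
struct FeatureRow {
    std::string strID;
    std::vector<std::string> vecstrSynonyms;
};

// Parses one tab-delimited line under the column assumptions stated above.
// Returns false if the line has too few columns or no feature name.
inline bool ParseFeatureLine(const std::string& strLine, FeatureRow& row) {
    const std::vector<std::string> vecstrFields = Split(strLine, '\t');
    if (vecstrFields.size() < 6 || vecstrFields[3].empty())
        return false;
    row.strID = vecstrFields[3];              // assumed: feature name, e.g. an ORF ID
    row.vecstrSynonyms.clear();
    if (!vecstrFields[4].empty())             // assumed: standard gene name
        row.vecstrSynonyms.push_back(vecstrFields[4]);
    if (!vecstrFields[5].empty()) {           // assumed: aliases separated by '|'
        const std::vector<std::string> vecstrAliases = Split(vecstrFields[5], '|');
        for (std::size_t i = 0; i < vecstrAliases.size(); ++i)
            if (!vecstrAliases[i].empty())
                row.vecstrSynonyms.push_back(vecstrAliases[i]);
    }
    return true;
}

// Parses a made-up sample row and checks that the ID and synonyms are recovered.
inline bool DemoParseFeatureLine() {
    FeatureRow row;
    const bool fOK = ParseFeatureLine("S1\tORF\tVerified\tYAL001C\tTFC3\tALIAS1|ALIAS2", row);
    assert(fOK);
    return fOK && row.strID == "YAL001C" && row.vecstrSynonyms.size() == 3;
}
```

Each parsed row would then correspond to one AddGene call followed by one AddSynonym call per synonym, in line with the References list for this overload.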
Definition at line 252 of file genome.cpp.
References AddGene(), AddSynonym(), Sleipnir::CGene::SetDubious(), Sleipnir::CGene::SetGloss(), Sleipnir::CGene::SetRNA(), and Sleipnir::CMeta::Tokenize().
bool Sleipnir::CGenome::Open (const std::vector< std::string > &vecstrGenes)
Constructs a new genome containing the given gene IDs.
| vecstrGenes | Vector of gene IDs to add to the new genome. |
Definition at line 309 of file genome.cpp.
References AddGene().