Sleipnir
Represents a simple set of unique genes.
#include <genome.h>
Public Member Functions | |
CGenes (CGenome &Genome) | |
Construct a new gene set containing genes drawn from the given underlying genome. | |
bool | Open (std::istream &istm, bool fCreate=true) |
Construct a new gene set by loading genes from the given text stream, one per line. | |
bool | Open (const std::vector< std::string > &vecstrGenes, bool fCreate=true) |
Construct a new gene set containing the given gene IDs. | |
bool | OpenWeighted (std::istream &istm, bool fCreate=true) |
Construct a new weighted gene set by loading genes from the given text stream, one per line. | |
void | Filter (const CGenes &GenesExclude) |
Remove the given genes from the gene set. | |
size_t | CountAnnotations (const IOntology *pOntology, size_t iTerm, bool fRecursive=true, const CGenes *pBackground=NULL) const |
Return the number of genes in the set annotated at or, optionally, below the given ontology term. | |
std::vector< std::string > | GetGeneNames () const |
Return the primary identifiers of all genes in the set. | |
bool | Open (const char *szFile, bool fCreate=true) |
Construct a new gene set by loading genes from the given text file, one per line. | |
bool | OpenWeighted (const char *szFile, bool fCreate=true) |
Construct a new weighted gene set by loading genes from the given text file, one per line. | |
size_t | GetGenes () const |
Return the number of genes in the set. | |
bool | IsGene (const std::string &strGene) const |
Return true if the given name is a primary identifier of a gene in the set. | |
bool | IsWeighted () const |
Determine whether genes are weighted. | |
CGenome & | GetGenome () const |
Return the gene set's underlying genome. | |
const CGene & | GetGene (size_t iGene) const |
Return the gene at the requested index. | |
const float | GetGeneWeight (size_t iGene) const |
Return the weight of the gene at the requested index. | |
size_t | GetGene (const std::string &strGene) const |
Return the index of the gene with the given primary identifier, or -1 (converted to size_t) if none exists. | |
bool | AddGene (const std::string &strGene) |
Add a new gene with the given ID to the gene set. | |
Static Public Member Functions | |
static bool | Open (const char *szFile, CGenome &Genome, std::vector< std::string > &vecstrNames, std::vector< CGenes * > &vecpGenes) |
Simultaneously construct multiple new gene sets loaded from the given file, one per line, with tab-delimited genes. |
Represents a simple set of unique genes.
Sleipnir::CGenes::CGenes (CGenome &Genome)
Construct a new gene set containing genes drawn from the given underlying genome.
Genome | Genome containing all genes which might become members of this gene set. |
Definition at line 548 of file genome.cpp.
Referenced by Open().
bool Sleipnir::CGenes::AddGene (const std::string &strGene) [inline]
size_t Sleipnir::CGenes::CountAnnotations (const IOntology *pOntology, size_t iTerm, bool fRecursive=true, const CGenes *pBackground=NULL) const
Return the number of genes in the set annotated at or, optionally, below the given ontology term.
pOntology | Ontology in which annotations are counted. |
iTerm | Ontology term at or below which annotations are counted. |
fRecursive | If true, count annotations at or below the given term; otherwise, count only direct annotations to the term. |
pBackground | If non-null, count only annotations for genes also contained in the given background set. |
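The interaction of fRecursive and pBackground can be sketched with a small stand-alone model. ToyOntology and the free function CountAnnotations below are hypothetical stand-ins written for illustration; they are not the Sleipnir IOntology or CGenes classes, and the sketch assumes the term graph has no cycles.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy ontology (NOT Sleipnir's IOntology): each term has a set of
// directly annotated genes and a list of child terms. Assumed to be acyclic.
struct ToyOntology {
    std::map<std::size_t, std::set<std::string>> direct;
    std::map<std::size_t, std::vector<std::size_t>> children;

    // True if the gene is annotated to the term, or (when recursive) to any
    // term below it.
    bool IsAnnotated(std::size_t term, const std::string& gene, bool recursive) const {
        auto it = direct.find(term);
        if (it != direct.end() && it->second.count(gene)) return true;
        if (!recursive) return false;
        auto kids = children.find(term);
        if (kids == children.end()) return false;
        for (std::size_t child : kids->second)
            if (IsAnnotated(child, gene, true)) return true;
        return false;
    }
};

// Count the genes in `genes` annotated at (or, if recursive, below) `term`.
// If `background` is non-null, genes absent from it are not counted.
std::size_t CountAnnotations(const std::set<std::string>& genes,
                             const ToyOntology& ontology, std::size_t term,
                             bool recursive = true,
                             const std::set<std::string>* background = nullptr) {
    std::size_t count = 0;
    for (const std::string& gene : genes) {
        if (background && !background->count(gene)) continue;
        if (ontology.IsAnnotated(term, gene, recursive)) ++count;
    }
    return count;
}
```

In this model, a gene annotated only to a child term is counted when recursion is enabled and ignored otherwise, and a background set can only lower the count.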
Definition at line 711 of file genome.cpp.
References Sleipnir::IOntology::IsAnnotated(), and IsGene().
void Sleipnir::CGenes::Filter (const CGenes &GenesExclude)
Remove the given genes from the gene set.
GenesExclude | Genes to be removed from the current gene set. |
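Conceptually, filtering is a set difference on gene identifiers. The sketch below illustrates that idea with standard containers only; FilterGenes is a hypothetical helper, not part of Sleipnir, and it says nothing about how CGenes stores its members internally.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: remove every gene ID that appears in `exclude`,
// preserving the order of the remaining IDs.
void FilterGenes(std::vector<std::string>& genes,
                 const std::set<std::string>& exclude) {
    genes.erase(std::remove_if(genes.begin(), genes.end(),
                               [&exclude](const std::string& gene) {
                                   return exclude.count(gene) != 0;
                               }),
                genes.end());
}
```

IDs in the exclusion set that are not present in the gene list have no effect.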
Definition at line 769 of file genome.cpp.
References GetGene(), and GetGenes().
const CGene& Sleipnir::CGenes::GetGene (size_t iGene) const [inline]
Return the gene at the requested index.
iGene | Gene index to retrieve. |
Definition at line 512 of file genome.h.
Referenced by AddGene(), Sleipnir::CPCL::Distance(), Filter(), Sleipnir::CDat::FilterGenes(), Sleipnir::CDat::Open(), and Sleipnir::CDatasetCompact::Open().
size_t Sleipnir::CGenes::GetGene (const std::string &strGene) const [inline]
std::vector< std::string > Sleipnir::CGenes::GetGeneNames () const
Return the primary identifiers of all genes in the set.
Definition at line 789 of file genome.cpp.
Referenced by Sleipnir::CPCL::Distance().
size_t Sleipnir::CGenes::GetGenes () const [inline]
Return the number of genes in the set.
Definition at line 457 of file genome.h.
Referenced by Sleipnir::CPCL::Distance(), Filter(), Sleipnir::CDat::FilterGenes(), Sleipnir::CDataFilter::IsExample(), Sleipnir::CSVM::Learn(), Sleipnir::CDat::Open(), and Sleipnir::CDatasetCompact::Open().
const float Sleipnir::CGenes::GetGeneWeight (size_t iGene) const [inline]
Return the weight of the gene at the requested index.
iGene | Gene index to retrieve. |
CGenome& Sleipnir::CGenes::GetGenome () const [inline]
Return the gene set's underlying genome.
Definition at line 495 of file genome.h.
Referenced by Sleipnir::CSVM::Learn().
bool Sleipnir::CGenes::IsGene (const std::string &strGene) const [inline]
Return true if the given name is a primary identifier of a gene in the set.
strGene | Primary gene identifier for which the set is searched. |
Definition at line 474 of file genome.h.
Referenced by Sleipnir::CDataFilter::Attach(), CountAnnotations(), Sleipnir::CDat::Open(), and Sleipnir::CDatasetCompact::Open().
bool Sleipnir::CGenes::IsWeighted () const [inline]
bool Sleipnir::CGenes::Open (const char *szFile, CGenome &Genome, std::vector< std::string > &vecstrNames, std::vector< CGenes * > &vecpGenes) [static]
Simultaneously construct multiple new gene sets loaded from the given file, one per line, with tab-delimited genes.
szFile | File from which gene sets are loaded. |
Genome | Genome containing all genes which might become members of these gene sets. |
vecstrNames | Human-readable identifiers for the loaded gene sets. |
vecpGenes | Vector to which loaded gene sets are appended. |
Opens multiple gene sets from the given tab-delimited text file. Each line should contain a single tab-delimited gene set, and the first token on each line should be a human-readable identifier for that line's gene set.
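The file layout described above (one gene set per line, tab-delimited, set name first) can be parsed roughly as follows. This is a stand-alone sketch using only the standard library: NamedGeneSet and LoadGeneSets are hypothetical names, the input is a stream rather than a file path for ease of testing, and skipping blank lines and empty fields is an assumption rather than documented behavior.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one parsed line: the set's human-readable
// identifier followed by its gene IDs.
struct NamedGeneSet {
    std::string name;
    std::vector<std::string> genes;
};

// Parse lines of the form NAME<TAB>GENE1<TAB>GENE2...
// Blank lines and empty fields are skipped (an assumption of this sketch).
std::vector<NamedGeneSet> LoadGeneSets(std::istream& istm) {
    std::vector<NamedGeneSet> sets;
    std::string line;
    while (std::getline(istm, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        NamedGeneSet set;
        std::getline(fields, set.name, '\t');  // first token names the set
        std::string gene;
        while (std::getline(fields, gene, '\t'))
            if (!gene.empty()) set.genes.push_back(gene);
        sets.push_back(set);
    }
    return sets;
}
```

Each returned element corresponds to one entry of vecstrNames paired with one entry of vecpGenes in the documented interface.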
Definition at line 507 of file genome.cpp.
References CGenes(), and Sleipnir::CMeta::Tokenize().
Referenced by Sleipnir::CPCL::Distance(), Sleipnir::CDat::FilterGenes(), Sleipnir::CDatasetCompact::FilterGenes(), and Open().
bool Sleipnir::CGenes::Open (std::istream &istm, bool fCreate=true)
Construct a new gene set by loading genes from the given text stream, one per line.
istm | Stream containing gene IDs to load, one per line. |
fCreate | If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored. |
Loads a text file of the form:
GENE1
GENE2
GENE3
containing one primary gene identifier per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
Definition at line 578 of file genome.cpp.
References Sleipnir::CGenome::AddGene(), Sleipnir::CGenome::FindGene(), Sleipnir::CGenome::GetGene(), and Sleipnir::CGene::GetName().
bool Sleipnir::CGenes::Open (const std::vector< std::string > &vecstrGenes, bool fCreate=true)
Construct a new gene set containing the given gene IDs.
vecstrGenes | Primary identifiers of genes in the new gene set. |
fCreate | If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored. |
If the given gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
Definition at line 742 of file genome.cpp.
References Sleipnir::CGenome::AddGene(), Sleipnir::CGenome::FindGene(), and Sleipnir::CGenome::GetGene().
bool Sleipnir::CGenes::Open (const char *szFile, bool fCreate=true) [inline]
Construct a new gene set by loading genes from the given text file, one per line.
szFile | File containing gene IDs to load, one per line. |
fCreate | If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored. |
Loads a text file of the form:
GENE1
GENE2
GENE3
containing one primary gene identifier per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
Definition at line 414 of file genome.h.
References Open().
bool Sleipnir::CGenes::OpenWeighted (std::istream &istm, bool fCreate=true)
Construct a new weighted gene set by loading genes from the given text stream, one per line.
istm | Stream containing gene IDs and corresponding weights to load, one per line. |
fCreate | If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored. |
Loads a text file of the form:
GENE1 WEIGHT1
GENE2 WEIGHT2
GENE3 WEIGHT3
containing one primary gene identifier and its weight per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
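A minimal sketch of reading gene and weight pairs, one pair per line, is shown here. LoadWeightedGenes is a hypothetical helper, not the Sleipnir implementation; it assumes the two fields are separated by whitespace (the exact delimiter is not stated above) and skips lines that lack a parsable weight.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: parse "GENE WEIGHT" lines into (ID, weight) pairs.
// Lines without both a gene ID and a numeric weight are skipped (assumption).
std::vector<std::pair<std::string, float>> LoadWeightedGenes(std::istream& istm) {
    std::vector<std::pair<std::string, float>> genes;
    std::string line;
    while (std::getline(istm, line)) {
        std::istringstream fields(line);
        std::string id;
        float weight = 0.0f;
        if (fields >> id >> weight) genes.emplace_back(id, weight);
    }
    return genes;
}
```

The index of each pair plays the role of the iGene argument to GetGene and GetGeneWeight in the documented interface.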
Definition at line 639 of file genome.cpp.
References Sleipnir::CGenome::AddGene(), Sleipnir::CGenome::FindGene(), Sleipnir::CGenome::GetGene(), Sleipnir::CGene::GetName(), and Sleipnir::CMeta::Tokenize().
Referenced by OpenWeighted().
bool Sleipnir::CGenes::OpenWeighted (const char *szFile, bool fCreate=true) [inline]
Construct a new weighted gene set by loading genes from the given text file, one per line.
szFile | File containing gene IDs and corresponding weights to load, one per line. |
fCreate | If true, add unknown genes to the underlying genome; otherwise, unknown gene IDs are ignored. |
Loads a text file of the form:
GENE1 WEIGHT1
GENE2 WEIGHT2
GENE3 WEIGHT3
containing one primary gene identifier and its weight per line. If these gene identifiers are found in the gene set's underlying genome, CGene objects are loaded from there. Otherwise, if fCreate is true, new genes are created from the loaded IDs. If fCreate is false, unrecognized genes are skipped with a warning.
Definition at line 445 of file genome.h.
References OpenWeighted().