Sleipnir
Augments a dataset with a dynamically calculated gene set filter.
#include <dataset.h>
Public Member Functions
void | Attach (const IDataset *pDataset, const CGenes &Genes, CDat::EFilter eFilter, const CDat *pAnswers=NULL)
Associates the data filter with the given dataset, gene set, and filter type.
bool | IsExample (size_t iY, size_t iX) const
Returns true if some data file can be accessed at the requested position.
void | Remove (size_t iY, size_t iX)
Remove all data for the given dataset position.
const std::vector< std::string > & | GetGeneNames () const
Return a vector of all gene names in the dataset.
size_t | GetExperiments () const
Return the number of experimental nodes in the dataset.
size_t | GetGene (const std::string &strGene) const
Return the index of the given gene name, or -1 if it is not included in the dataset.
size_t | GetBins (size_t iNode) const
Return the number of discrete values in the requested experimental node; -1 if the node is hidden or continuous.
size_t | GetGenes () const
Returns the number of genes in the dataset.
bool | IsHidden (size_t iNode) const
Returns true if the requested experimental node is hidden (does not correspond to a data file).
size_t | GetDiscrete (size_t iY, size_t iX, size_t iNode) const
Return the discretized value at the requested position.
float | GetContinuous (size_t iY, size_t iX, size_t iNode) const
Return the continuous value at the requested position.
const std::string & | GetGene (size_t iGene) const
Returns the gene name at the requested index.
void | FilterGenes (const CGenes &Genes, CDat::EFilter eFilter)
Remove values from the dataset based on the given gene set and filter type.
void | Save (std::ostream &ostm, bool fBinary) const
Save a dataset to the given stream in binary or tabular (human readable) form.
A data filter wraps an underlying dataset and applies a filter that is calculated dynamically from a gene set and a CDat::EFilter type. IsExample returns false for a filtered gene pair, which then behaves like missing data; unfiltered gene pairs are retrieved from the underlying dataset. Data can thus be hidden temporarily without modifying the underlying dataset.
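The overlay idea can be sketched without Sleipnir itself. The sketch below is a minimal illustration, not the library's implementation: `PairTable`, `MaskedView`, and the two `Mode` values are hypothetical stand-ins, and the actual filter semantics are those defined by CDat::EFilter.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for an underlying dataset: a table of
// "is there data for pair (y, x)?" flags.
struct PairTable {
    std::vector<std::vector<bool>> present;

    bool IsExample(std::size_t y, std::size_t x) const {
        assert(y < present.size() && x < present[y].size());
        return present[y][x];
    }
};

// Hypothetical filter modes; the real choices are listed in CDat::EFilter.
enum class Mode { KeepOnlySet, DropSet };

// Read-only overlay: consults the mask first, then the wrapped table.
class MaskedView {
public:
    MaskedView(const PairTable& table, std::set<std::size_t> genes, Mode mode)
        : table_(table), genes_(std::move(genes)), mode_(mode) {}

    bool IsExample(std::size_t y, std::size_t x) const {
        const bool yIn = genes_.count(y) != 0;
        const bool xIn = genes_.count(x) != 0;
        // KeepOnlySet hides a pair unless both genes are in the set;
        // DropSet hides a pair if either gene is in the set.
        const bool hidden = (mode_ == Mode::KeepOnlySet) ? !(yIn && xIn)
                                                         : (yIn || xIn);
        return !hidden && table_.IsExample(y, x);
    }

private:
    const PairTable& table_;  // never modified by the view
    std::set<std::size_t> genes_;
    Mode mode_;
};
```

Because the view only holds a const reference, discarding it restores the original picture of the data; nothing in the wrapped table changes.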
void Sleipnir::CDataFilter::Attach (const IDataset *pDataset, const CGenes &Genes, CDat::EFilter eFilter, const CDat *pAnswers = NULL)
Associates the data filter with the given dataset, gene set, and filter type.
pDataset | Dataset to be associated with the overlaying mask.
Genes | Gene set used to filter the dataset.
eFilter | Way in which to use the given genes to remove gene pairs.
pAnswers | If non-null, answer set to be used for filter types requiring answers (e.g. CDat::EFilterTerm).
Definition at line 617 of file dataset.cpp.
References Sleipnir::CDat::GetGene(), GetGene(), GetGenes(), and Sleipnir::CGenes::IsGene().
void Sleipnir::CDataFilter::FilterGenes (const CGenes &Genes, CDat::EFilter eFilter) [inline, virtual]
Remove values from the dataset based on the given gene set and filter type.
Genes | Gene set used to filter the dataset.
eFilter | Way in which to use the given genes to remove values.
Remove values and genes (by removing all incident edges) from the dataset based on one of several algorithms. For details, see CDat::EFilter.
Implements Sleipnir::IDataset.
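Removing a gene "by removing all incident edges" can be pictured on a plain symmetric matrix. This is a sketch under assumptions: `Matrix`, `remove_gene`, and `count_edges` are hypothetical helpers, and NaN is used here as the missing-value marker.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<float>>;

// Hypothetical helper: "remove" gene g from a square pairwise matrix by
// marking every edge that touches it as missing (NaN).
void remove_gene(Matrix& m, std::size_t g) {
    assert(g < m.size());
    const float nan = std::numeric_limits<float>::quiet_NaN();
    for (std::size_t i = 0; i < m.size(); ++i) {
        m[g][i] = nan;
        m[i][g] = nan;
    }
}

// Count the pairs (y < x) that still carry a value.
std::size_t count_edges(const Matrix& m) {
    std::size_t n = 0;
    for (std::size_t y = 0; y < m.size(); ++y)
        for (std::size_t x = y + 1; x < m.size(); ++x)
            if (!std::isnan(m[y][x]))
                ++n;
    return n;
}
```

Unlike an overlay, this kind of removal changes the stored values, so the original data must be reloaded to undo it.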
size_t Sleipnir::CDataFilter::GetBins (size_t iNode) const [inline, virtual]
Return the number of discrete values in the requested experimental node; -1 if the node is hidden or continuous.
iNode | Experimental node for which the number of bins should be returned.
Implements Sleipnir::IDataset.
float Sleipnir::CDataFilter::GetContinuous (size_t iY, size_t iX, size_t iNode) const [inline, virtual]
Return the continuous value at the requested position.
iY | Data row.
iX | Data column.
iNode | Experimental node from which to retrieve the requested pair's value.
Implements Sleipnir::IDataset.
Definition at line 773 of file dataset.h.
References Sleipnir::CMeta::GetNaN() and IsExample().
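The two references suggest that a filtered position reads as NaN; that is an inference from the reference list, not something this page states outright. Under that assumption, a caller would test for missing values as sketched below. All function names here are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical stand-ins for the mask check and the underlying read.
bool is_example(std::size_t y, std::size_t x) { return y != x; }  // hides the diagonal
float stored_value(std::size_t y, std::size_t x) {
    return 0.5f * static_cast<float>(y + x);
}

// Assumed behaviour: a hidden position reads as NaN.
float get_continuous(std::size_t y, std::size_t x) {
    if (!is_example(y, x))
        return std::numeric_limits<float>::quiet_NaN();
    return stored_value(y, x);
}

// NaN never compares equal to anything, itself included, so use std::isnan.
bool is_missing(float v) { return std::isnan(v); }

// Small self-check of the sketch.
void check_masked_read() {
    assert(is_missing(get_continuous(2, 2)));
    assert(!is_missing(get_continuous(0, 1)));
}
```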
size_t Sleipnir::CDataFilter::GetDiscrete (size_t iY, size_t iX, size_t iNode) const [inline, virtual]
Return the discretized value at the requested position.
iY | Data row.
iX | Data column.
iNode | Experimental node from which to retrieve the requested pair's value.
Implements Sleipnir::IDataset.
Definition at line 769 of file dataset.h.
References IsExample().
size_t Sleipnir::CDataFilter::GetExperiments () const [inline, virtual]
Return the number of experimental nodes in the dataset.
Implements Sleipnir::IDataset.
size_t Sleipnir::CDataFilter::GetGene (const std::string &strGene) const [inline, virtual]
Return the index of the given gene name, or -1 if it is not included in the dataset.
strGene | Gene name to retrieve.
Implements Sleipnir::IDataset.
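Because the return type is size_t, which is unsigned, the documented -1 arrives as the largest size_t value, and a test such as `index < 0` can never succeed. A minimal sketch of the pattern follows; `kNotFound`, `find_gene`, and `has_gene` are hypothetical names, not part of Sleipnir.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>
#include <vector>

// size_t is unsigned, so "-1" is stored as the largest size_t value.
const std::size_t kNotFound = static_cast<std::size_t>(-1);
static_assert(static_cast<std::size_t>(-1) ==
                  std::numeric_limits<std::size_t>::max(),
              "-1 converts to the maximum size_t");

// Hypothetical lookup with the same "index or -1" contract.
std::size_t find_gene(const std::vector<std::string>& names,
                      const std::string& gene) {
    for (std::size_t i = 0; i < names.size(); ++i)
        if (names[i] == gene)
            return i;
    return kNotFound;
}

// Compare against the sentinel explicitly.
bool has_gene(const std::vector<std::string>& names, const std::string& gene) {
    return find_gene(names, gene) != kNotFound;
}
```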
const std::string& Sleipnir::CDataFilter::GetGene (size_t iGene) const [inline, virtual]
Returns the gene name at the requested index.
iGene | Index of gene name to return.
Implements Sleipnir::IDataset.
Definition at line 778 of file dataset.h.
References GetGene().
const std::vector<std::string>& Sleipnir::CDataFilter::GetGeneNames () const [inline, virtual]
Return a vector of all gene names in the dataset.
Implements Sleipnir::IDataset.
size_t Sleipnir::CDataFilter::GetGenes () const [inline, virtual]
Returns the number of genes in the dataset.
Implements Sleipnir::IDataset.
Definition at line 761 of file dataset.h.
Referenced by Attach().
bool Sleipnir::CDataFilter::IsExample (size_t iY, size_t iX) const [virtual]
Returns true if some data file can be accessed at the requested position.
iY | Data row.
iX | Data column.
A dataset position is a usable example if at least one data file can be accessed at that position; that is, if some data file provides a non-missing value for that gene pair. Implementations that filter pairs in some manner can also prevent particular positions from being usable examples.
Implements Sleipnir::IDataset.
Definition at line 634 of file dataset.cpp.
References Sleipnir::CDat::EFilterEdge, Sleipnir::CDat::EFilterExclude, Sleipnir::CDat::EFilterInclude, Sleipnir::CDat::EFilterTerm, Sleipnir::CDat::Get(), Sleipnir::CGenes::GetGenes(), and Sleipnir::IDataset::IsExample().
Referenced by GetContinuous() and GetDiscrete().
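The "at least one data file provides a non-missing value" rule can be sketched as a simple any-of test. The layout is an assumption made for illustration: `File` is a hypothetical in-memory matrix per data file, with NaN standing for a missing value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical layout: file[y][x] is the value one data file holds for
// pair (y, x), with NaN meaning "missing".
using File = std::vector<std::vector<float>>;

// A position is a usable example if any file has a non-missing value there.
bool is_example(const std::vector<File>& files, std::size_t y, std::size_t x) {
    for (const File& f : files) {
        assert(y < f.size() && x < f[y].size());
        if (!std::isnan(f[y][x]))
            return true;
    }
    return false;
}
```

A filtering implementation would add its own veto in front of this test, so that a position can be rejected even when a file does hold a value for it.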
bool Sleipnir::CDataFilter::IsHidden (size_t iNode) const [inline, virtual]
Returns true if the requested experimental node is hidden (does not correspond to a data file).
iNode | Experimental node to investigate.
Since a dataset can be constructed either directly on a collection of data files or by tying a model such as a Bayes net to data files, IDataset can determine which model nodes are hidden by testing whether a data file exists for them. If no such file exists, the node is hidden and, for example, can be treated specially during Bayesian learning.
Implements Sleipnir::IDataset.
void Sleipnir::CDataFilter::Remove (size_t iY, size_t iX) [inline, virtual]
Remove all data for the given dataset position.
iY | Data row.
iX | Data column.
Unloads or masks data from all encapsulated files for the requested gene pair.
Implements Sleipnir::IDataset.
void Sleipnir::CDataFilter::Save (std::ostream &ostm, bool fBinary) const [inline, virtual]
Save a dataset to the given stream in binary or tabular (human readable) form.
ostm | Stream into which dataset is saved.
fBinary | If true, save the dataset as a binary file; if false, save it as a text-based tab-delimited file.
Implements Sleipnir::IDataset.
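The binary/text switch on a std::ostream can be sketched as follows. The layout written here is purely illustrative and is not the file format Sleipnir uses; `save_values` is a hypothetical function that only mirrors the shape of the `ostm` and `fBinary` parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical writer; the layout is illustrative only.
void save_values(std::ostream& ostm, const std::vector<float>& values,
                 bool fBinary) {
    if (fBinary) {
        // Raw bytes: element count, then the floats themselves.
        const std::size_t n = values.size();
        ostm.write(reinterpret_cast<const char*>(&n), sizeof n);
        ostm.write(reinterpret_cast<const char*>(values.data()),
                   static_cast<std::streamsize>(n * sizeof(float)));
    } else {
        // One tab-delimited line of text.
        for (std::size_t i = 0; i < values.size(); ++i)
            ostm << (i ? "\t" : "") << values[i];
        ostm << '\n';
    }
}
```

When the target is a file rather than a string stream, the binary variant is best paired with a stream opened using std::ios::binary, so that no newline translation alters the bytes.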