Sleipnir
Implements IBayesNet for networks using custom node types.
#include <bayesnet.h>
Public Member Functions | |
bool | Open (const char *szFile) |
Load a Bayes net from a file. | |
bool | Save (const char *szFile) const |
Save a Bayes net to a file. | |
bool | Learn (const IDataset *pDataset, size_t iIterations, bool fZero=false, bool fELR=false) |
Learn conditional probabilities from data using Expectation Maximization, naive Bayesian learning, or Extended Logistic Regression. | |
bool | Evaluate (const std::vector< unsigned char > &vecbDatum, std::vector< float > &vecdResults, bool fZero=false, size_t iNode=0, bool fIgnoreMissing=false) const |
Perform Bayesian inference to obtain probabilities given values for each other Bayes net node. | |
void | GetNodes (std::vector< std::string > &vecstrNodes) const |
Retrieve the string IDs of all nodes in the Bayes net. | |
unsigned char | GetValues (size_t iNode) const |
Returns the number of different values taken by the requested node. | |
bool | IsContinuous () const |
Returns true if any node in the Bayes net is non-discrete (e.g. Gaussian). | |
bool | Evaluate (const IDataset *pDataset, std::vector< std::vector< float > > &vecvecdResults, bool fZero) const |
Perform Bayesian inference to obtain probabilities for each element of a dataset. | |
bool | Evaluate (const IDataset *pDataset, CDat &DatResults, bool fZero) const |
Perform Bayesian inference to obtain probabilities for each element of a dataset. | |
bool | IsContinuous (size_t iNode) const |
Returns true if the requested node is non-discrete (e.g. Gaussian). | |
void | Randomize () |
Randomizes every parameter in the Bayes net. | |
void | Randomize (size_t iNode) |
Randomizes every parameter of the requested node. | |
void | Reverse (size_t iNode) |
Reverses the parameters of the requested node over its possible values. | |
bool | GetCPT (size_t iNode, CDataMatrix &MatCPT) const |
Retrieves the parameters of the requested Bayes net node. | |
bool | Evaluate (const CPCLPair &PCLData, CPCL &PCLResults, bool fZero, int iAlgorithm) const |
Perform Bayesian inference to obtain probabilities over all nodes in the network given some amount of data. |
Implements IBayesNet for networks using custom node types.
CBayesNetFN can be used to construct Bayes nets using arbitrary node types. These are usually only theoretically sound in a naive structure, but in such a case, any node type can be used for which parameters can be estimated from data: discrete, Gaussian, Beta, Exponential, etc. These networks are stored using a SMILE network in a DSL/XDSL file, but the semantics of each node's parameters are dependent on the node type.
Definition at line 149 of file bayesnet.h.
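As an illustration of what "parameters can be estimated from data" means for a non-discrete node, the following is a minimal sketch of a Gaussian node in a naive structure: one mean and standard deviation are fit per parent value, and the density is then used as the node's likelihood. `GaussianParams`, `FitGaussian`, and `GaussianDensity` are hypothetical names introduced here; they are not part of Sleipnir and do not reflect how CBayesNetFN stores its parameters.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical parameter pair for one Gaussian node under one parent value.
struct GaussianParams {
    float mean;
    float stddev;
};

// Maximum likelihood fit: sample mean and (biased) sample standard deviation.
GaussianParams FitGaussian(const std::vector<float>& xs) {
    assert(!xs.empty());
    float mean = 0.0f;
    for (float x : xs) mean += x;
    mean /= static_cast<float>(xs.size());
    float var = 0.0f;
    for (float x : xs) var += (x - mean) * (x - mean);
    var /= static_cast<float>(xs.size());
    return {mean, std::sqrt(var)};
}

// Gaussian density at x, usable as P(x | parent value) in a naive Bayes product.
float GaussianDensity(float x, const GaussianParams& p) {
    assert(p.stddev > 0.0f);
    const float kPi = 3.14159265358979f;
    const float z = (x - p.mean) / p.stddev;
    return std::exp(-0.5f * z * z) / (p.stddev * std::sqrt(2.0f * kPi));
}
```

In a naive structure, one such parameter pair would be fit per value of the parent (class) node, using only the examples carrying that class label.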
bool Sleipnir::CBayesNetFN::Evaluate (const std::vector< unsigned char > &vecbDatum, std::vector< float > &vecdResults, bool fZero=false, size_t iNode=0, bool fIgnoreMissing=false) const [virtual]
Perform Bayesian inference to obtain probabilities given values for each other Bayes net node.
vecbDatum | One-indexed values for each node in the Bayes net (zero indicates missing data). |
vecdResults | Inferred probabilities for each possible value of the requested node. |
fZero | If true, assume all missing values are zero (i.e. the first bin). |
iNode | The node for which output probabilities are inferred. |
fIgnoreMissing | If true, do not default missing values to zero or any other value. |
This Evaluate assumes a discrete Bayes net and, given a vector of evidence values for each node, infers the probability distribution over possible values of the requested node. Note that vecbDatum contains one plus the discrete bin value of each node, and a value of zero indicates missing data for the corresponding node.
Implements Sleipnir::IBayesNet.
Definition at line 604 of file bayesnetfn.cpp.
References Sleipnir::CMeta::GetNaN().
Referenced by Evaluate().
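The one-indexed evidence encoding described for the vector form of Evaluate can be sketched as follows. This is a simplified stand-in, not the Sleipnir implementation: it assumes a naive structure in which the class node is the single parent of every evidence node, and the `NaiveNet` type, the `InferClass` function, and the `cpt[value][class]` layout are all hypothetical. Unlike vecbDatum, the evidence vector here holds entries for the child nodes only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical naive Bayes net: a prior over the class node plus one CPT per
// child node, indexed as children[node][value][class] = P(value | class).
struct NaiveNet {
    std::vector<float> prior;
    std::vector<std::vector<std::vector<float>>> children;
};

// Infers P(class | evidence). datum[i] holds one plus the discrete bin of
// child i; a value of 0 marks child i as missing.
std::vector<float> InferClass(const NaiveNet& net,
                              const std::vector<unsigned char>& datum,
                              bool zeroMissing = false) {
    assert(datum.size() == net.children.size());
    std::vector<float> post(net.prior);
    for (std::size_t i = 0; i < datum.size(); ++i) {
        std::size_t bin;
        if (datum[i] != 0)
            bin = datum[i] - 1u;  // undo the one-indexing
        else if (zeroMissing)
            bin = 0;              // treat missing data as the first bin
        else
            continue;             // a missing child drops out of the product
        assert(bin < net.children[i].size());
        for (std::size_t c = 0; c < post.size(); ++c)
            post[c] *= net.children[i][bin][c];
    }
    float sum = 0.0f;
    for (float p : post) sum += p;
    assert(sum > 0.0f);
    for (float& p : post) p /= sum;  // normalize to a distribution
    return post;
}
```

Skipping a missing child is exact only because of the naive structure assumed here: summing an unobserved child's CPT over its values gives one for every class value.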
bool Sleipnir::CBayesNetFN::Evaluate (const IDataset *pDataset, std::vector< std::vector< float > > &vecvecdResults, bool fZero) const [inline, virtual]
Perform Bayesian inference to obtain probabilities for each element of a dataset.
pDataset | Dataset to be used as input for inference. |
vecvecdResults | Vector of output probabilities; each element of the outer vector represents the result for one gene pair, and each element of the inner vectors represents the probability for one possible value from the output node (i.e. the answer). |
fZero | If true, assume all missing values are zero (i.e. the first bin). |
This is the inverse of the corresponding IBayesNet::Learn method: given an IDataset, it ignores the first (gold standard) dataset and infers the corresponding output probabilities for each gene pair for which data is available. For each gene pair within the IDataset for which IDataset::IsExample is true, vecvecdResults will contain one vector. This vector will contain inferred probabilities for each possible value of the output node, generally the probability of functional unrelatedness (i.e. one minus the probability of functional relationship).
Implements Sleipnir::IBayesNet.
Definition at line 160 of file bayesnet.h.
References Evaluate().
bool Sleipnir::CBayesNetFN::Evaluate (const IDataset *pDataset, CDat &DatResults, bool fZero) const [inline, virtual]
Perform Bayesian inference to obtain probabilities for each element of a dataset.
pDataset | Dataset to be used as input for inference. |
DatResults | Output CDat that receives, for each gene pair, the inferred probability of functional relationship. |
fZero | If true, assume all missing values are zero (i.e. the first bin). |
This is the inverse of the corresponding IBayesNet::Learn method: given an IDataset, it ignores the first (gold standard) dataset and infers the corresponding output probability for each gene pair for which data is available. For each gene pair within the IDataset for which IDataset::IsExample is true, the probability of functional relationship (i.e. of the largest possible value of the output node) will be placed in the given CDat.
Implements Sleipnir::IBayesNet.
Definition at line 165 of file bayesnet.h.
References Evaluate().
bool Sleipnir::CBayesNetFN::Evaluate (const CPCLPair &PCLData, CPCL &PCLResults, bool fZero, int iAlgorithm) const [inline, virtual]
Perform Bayesian inference to obtain probabilities over all nodes in the network given some amount of data.
PCLData | Input data; each column (experiment) is mapped by label to a node in the Bayes net, and PCL entries correspond to observed (or missing) data values. |
PCLResults | Output probabilities; each column (experiment) is mapped to a node:value pair from the Bayes net, and PCL entries correspond to the probability of that value in that node. |
fZero | If true, assume all missing values are zero (i.e. the first bin). |
iAlgorithm | Implementation-specific ID of the Bayesian inference algorithm to use. |
This version of Evaluate will perform one Bayesian inference for each row (gene) of the given PCLData. Here, each PCL "experiment" column corresponds to a node in the Bayes net as identified by the experiment labels in the PCL and the IDs of the Bayes net nodes. Values are read from the given PCL and (if present; missing values are allowed) discretized into Bayes net value bins using the accompanying quantization information. For each input row, all given non-missing values are observed for the appropriate Bayes net nodes, and Bayesian inference is used to provide probabilities for each remaining, unobserved node value.
Implements Sleipnir::IBayesNet.
Definition at line 191 of file bayesnet.h.
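The discretization step mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the library's quantization code: it assumes the quantization information is a sorted list of upper bin edges, that values above the last edge fall into the last bin, and that NaN marks a missing PCL entry. `QuantizeOneIndexed` is a hypothetical helper that returns the one-indexed encoding (zero for missing) used by the vector form of Evaluate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Maps a continuous value to one plus its bin index, or to 0 if the value is
// missing (NaN). edges[i] is assumed to be the upper bound of bin i.
unsigned char QuantizeOneIndexed(float value, const std::vector<float>& edges) {
    assert(!edges.empty() && edges.size() < 255);
    if (std::isnan(value))
        return 0;  // missing data
    std::size_t bin = 0;
    // Advance until the value fits under an edge; the last bin catches the rest.
    while (bin + 1 < edges.size() && value > edges[bin])
        ++bin;
    return static_cast<unsigned char>(bin + 1);
}
```

With edges 0.5, 1.5, and 2.5, a value of 1.0 lands in the second bin and is encoded as 2, while any value above 2.5 is clamped into the third bin.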
bool Sleipnir::CBayesNetFN::GetCPT (size_t iNode, CDataMatrix &MatCPT) const [inline, virtual]
Retrieves the parameters of the requested Bayes net node.
iNode | Index of node for which parameters should be retrieved. |
MatCPT | Parameters of the requested node in tabular form; the columns of the matrix represent parental values, the rows node values. |
Retrieves node parameters in an implementation-specific manner, often only allowing nodes with at most one parent. For discrete nodes, matrix entries are generally conditional probabilities. For continuous nodes, matrix entries may represent distribution parameters such as Gaussian mean and standard deviation.
Implements Sleipnir::IBayesNet.
Definition at line 187 of file bayesnet.h.
void Sleipnir::CBayesNetFN::GetNodes (std::vector< std::string > &vecstrNodes) const [virtual]
Retrieve the string IDs of all nodes in the Bayes net.
vecstrNodes | Output containing the IDs of all nodes in the Bayes net. |
Implements Sleipnir::IBayesNet.
Definition at line 641 of file bayesnetfn.cpp.
unsigned char Sleipnir::CBayesNetFN::GetValues (size_t iNode) const [virtual]
Returns the number of different values taken by the requested node.
iNode | Bayes net node for which values should be returned. |
Implements Sleipnir::IBayesNet.
Definition at line 647 of file bayesnetfn.cpp.
bool Sleipnir::CBayesNetFN::IsContinuous () const [virtual]
Returns true if any node in the Bayes net is non-discrete (e.g. Gaussian).
Implements Sleipnir::IBayesNet.
Definition at line 653 of file bayesnetfn.cpp.
bool Sleipnir::CBayesNetFN::IsContinuous (size_t iNode) const [inline, virtual]
Returns true if the requested node is non-discrete (e.g. Gaussian).
iNode | Node to be inspected. |
Implements Sleipnir::IBayesNet.
Definition at line 169 of file bayesnet.h.
bool Sleipnir::CBayesNetFN::Learn (const IDataset *pDataset, size_t iIterations, bool fZero=false, bool fELR=false) [virtual]
Learn conditional probabilities from data using Expectation Maximization, naive Bayesian learning, or Extended Logistic Regression.
pDataset | Dataset to be used for learning. |
iIterations | Maximum number of iterations for EM or ELR. |
fZero | If true, assume all missing values are zero (i.e. the first bin). |
fELR | If true, use ELR to learn network parameters. |
Using the given IDataset, learn parameters for the underlying Bayes network. If requested, learning is performed discriminatively using Extended Logistic Regression due to Greiner, Zhou, et al. Otherwise, maximum likelihood estimates are used for naive structures, and Expectation Maximization is used for other network structures.
Implements Sleipnir::IBayesNet.
Definition at line 481 of file bayesnetfn.cpp.
References Sleipnir::IDataset::GetDiscrete(), Sleipnir::IDataset::GetGenes(), and Sleipnir::IDataset::IsExample().
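For the naive-structure case, maximum likelihood estimation reduces to counting. The sketch below shows that idea for a single discrete child node; it is not the code in bayesnetfn.cpp, and the `Example` type, the `EstimateCpt` function, and the `cpt[value][label]` layout are hypothetical. It applies no smoothing or pseudocounts, and it assumes missing values have already been filtered out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical training example: the class label and one child node's
// discretized value, both zero-indexed.
struct Example {
    std::size_t label;
    std::size_t value;
};

// Returns cpt[value][label] = P(value | label), estimated as the fraction of
// examples with that label whose child node takes that value.
std::vector<std::vector<float>> EstimateCpt(const std::vector<Example>& data,
                                            std::size_t values,
                                            std::size_t labels) {
    std::vector<std::vector<float>> cpt(values, std::vector<float>(labels, 0.0f));
    std::vector<float> totals(labels, 0.0f);
    for (const Example& e : data) {
        assert(e.value < values && e.label < labels);
        cpt[e.value][e.label] += 1.0f;
        totals[e.label] += 1.0f;
    }
    for (std::vector<float>& row : cpt)
        for (std::size_t c = 0; c < labels; ++c)
            if (totals[c] > 0.0f)
                row[c] /= totals[c];  // labels never seen keep all-zero columns
    return cpt;
}
```

Because each column is normalized independently, no iteration is needed in this case; an iteration limit such as iIterations only matters for iterative methods like EM or ELR.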
bool Sleipnir::CBayesNetFN::Open (const char *szFile) [virtual]
Load a Bayes net from a file.
szFile | Path to file. |
Implements Sleipnir::IBayesNet.
Definition at line 454 of file bayesnetfn.cpp.
void Sleipnir::CBayesNetFN::Randomize () [inline, virtual]
Randomizes every parameter in the Bayes net.
Implements Sleipnir::IBayesNet.
Definition at line 173 of file bayesnet.h.
void Sleipnir::CBayesNetFN::Randomize (size_t iNode) [inline, virtual]
Randomizes every parameter of the requested node.
iNode | Index of node to be randomized. |
Implements Sleipnir::IBayesNet.
Definition at line 179 of file bayesnet.h.
void Sleipnir::CBayesNetFN::Reverse (size_t iNode) [inline, virtual]
Reverses the parameters of the requested node over its possible values.
iNode | Index of node to be reversed. |
"Vertically" reverses the parameters of the requested node. That is, if the requested node can take values 0 through 3, then for each setting of the parents' values, Pnew(0|parents) = Pold(3|parents), Pnew(1|parents) = Pold(2|parents), Pnew(2|parents) = Pold(1|parents), and Pnew(3|parents) = Pold(0|parents).
Implements Sleipnir::IBayesNet.
Definition at line 183 of file bayesnet.h.
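The "vertical" reversal performed by Reverse can be pictured with a small sketch. It assumes a conditional probability table stored with one row per node value and one column per setting of the parents, matching the layout described for GetCPT; the `Cpt` alias and `ReverseVertically` function are hypothetical and only illustrate the index mapping, not how CBayesNetFN implements it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPT layout: cpt[value][parents] = P(value | parents).
using Cpt = std::vector<std::vector<float>>;

// Reverses the order of the rows, so that for every parent setting
// Pnew(k | parents) = Pold(n - 1 - k | parents), where n is the row count.
void ReverseVertically(Cpt& cpt) {
    for (std::size_t r = 1; r < cpt.size(); ++r)
        assert(cpt[r].size() == cpt[0].size());  // table must be rectangular
    std::reverse(cpt.begin(), cpt.end());
}
```

Each column is only permuted, so every column still sums to one afterwards and the table remains a valid conditional distribution.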
bool Sleipnir::CBayesNetFN::Save (const char *szFile) const [virtual]
Save a Bayes net to a file.
szFile | Path to file. |
Implements Sleipnir::IBayesNet.
Definition at line 469 of file bayesnetfn.cpp.