Sleipnir
Sleipnir::IBayesNet Class Reference

Encapsulates a Bayesian network with arbitrary structure and node types.

#include <bayesnetint.h>

Inherited by Sleipnir::CBayesNetFN and Sleipnir::CBayesNetSmile.

Public Member Functions

virtual bool Open (const char *szFile)=0
 Load a Bayes net from a file.
virtual bool Save (const char *szFile) const =0
 Save a Bayes net to a file.
virtual bool Learn (const IDataset *pDataset, size_t iIterations, bool fZero=false, bool fELR=false)=0
 Learn conditional probabilities from data using Expectation Maximization, naive Bayesian learning, or Extended Logistic Regression.
virtual bool Evaluate (const IDataset *pDataset, std::vector< std::vector< float > > &vecvecdResults, bool fZero=false) const =0
 Perform Bayesian inference to obtain probabilities for each element of a dataset.
virtual bool Evaluate (const IDataset *pDataset, CDat &DatResults, bool fZero=false) const =0
 Perform Bayesian inference to obtain probabilities for each element of a dataset.
virtual bool Evaluate (const std::vector< unsigned char > &vecbDatum, std::vector< float > &vecdResults, bool fZero=false, size_t iNode=0, bool fIgnoreMissing=false) const =0
 Perform Bayesian inference to obtain probabilities given values for each other Bayes net node.
virtual bool Evaluate (const CPCLPair &PCLData, CPCL &PCLResults, bool fZero=false, int iAlgorithm=-1) const =0
 Perform Bayesian inference to obtain probabilities over all nodes in the network given some amount of data.
virtual void GetNodes (std::vector< std::string > &vecstrNodes) const =0
 Retrieve the string IDs of all nodes in the Bayes net.
virtual unsigned char GetValues (size_t iNode) const =0
 Returns the number of different values taken by the requested node.
virtual bool IsContinuous () const =0
 Returns true if any node in the Bayes net is non-discrete (e.g. Gaussian).
virtual bool IsContinuous (size_t iNode) const =0
 Returns true if the requested node is non-discrete (e.g. Gaussian).
virtual void Randomize ()=0
 Randomizes every parameter in the Bayes net.
virtual void Randomize (size_t iNode)=0
 Randomizes every parameter of the requested node.
virtual void Reverse (size_t iNode)=0
 Reverses the parameters of the requested node over its possible values.
virtual bool GetCPT (size_t iNode, CDataMatrix &MatCPT) const =0
 Retrieves the parameters of the requested Bayes net node.

Detailed Description

Encapsulates a Bayesian network with arbitrary structure and node types.

IBayesNet provides an interface for Bayesian graphical models. These can have an arbitrary graph structure, and implementations of the interface can provide arbitrary node types. Inference and parameter learning functions are exposed that operate directly on Sleipnir datatypes such as IDataset, CDat, and CPCL. The only strict requirement of nodes is that they provide unique string labels and can expose their parameters by way of a CDataMatrix, although the semantics of those parameters are not constrained.

Definition at line 45 of file bayesnetint.h.


Member Function Documentation

virtual bool Sleipnir::IBayesNet::Evaluate ( const IDataset *  pDataset,
std::vector< std::vector< float > > &  vecvecdResults,
bool  fZero = false 
) const [pure virtual]

Perform Bayesian inference to obtain probabilities for each element of a dataset.

Parameters:
pDataset  Dataset to be used as input for inference.
vecvecdResults  Vector of output probabilities; each element of the outer vector represents the result for one gene pair, and each element of the inner vectors represents the probability for one possible value from the output node (i.e. the answer).
fZero  If true, assume all missing values are zero (i.e. the first bin).
Returns:
True if evaluation was successful.

The inverse of the corresponding IBayesNet::Learn method; given an IDataset, ignore the first (gold standard) dataset and infer the corresponding output probabilities for each other gene pair for which data is available. For each gene pair within the IDataset for which IDataset::IsExample is true, vecvecdResults will contain one vector. This vector will contain inferred probabilities for each possible value of the output node, generally the probability of functional unrelatedness (i.e. one minus the probability of functional relationship).

Remarks:
The order of datasets in the given IDataset must correspond to the order of nodes within the Bayes network, and the first dataset (index 0) is assumed to be a gold standard (and is thus ignored). Only data for which IDataset::IsExample is true will be used, which usually means that at least one other dataset must have a value. If the output node can take N values, each output vector will contain only the first N-1 probabilities, since the Nth can be calculated to sum to one.
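
Because each output vector omits the last probability, callers that need the full distribution have to reconstruct it. The following is a minimal sketch of that step, not Sleipnir code; the helper name CompleteDistribution is hypothetical, and it assumes the vector holds exactly the first N-1 probabilities for one gene pair.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of Sleipnir): given the first N-1
// probabilities of an N-valued output node, append the Nth so that the
// full distribution sums to one.
std::vector<float> CompleteDistribution(const std::vector<float>& vecdPartial) {
    const float dSum =
        std::accumulate(vecdPartial.begin(), vecdPartial.end(), 0.0f);
    // A partial distribution can never exceed one (allowing for rounding).
    assert(dSum <= 1.0f + 1e-5f);
    std::vector<float> vecdFull(vecdPartial);
    vecdFull.push_back(1.0f - dSum);
    return vecdFull;
}
```

For a binary output node, each inner vector of vecvecdResults would hold a single probability, and the helper returns the two-element distribution.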

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Evaluate ( const IDataset *  pDataset,
CDat &  DatResults,
bool  fZero = false 
) const [pure virtual]

Perform Bayesian inference to obtain probabilities for each element of a dataset.

Parameters:
pDataset  Dataset to be used as input for inference.
DatResults  CDat in which the inferred probability of functional relationship is placed for each gene pair.
fZero  If true, assume all missing values are zero (i.e. the first bin).
Returns:
True if evaluation was successful.

The inverse of the corresponding IBayesNet::Learn method; given an IDataset, ignore the first (gold standard) dataset and infer the corresponding output probability for each other gene pair for which data is available. For each gene pair within the IDataset for which IDataset::IsExample is true, the probability of functional relationship (i.e. the largest possible value of the output node) will be placed in the given CDat.

Remarks:
The order of datasets in the given IDataset must correspond to the order of nodes within the Bayes network, and the first dataset (index 0) is assumed to be a gold standard (and is thus ignored). Only data for which IDataset::IsExample is true will be used, which usually means that at least one other dataset must have a value.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Evaluate ( const std::vector< unsigned char > &  vecbDatum,
std::vector< float > &  vecdResults,
bool  fZero = false,
size_t  iNode = 0,
bool  fIgnoreMissing = false 
) const [pure virtual]

Perform Bayesian inference to obtain probabilities given values for each other Bayes net node.

Parameters:
vecbDatum  One-indexed values for each node in the Bayes net (zero indicates missing data).
vecdResults  Inferred probabilities for each possible value of the requested node.
fZero  If true, assume all missing values are zero (i.e. the first bin).
iNode  The node for which output probabilities are inferred.
fIgnoreMissing  If true, do not default missing values to zero or any other value.
Returns:
True if evaluation was successful.

This Evaluate assumes a discrete Bayes net and, given a vector of evidence values for each node, infers the probability distribution over possible values of the requested node. Note that vecbDatum contains one plus the discrete bin value of each node, and a value of zero indicates missing data for the corresponding node.

Remarks:
vecbDatum should contain one plus the discrete bin value of each node, and a value of zero indicates missing data for the corresponding node. If the requested output node can take N values, the output vector will contain only the first N-1 probabilities, since the Nth can be calculated to sum to one.
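
The one-plus-bin encoding is easy to get wrong, so a small illustration may help. This is a sketch under stated assumptions, not Sleipnir code: the helper EncodeDatum is hypothetical, and the convention of marking missing input with -1 is chosen only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Sleipnir): convert zero-based bin
// indices, with -1 marking missing data, into the encoding described
// above: one plus the bin value, and zero for missing data.
std::vector<unsigned char> EncodeDatum(const std::vector<int>& veciBins) {
    std::vector<unsigned char> vecbDatum;
    vecbDatum.reserve(veciBins.size());
    for (int iBin : veciBins) {
        // The encoded value must fit in an unsigned char.
        assert(iBin >= -1 && iBin < 255);
        vecbDatum.push_back(static_cast<unsigned char>(iBin + 1));
    }
    return vecbDatum;
}
```

With this convention, bins {0, missing, 2} become the datum {1, 0, 3}.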

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Evaluate ( const CPCLPair &  PCLData,
CPCL &  PCLResults,
bool  fZero = false,
int  iAlgorithm = -1 
) const [pure virtual]

Perform Bayesian inference to obtain probabilities over all nodes in the network given some amount of data.

Parameters:
PCLData  Input data; each column (experiment) is mapped by label to a node in the Bayes net, and PCL entries correspond to observed (or missing) data values.
PCLResults  Output probabilities; each column (experiment) is mapped to a node:value pair from the Bayes net, and PCL entries correspond to the probability of that value in that node.
fZero  If true, assume all missing values are zero (i.e. the first bin).
iAlgorithm  Implementation-specific ID of the Bayesian inference algorithm to use.
Returns:
True if evaluation was successful.

This version of Evaluate will perform one Bayesian inference for each row (gene) of the given PCLData. Here, each PCL "experiment" column corresponds to a node in the Bayes net as identified by the experiment labels in the PCL and the IDs of the Bayes net nodes. Values are read from the given PCL and (if present; missing values are allowed) discretized into Bayes net value bins using the accompanying quantization information. For each input row, all given non-missing values are observed for the appropriate Bayes net nodes, and Bayesian inference is used to provide probabilities for each remaining, unobserved node value.

Remarks:
PCLResults must be initialized with the correct number of experimental columns before calling Evaluate; that is, the total number of node values in the Bayes net. For example, if the Bayes net has three nodes A, B, and C, node A can take two values 0 and 1, and nodes B and C can take values 0, 1, and 2, then PCLResults must have 8 experimental columns corresponding to A:0, A:1, B:0, B:1, B:2, C:0, C:1, and C:2. Columns of PCLData are mapped to Bayes net nodes by experiment and node labels; experiment labels not corresponding to any Bayes net node ID are ignored, and Bayes net nodes with no corresponding experiment are assumed to be unobserved (hidden). Only the genes in PCLResults are used, and they need not be in the same order as in PCLData.
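
The column count in the example above can be derived mechanically from the node names and their value counts, such as those reported by GetNodes and GetValues. The sketch below enumerates one label per node:value pair; the helper MakeResultLabels is hypothetical, and the "A:0" label format simply mirrors the notation used in this remark rather than a format confirmed for CPCL.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Sleipnir): build one label per
// node:value pair, in node order, e.g. A:0, A:1, B:0, ...
std::vector<std::string> MakeResultLabels(
        const std::vector<std::string>& vecstrNodes,
        const std::vector<unsigned char>& vecbValues) {
    // One value count is expected per node.
    assert(vecstrNodes.size() == vecbValues.size());
    std::vector<std::string> vecstrLabels;
    for (std::size_t i = 0; i < vecstrNodes.size(); ++i)
        for (unsigned int j = 0; j < vecbValues[i]; ++j)
            vecstrLabels.push_back(vecstrNodes[i] + ":" + std::to_string(j));
    return vecstrLabels;
}
```

For nodes A, B, and C with 2, 3, and 3 values, the helper yields the eight labels listed above.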

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::GetCPT ( size_t  iNode,
CDataMatrix &  MatCPT
) const [pure virtual]

Retrieves the parameters of the requested Bayes net node.

Parameters:
iNode  Index of node for which parameters should be retrieved.
MatCPT  Parameters of the requested node in tabular form; the columns of the matrix represent parental values, the rows node values.
Returns:
True if parameter retrieval succeeded, false if it failed or the requested node has more than one parent.

Retrieves node parameters in an implementation-specific manner, often only allowing nodes with at most one parent. For discrete nodes, matrix entries are generally conditional probabilities. For continuous nodes, matrix entries may represent distribution parameters such as Gaussian mean and standard deviation.

Remarks:
Only allowed for nodes with at most one parent; nodes with more parents are supported by some implementations, but their parameters can't be retrieved by this function.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual void Sleipnir::IBayesNet::GetNodes ( std::vector< std::string > &  vecstrNodes) const [pure virtual]

Retrieve the string IDs of all nodes in the Bayes net.

Parameters:
vecstrNodes  Output containing the IDs of all nodes in the Bayes net.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

Referenced by Sleipnir::CDataSubset::Initialize(), and Sleipnir::CDatasetCompact::Open().

virtual unsigned char Sleipnir::IBayesNet::GetValues ( size_t  iNode) const [pure virtual]

Returns the number of different values taken by the requested node.

Parameters:
iNode  Bayes net node for which values should be returned.
Returns:
Number of different values taken by the requested node.
Remarks:
Not applicable for continuous nodes.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::IsContinuous ( ) const [pure virtual]

Returns true if any node in the Bayes net is non-discrete (e.g. Gaussian).

Returns:
True if any node in the Bayes net is continuous.

Implemented in Sleipnir::CBayesNetFN.

Referenced by Sleipnir::CDataSubset::Initialize(), Sleipnir::CDataset::Open(), and Sleipnir::CDatasetCompact::Open().

virtual bool Sleipnir::IBayesNet::IsContinuous ( size_t  iNode) const [pure virtual]

Returns true if the requested node is non-discrete (e.g. Gaussian).

Parameters:
iNode  Node to be inspected.
Returns:
True if the requested node is continuous.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Learn ( const IDataset *  pDataset,
size_t  iIterations,
bool  fZero = false,
bool  fELR = false 
) [pure virtual]

Learn conditional probabilities from data using Expectation Maximization, naive Bayesian learning, or Extended Logistic Regression.

Parameters:
pDataset  Dataset to be used for learning.
iIterations  Maximum number of iterations for EM or ELR.
fZero  If true, assume all missing values are zero (i.e. the first bin).
fELR  If true, use ELR to learn network parameters.
Returns:
True if parameters were learned successfully.

Using the given IDataset, learn parameters for the underlying Bayes network. If requested, learning is performed discriminatively using Extended Logistic Regression due to Greiner, Zhou, et al. Otherwise, maximum likelihood estimates are used for naive structures, and Expectation Maximization is used for other network structures.

Remarks:
The order of datasets in the given IDataset must correspond to the order of nodes within the Bayes network, and the first dataset (index 0) is assumed to be a gold standard. Only data for which IDataset::IsExample is true will be used, which usually means that the first dataset and at least one other dataset must have a value.
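
For the naive case, maximum likelihood estimation reduces to counting. The sketch below illustrates that idea generically and is not Sleipnir's implementation, which may differ (for example in how it handles missing data or empty counts). The helper EstimateCPT is hypothetical, and the uniform fallback for parent values with no examples is an assumption made only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Sleipnir code): estimate P(value | class) by
// counting co-occurrences. Rows index node values and columns index
// parent (class) values, the layout described for IBayesNet::GetCPT.
std::vector<std::vector<float>> EstimateCPT(
        const std::vector<unsigned char>& vecbClass,
        const std::vector<unsigned char>& vecbValue,
        std::size_t iClasses, std::size_t iValues) {
    assert(vecbClass.size() == vecbValue.size());
    std::vector<std::vector<float>> matCPT(
        iValues, std::vector<float>(iClasses, 0.0f));
    std::vector<float> vecdTotals(iClasses, 0.0f);
    // Count how often each value occurs with each class.
    for (std::size_t i = 0; i < vecbClass.size(); ++i) {
        assert(vecbClass[i] < iClasses && vecbValue[i] < iValues);
        matCPT[vecbValue[i]][vecbClass[i]] += 1.0f;
        vecdTotals[vecbClass[i]] += 1.0f;
    }
    // Normalize each column; fall back to uniform if a class was never seen.
    for (std::size_t iValue = 0; iValue < iValues; ++iValue)
        for (std::size_t iClass = 0; iClass < iClasses; ++iClass)
            matCPT[iValue][iClass] = (vecdTotals[iClass] > 0.0f)
                ? matCPT[iValue][iClass] / vecdTotals[iClass]
                : 1.0f / static_cast<float>(iValues);
    return matCPT;
}
```

Here vecbClass plays the role of the gold standard (dataset 0) and vecbValue the role of one other dataset, restricted to examples where both have a value.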

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Open ( const char *  szFile) [pure virtual]

Load a Bayes net from a file.

Parameters:
szFile  Path to file.
Returns:
True if Bayes net was loaded successfully.
Remarks:
Behavior is implementation specific; it is assumed that the network will be completely reinitialized from the given file, although it may be left in an inconsistent state if the return value is false.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual void Sleipnir::IBayesNet::Randomize ( ) [pure virtual]

Randomizes every parameter in the Bayes net.

Remarks:
Parameter values are generated uniformly at random and normalized to represent a valid probability distribution.
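
One way to read this remark is: draw each entry uniformly from [0, 1), then divide by the total. The following sketch shows that reading for a single discrete distribution. It is an illustration only; the helper RandomDistribution is hypothetical, and the random number generator actually used by implementations is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not Sleipnir code): draw iValues numbers uniformly
// from [0, 1) and normalize them into a probability distribution.
std::vector<float> RandomDistribution(std::size_t iValues, std::mt19937& rng) {
    assert(iValues > 0);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> vecdProbs(iValues);
    float dSum = 0.0f;
    for (float& d : vecdProbs) {
        d = dist(rng);
        dSum += d;
    }
    for (float& d : vecdProbs) {
        // If every draw was zero, fall back to a uniform distribution.
        d = (dSum > 0.0f) ? d / dSum : 1.0f / static_cast<float>(iValues);
    }
    return vecdProbs;
}
```

For a conditional probability table, the same step would be repeated once per setting of the parents' values, so that each conditional distribution sums to one.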

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual void Sleipnir::IBayesNet::Randomize ( size_t  iNode) [pure virtual]

Randomizes every parameter of the requested node.

Parameters:
iNode  Index of node to be randomized.
Remarks:
Parameter values are generated uniformly at random and normalized to represent a valid probability distribution.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual void Sleipnir::IBayesNet::Reverse ( size_t  iNode) [pure virtual]

Reverses the parameters of the requested node over its possible values.

Parameters:
iNode  Index of node to be reversed.

"Vertically" reverses the parameters of the requested node. That is, if the requested node can take values 0 through 3, then for each setting of the parents' values, Pnew(0|parents) = Pold(3|parents), Pnew(1|parents) = Pold(2|parents), Pnew(2|parents) = Pold(1|parents), and Pnew(3|parents) = Pold(0|parents).

Remarks:
May be ignored by some implementations, particularly continuously valued nodes.
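
The "vertical" reversal can be pictured on a table whose rows are node values and whose columns are parent settings, the layout described for GetCPT. The sketch below uses a plain nested vector as a stand-in for that table; the Matrix alias and the ReverseRows helper are hypothetical and do not reflect how any implementation stores its parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Stand-in for a conditional probability table: rows are node values,
// columns are settings of the parents' values.
using Matrix = std::vector<std::vector<float>>;

// Hypothetical helper (not Sleipnir code): reverse the row order so that
// Pnew(k | parents) = Pold(N-1-k | parents) for every parent setting.
void ReverseRows(Matrix& matCPT) {
    // Every row must cover the same parent settings.
    for (const auto& vecdRow : matCPT)
        assert(vecdRow.size() == matCPT.front().size());
    std::reverse(matCPT.begin(), matCPT.end());
}
```

Since only whole rows move, each column still sums to one after the reversal.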

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.

virtual bool Sleipnir::IBayesNet::Save ( const char *  szFile) const [pure virtual]

Save a Bayes net to a file.

Parameters:
szFile  Path to file.
Returns:
True if Bayes net was saved successfully.
Remarks:
Behavior is implementation specific; the Bayes net will not be modified, but the contents of the output file may be inconsistent if the return value is false.

Implemented in Sleipnir::CBayesNetFN, and Sleipnir::CBayesNetSmile.


The documentation for this class was generated from the following file: