Sleipnir
|
Manages a single converging regulatory module for CCoalesce. More...
#include <coalescecluster.h>
Public Member Functions | |
bool | Initialize (const CPCL &PCL, CCoalesceCluster &Pot, const std::vector< SCoalesceDataset > &vecsDatasets, std::set< std::pair< size_t, size_t > > &setpriiSeeds, const std::vector< float > &vecdSeed, size_t iPairs, float dPValue, float dProbability, size_t iThreads) |
Randomly initializes a new cluster from the given PCL by selecting a pair of correlated genes and a surrounding seed of additional genes. | |
void | Subtract (CPCL &PCL, const CCoalesceCluster &Pot) const |
Subtract the average expression value for each condition in the cluster from each gene in the cluster. | |
void | Subtract (CCoalesceGeneScores &GeneScores) const |
Subtract the average frequency of each motif in the cluster from the score for that motif in each gene in the cluster. | |
bool | SelectConditions (const CPCL &PCL, const CCoalesceCluster &Pot, size_t iThreads, float dPValue, float dZScore) |
Performs feature selection to include significant expression conditions in a converging cluster. | |
bool | SelectMotifs (const CCoalesceGroupHistograms &HistsCluster, const CCoalesceGroupHistograms &HistsPot, float dPValue, float dZScore, size_t iMaxMotifs, size_t iThreads, const CCoalesceMotifLibrary *pMotifs=NULL) |
Performs feature selection to include significant sequence motifs in a converging cluster. | |
bool | SelectGenes (const CPCL &PCL, const CCoalesceGeneScores &GeneScores, const CCoalesceGroupHistograms &HistsCluster, const CCoalesceGroupHistograms &HistsPot, size_t iMinimum, size_t iThreads, CCoalesceCluster &Pot, float dProbability, const CCoalesceMotifLibrary *pMotifs=NULL) |
Add and remove (im)probable genes to a converging cluster using Bayesian integration based on the currently selected expression conditions and sequence motifs. | |
void | CalculateHistograms (const CCoalesceGeneScores &GeneScores, CCoalesceGroupHistograms &HistogramsCluster, CCoalesceGroupHistograms *pHistogramsPot) const |
Updates the given motif score histograms based on the genes in the current cluster using the provided per-gene motif scores. | |
size_t | Open (const std::string &strPCL, size_t iSkip, const CPCL &PCL, CCoalesceMotifLibrary *pMotifs=NULL) |
Opens a cluster based on the given PCL file and, if present, accompanying motif file. | |
size_t | Open (std::istream &istm, const CPCL &PCL, CCoalesceMotifLibrary *pMotifs=NULL) |
Opens a cluster based on the textual representation (including genes, conditions, and motifs) in the given input stream. | |
bool | Open (const CHierarchy &Hierarchy, const std::vector< CCoalesceCluster > &vecClusters, const std::vector< std::string > &vecstrClusters, float dFraction, float dCutoff, size_t iCutoff, CCoalesceMotifLibrary *pMotifs=NULL) |
Creates a new cluster by merging the one or more clusters in the given hierarchy. | |
bool | Save (const std::string &strDirectory, size_t iID, const CPCL &PCL, const CCoalesceMotifLibrary *pMotifs=NULL) const |
Saves cluster in a pair of PCL and motif text files in the given directory. | |
void | Save (std::ostream &ostm, size_t iID, const CPCL &PCL, const CCoalesceMotifLibrary *pMotifs=NULL, float dCutoffPWMs=0, float dPenaltyGap=0, float dPenaltyMismatch=0, bool fNoRCs=false) const |
Saves a textual representation of the cluster (including genes, conditions, and motifs) to the given output stream. | |
float | GetSimilarity (const CCoalesceCluster &Cluster, size_t iGenes, size_t iDatasets) const |
Calculates a similarity score between the given and current clusters. | |
void | Snapshot (const CCoalesceGeneScores &GeneScores, CCoalesceGroupHistograms &Histograms) |
Records the state of the cluster between convergence iterations. | |
bool | LabelMotifs (const CCoalesceMotifLibrary &Motifs, SMotifMatch::EType eMatchType, float dPenaltyGap, float dPenaltyMismatch, float dPValue) |
Labels all significant motifs in the cluster with any known TF motifs that match below the given threshhold. | |
bool | IsConverged () |
Returns true if the cluster has converged to a previously visited state. | |
bool | IsEmpty () const |
Returns true if the cluster contains no genes or no conditions. | |
void | SetGenes (size_t iGenes) |
Replaces the cluster's genes with the requested number of initial genes. | |
const std::set< size_t > & | GetGenes () const |
Returns the set of genes significant in this cluster. | |
const std::set< size_t > & | GetDatasets () const |
Returns the set of datasets significant in this cluster. | |
const std::set< SMotifMatch > & | GetMotifs () const |
Returns the set of motifs significant in this cluster. | |
bool | IsGene (size_t iGene) const |
Returns true if the given gene is significant in this cluster. | |
bool | IsDataset (size_t iDataset) const |
Returns true if the given dataset is significant in this cluster. | |
void | RemoveGenes (const std::vector< size_t > &veciGenes) |
Removes the given genes from the cluster. |
Manages a single converging regulatory module for CCoalesce.
A COALESCE cluster represents a regulatory module consisting of genes, conditions where they're coexpressed, and putative regulatory motifs. The cluster object is responsible for both the data necessary to represent such a module (read and written using Open and Save) and the methods used to generate the module (primarily Initialize, SelectConditions, SelectMotifs, and SelecGenes). A variety of optimizations make the algorithm both more efficient and more complex, including discretization of various scores and change tracking between iterations.
Definition at line 48 of file coalescecluster.h.
void Sleipnir::CCoalesceCluster::CalculateHistograms | ( | const CCoalesceGeneScores & | GeneScores, |
CCoalesceGroupHistograms & | HistogramsCluster, | ||
CCoalesceGroupHistograms * | pHistogramsPot | ||
) | const |
Updates the given motif score histograms based on the genes in the current cluster using the provided per-gene motif scores.
GeneScores | Per-gene motif scores from which score histograms are calculated. |
HistogramsCluster | Motif score histograms based on genes in the current cluster. |
pHistogramsPot | If non-null, motif score histograms based on genes not in the current cluster. |
COALESCE finds significant motifs for each cluster by determining which motifs have statistically different distributions in the cluster versus the genomic background. To do this, one histogram of per-gene frequencies is constructed per motif; these histograms are in turn based on the frequencies with which each motif appears in each gene. For example, suppose we have three genes G1
through G3
and three motifs M1
through M3
. Based each gene's sequence, we determine that the motifs have the following frequencies:
Motif G1 G2 G3 M1 1 1 2 M2 2 0 2 M3 1 2 0
If only genes G1
and G2
are in the cluster, we build a frequency histogram as follows:
Motif 0 1 2 M1 0 2 0 M2 1 0 1 M3 0 1 1
That is, in the cluster, there are zero genes within which M1
occurs zero times, two in which it occurs once, zero in which it occurs twice, and so forth. Each row (i.e. each motif's total histogram) must sum to the number of genes in the cluster, and the resulting inverse histograms for genes not in the cluster (i.e. G3
) is:
Motif 0 1 2 M1 0 0 1 M2 0 0 1 M3 1 0 0
In COALESCE, additional complexity arises since motif frequencies are continuous (and must thus be discretized in the histograms) and are calculated on a per-subsequence-type basis (giving rise to multiple histograms per cluster).
Definition at line 266 of file coalescecluster.cpp.
References GetGenes().
Referenced by Sleipnir::CCoalesce::Cluster().
const std::set<size_t>& Sleipnir::CCoalesceCluster::GetDatasets | ( | ) | const [inline] |
Returns the set of datasets significant in this cluster.
Definition at line 143 of file coalescecluster.h.
Referenced by Sleipnir::CCoalesce::Cluster().
const std::set<size_t>& Sleipnir::CCoalesceCluster::GetGenes | ( | ) | const [inline] |
Returns the set of genes significant in this cluster.
Reimplemented from Sleipnir::CCoalesceClusterImpl.
Definition at line 132 of file coalescecluster.h.
Referenced by CalculateHistograms(), Sleipnir::CCoalesce::Cluster(), GetSimilarity(), Save(), SelectConditions(), SelectGenes(), Snapshot(), and Subtract().
const std::set<SMotifMatch>& Sleipnir::CCoalesceCluster::GetMotifs | ( | ) | const [inline] |
Returns the set of motifs significant in this cluster.
Definition at line 154 of file coalescecluster.h.
Referenced by Sleipnir::CCoalesce::Cluster(), and Save().
float Sleipnir::CCoalesceCluster::GetSimilarity | ( | const CCoalesceCluster & | Cluster, |
size_t | iGenes, | ||
size_t | iDatasets | ||
) | const |
Calculates a similarity score between the given and current clusters.
Cluster | Cluster to be compared to the current cluster. |
iGenes | Total number of available genes. |
iDatasets | Total number of available datasets. |
Definition at line 1517 of file coalescecluster.cpp.
References GetGenes(), and IsGene().
bool Sleipnir::CCoalesceCluster::Initialize | ( | const CPCL & | PCL, |
CCoalesceCluster & | Pot, | ||
const std::vector< SCoalesceDataset > & | vecsDatasets, | ||
std::set< std::pair< size_t, size_t > > & | setpriiSeeds, | ||
const std::vector< float > & | vecdSeed, | ||
size_t | iPairs, | ||
float | dPValue, | ||
float | dProbability, | ||
size_t | iThreads | ||
) |
Randomly initializes a new cluster from the given PCL by selecting a pair of correlated genes and a surrounding seed of additional genes.
PCL | Expression data from which cluster is seeded. |
Pot | Cluster initialized to the inverse of the current cluster, i.e. all genes not in the new cluster. |
vecsDatasets | Vector of dataset block structure to be used by the new cluster for subsequent selection of significant conditions. |
setpriiSeeds | Set of previously failed seed pairs to be excluded for cluster initialization. |
vecdSeed | If non-empty, expression vector to be used as a cluster seed. |
iPairs | Maximum number of gene pairs to be sampled for seed pair discovery. |
dPValue | P-value threshhold for significant correlation. |
dProbability | Prior probability of gene inclusion. |
iThreads | Maximum number of simultaneous threads for gene pair sampling. |
<exception | class> Description of criteria for throwing this exception. |
A new cluster is initialized by selecting the most highly correlated gene pair from a random sample of the input PCL that is below the significance threshhold. The expression centroid of this pair is then calculated, and all other genes significantly correlated with this centroid are subsequently added. Genes not selected for inclusion are added to the inverse cluster instead. All significance tests are appropriately Bonferroni corrected for multiple hypothesis testing.
Definition at line 89 of file coalescecluster.cpp.
References Sleipnir::CPCL::ENormalizeColumn, Sleipnir::CPCL::GetGenes(), Sleipnir::CPCL::Normalize(), and Sleipnir::CPCL::Open().
Referenced by Sleipnir::CCoalesce::Cluster().
bool Sleipnir::CCoalesceCluster::IsConverged | ( | ) | [inline] |
Returns true if the cluster has converged to a previously visited state.
Definition at line 93 of file coalescecluster.h.
Referenced by Sleipnir::CCoalesce::Cluster().
bool Sleipnir::CCoalesceCluster::IsDataset | ( | size_t | iDataset | ) | const [inline] |
Returns true if the given dataset is significant in this cluster.
iDataset | Dataset index to be tested for inclusion. |
Definition at line 182 of file coalescecluster.h.
bool Sleipnir::CCoalesceCluster::IsEmpty | ( | ) | const [inline] |
Returns true if the cluster contains no genes or no conditions.
Definition at line 104 of file coalescecluster.h.
Referenced by Sleipnir::CCoalesce::Cluster().
bool Sleipnir::CCoalesceCluster::IsGene | ( | size_t | iGene | ) | const [inline] |
Returns true if the given gene is significant in this cluster.
iGene | Gene index to be tested for inclusion. |
Reimplemented from Sleipnir::CCoalesceClusterImpl.
Definition at line 168 of file coalescecluster.h.
Referenced by GetSimilarity(), and Save().
bool Sleipnir::CCoalesceCluster::LabelMotifs | ( | const CCoalesceMotifLibrary & | Motifs, |
SMotifMatch::EType | eMatchType, | ||
float | dPenaltyGap, | ||
float | dPenaltyMismatch, | ||
float | dPValue | ||
) |
Labels all significant motifs in the cluster with any known TF motifs that match below the given threshhold.
Motifs | Motif library to be used for cluster and known motifs. |
eMatchType | Type of match to be performed: correlation, rmse, etc. |
dPenaltyGap | Alignment score penalty for gaps. |
dPenaltyMismatch | Alignment score penalty for mismatches. |
dPValue | P-value (or other score) threshhold below which known TFs must match. |
Labels any significant motifs in the cluster with zero or more matching known TF binding motifs. These known labels are scored and will be read and written with subsequence Open/Save calls that include this cluster's motifs.
Definition at line 1596 of file coalescecluster.cpp.
References Sleipnir::CCoalesceMotifLibrary::GetKnowns().
size_t Sleipnir::CCoalesceCluster::Open | ( | const std::string & | strPCL, |
size_t | iSkip, | ||
const CPCL & | PCL, | ||
CCoalesceMotifLibrary * | pMotifs = NULL |
||
) |
Opens a cluster based on the given PCL file and, if present, accompanying motif file.
strPCL | Description of parameter strPCL. |
iSkip | Number of feature columns to skip between the gene IDs and first experimental column. |
PCL | PCL with which gene and condition labels are associated. |
pMotifs | If non-null, motif library with which cluster motifs are created. |
*
. For a given PCL file <filename>
.pcl, motifs are read from <filename>_motifs.txt
.Definition at line 1201 of file coalescecluster.cpp.
References Sleipnir::CPCL::GetExperiment(), Sleipnir::CPCL::GetExperiments(), Sleipnir::CPCL::GetExtension(), Sleipnir::CPCL::GetGene(), Sleipnir::CPCL::GetGenes(), and Sleipnir::CPCL::Open().
size_t Sleipnir::CCoalesceCluster::Open | ( | std::istream & | istm, |
const CPCL & | PCL, | ||
CCoalesceMotifLibrary * | pMotifs = NULL |
||
) |
Opens a cluster based on the textual representation (including genes, conditions, and motifs) in the given input stream.
istm | Input stream from which cluster is read. |
PCL | PCL with which gene and condition labels are associated. |
pMotifs | If non-null, motif library with which cluster motifs are created. |
Definition at line 1268 of file coalescecluster.cpp.
References Sleipnir::CPCL::GetExperiment(), Sleipnir::CPCL::GetGene(), Sleipnir::CCoalesceMotifLibrary::Open(), and Sleipnir::CMeta::Tokenize().
bool Sleipnir::CCoalesceCluster::Open | ( | const CHierarchy & | Hierarchy, |
const std::vector< CCoalesceCluster > & | vecClusters, | ||
const std::vector< std::string > & | vecstrClusters, | ||
float | dFraction, | ||
float | dCutoff, | ||
size_t | iCutoff, | ||
CCoalesceMotifLibrary * | pMotifs = NULL |
||
) |
Creates a new cluster by merging the one or more clusters in the given hierarchy.
Hierarchy | Hierarchy of cluster indices to be merged into the newly opened cluster. |
vecClusters | Vector of preexisting clusters, a superset of those in the given hierarchy. |
vecstrClusters | Vector of preexisting cluster IDs. |
dFraction | Minimum fraction of input cluster in which a gene must appear to be included in the output cluster. |
dCutoff | Edit distance threshhold beyond which merged motif alignments will be discarded. |
iCutoff | Maximum number of input motifs which will be merged exactly; larger numbers of motifs will be merged heuristically. |
pMotifs | If non-null, motif library by which input and output motifs are managed. |
Definition at line 1347 of file coalescecluster.cpp.
void Sleipnir::CCoalesceCluster::RemoveGenes | ( | const std::vector< size_t > & | veciGenes | ) | [inline] |
Removes the given genes from the cluster.
veciGenes | Vector of gene IDs to be removed. |
Definition at line 193 of file coalescecluster.h.
bool Sleipnir::CCoalesceCluster::Save | ( | const std::string & | strDirectory, |
size_t | iID, | ||
const CPCL & | PCL, | ||
const CCoalesceMotifLibrary * | pMotifs = NULL |
||
) | const |
Saves cluster in a pair of PCL and motif text files in the given directory.
strDirectory | Directory in which cluster files are generated. |
iID | Integer ID assigned to the cluster. |
PCL | PCL from which gene and condition labels are read. |
pMotifs | If non-null, motif library with which cluster motifs are formatted. |
Saves the cluster as a pair of text files (PCL and motifs) in the given directory. Filenames are randomly generated in the format c<id>_<rand>.pcl
and c<id>_<rand>_motifs.txt
. The PCL file will contain only genes included in the bicluster along with all conditions; conditions in the bicluster will be sorted to the left and tagged with an initial *
. Motif files are of the format:
<type1> <subtype1> <score1> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchx> <pst1> <pwm1> <type2> <subtype2> <score2> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchx> <pst2> <pwm2> ... <typez> <subtypez> <scorez> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchx> <pstz> <pwmz>
Definition at line 1033 of file coalescecluster.cpp.
References Sleipnir::CPCL::GetExperiment(), Sleipnir::CPCL::GetExperiments(), Sleipnir::CPCL::GetExtension(), Sleipnir::CPCL::GetGenes(), IsGene(), Sleipnir::CPCL::MaskGene(), Sleipnir::CPCL::Open(), Sleipnir::CPCL::Save(), and Sleipnir::CPCL::SetExperiment().
Referenced by Sleipnir::CCoalesce::Cluster().
void Sleipnir::CCoalesceCluster::Save | ( | std::ostream & | ostm, |
size_t | iID, | ||
const CPCL & | PCL, | ||
const CCoalesceMotifLibrary * | pMotifs = NULL , |
||
float | dCutoffPWMs = 0 , |
||
float | dPenaltyGap = 0 , |
||
float | dPenaltyMismatch = 0 , |
||
bool | fNoRCs = false |
||
) | const |
Saves a textual representation of the cluster (including genes, conditions, and motifs) to the given output stream.
ostm | Output stream to which cluster description is saved. |
iID | Integer ID assigned to the cluster. |
PCL | PCL from which gene and condition labels are read. |
pMotifs | If non-null, motif library with which cluster motifs are formatted. |
dCutoffPWMs | Minimum information threshhold (in bits) for a PWM to be saved. |
dPenaltyGap | Alignment score penalty for gaps. |
dPenaltyMismatch | Alignment score penalty for mismatches. |
fNoRCs | If true, resolve the given motif into a single strand without reverse complements before saving PWM. |
Saves cluster in the following tab-delimited textual format:
Cluster <id> Genes <gene1> <gene2> ... <genew> Conditions <cond1> <cond2> ... <condx> Motifs <type1> <subtype1> <score1> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchy> <pst1> <pwm1> <type2> <subtype2> <score2> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchy> <pst2> <pwm2> ... <typez> <subtypez> <scorez> <tf1>:<match1>|<tf2>:<match2>|...|<tfy>:<matchy> <pstz> <pwmz>
Definition at line 1151 of file coalescecluster.cpp.
References Sleipnir::CPCL::GetExperiment(), Sleipnir::CPCL::GetGene(), GetGenes(), and GetMotifs().
bool Sleipnir::CCoalesceCluster::SelectConditions | ( | const CPCL & | PCL, |
const CCoalesceCluster & | Pot, | ||
size_t | iThreads, | ||
float | dPValue, | ||
float | dZScore | ||
) |
Performs feature selection to include significant expression conditions in a converging cluster.
PCL | Expression dataset from which significant datasets are selected. |
Pot | Inverse of current cluster used for genomic background calculations. |
iThreads | Maximum number of simultaneous threads for condition significance calculations. |
dPValue | P-value threshhold for condition significance. |
dZScore | Z-score effect size threshhold for condition significance. |
Selects zero or more conditions in which the cluster's current gene set is differentially expressed. That is, in each selected condition, the average expression of genes in the cluster must differ from the genomic background with at least the given significance and effect size threshholds. If dataset blocks were given at cluster initialization time, all conditions in the block are added (or not) simultaneously using a multivariate significance test.
Definition at line 489 of file coalescecluster.cpp.
References GetGenes().
Referenced by Sleipnir::CCoalesce::Cluster().
bool Sleipnir::CCoalesceCluster::SelectGenes | ( | const CPCL & | PCL, |
const CCoalesceGeneScores & | GeneScores, | ||
const CCoalesceGroupHistograms & | HistsCluster, | ||
const CCoalesceGroupHistograms & | HistsPot, | ||
size_t | iMinimum, | ||
size_t | iThreads, | ||
CCoalesceCluster & | Pot, | ||
float | dProbability, | ||
const CCoalesceMotifLibrary * | pMotifs = NULL |
||
) |
Add and remove (im)probable genes to a converging cluster using Bayesian integration based on the currently selected expression conditions and sequence motifs.
PCL | PCL from which genes are selected. |
GeneScores | Per-gene motif scores used to determine each gene's probability based on motif frequencies. |
HistsCluster | Precalculated histogram of motif frequencies using genes in the cluster. |
HistsPot | Precalculated histogram of motif frequencies using genes not in the cluster. |
iMinimum | Minimum number of genes for which data must be present for a given motif for it to contribute to gene probabilities. |
iThreads | Maximum number of simultaneous threads for gene probability calculations. |
Pot | Inverse of genes in the cluster; used to determine genomic background probabilities. |
dProbability | Probability threshhold above which genes are included in the cluster. |
pMotifs | If non-null, motif library managing significant sequence motifs. |
Adds and removes genes to/from a converging cluster based on the currently selected significant expression conditions and sequence motifs. The probability of gene membership in the cluster given some data, P(g in C|D)
, is calculated using Bayes rule as P(D|g in C)P(g in C)/(P(D|g in C) + P(D|g notin C))
. All data features are assumed to be independent, such that P(D|g in C)
is a product over each significant condition/motif of A) the probability of the gene's expression value being drawn from the cluster's distribution of expression values or B) the probability of its frequency for some motif being drawn from the cluster's distribution of frequencies for that motif. Distributions are assumed to be normal, and frequencies are Laplace smoothed to avoid zeros. The prior P(g in C)
is used to stabilize cluster convergence, such that it is equal to 1 if a gene was already included in the cluster in a previous iteration and equal to dProbability if the gene was not previously included.
Definition at line 755 of file coalescecluster.cpp.
References GetGenes(), and Sleipnir::CPCL::GetGenes().
Referenced by Sleipnir::CCoalesce::Cluster().
bool Sleipnir::CCoalesceCluster::SelectMotifs | ( | const CCoalesceGroupHistograms & | HistsCluster, |
const CCoalesceGroupHistograms & | HistsPot, | ||
float | dPValue, | ||
float | dZScore, | ||
size_t | iMaxMotifs, | ||
size_t | iThreads, | ||
const CCoalesceMotifLibrary * | pMotifs = NULL |
||
) |
Performs feature selection to include significant sequence motifs in a converging cluster.
HistsCluster | Precalculated histogram of motif frequencies using genes in the cluster. |
HistsPot | Precalculated histogram of motif frequencies using genes not in the cluster. |
dPValue | P-value threshhold for motif significance. |
dZScore | Z-score effect size threshhold for motif significance. |
iMaxMotifs | Maximum number of motifs associated with a cluster; if more motifs are present, no new selection is performed. |
iThreads | Maximum number of simultaneous threads for motif significance calculations. |
pMotifs | If non-null, motif library with which motifs are managed. |
Selects zero or more motifs differentially over- or under-enriched in the genes currently in a converging cluster. Motifs are selected on a per-sequence-subtype basis, so a motif may be enriched, for example, in an upstream but not downstream flank. All motifs managed by the given library are tested for significance, including all k-mers, reverse complements, and any currently constructed PSTs. Significance testing is performed by z-scoring the within versus without frequency histograms, and all p-values are Bonferroni corrected.
Definition at line 588 of file coalescecluster.cpp.
Referenced by Sleipnir::CCoalesce::Cluster().
void Sleipnir::CCoalesceCluster::SetGenes | ( | size_t | iGenes | ) | [inline] |
Replaces the cluster's genes with the requested number of initial genes.
iGenes | Number of genes to be included in the cluster. |
Definition at line 118 of file coalescecluster.h.
Referenced by Sleipnir::CCoalesce::Cluster().
void Sleipnir::CCoalesceCluster::Snapshot | ( | const CCoalesceGeneScores & | GeneScores, |
CCoalesceGroupHistograms & | Histograms | ||
) |
Records the state of the cluster between convergence iterations.
GeneScores | Per-gene sequence scores used to snapshot motif histograms. |
Histograms | Motif histograms to be updated using the current gene scores. |
Definition at line 1553 of file coalescecluster.cpp.
References GetGenes().
Referenced by Sleipnir::CCoalesce::Cluster().
void Sleipnir::CCoalesceCluster::Subtract | ( | CPCL & | PCL, |
const CCoalesceCluster & | Pot | ||
) | const |
Subtract the average expression value for each condition in the cluster from each gene in the cluster.
PCL | Expression matrix from which cluster averages are subtracted. |
Pot | Inverse of genes in the cluster; used to determine the difference of the cluster's average from the existing per-condition average. |
Definition at line 301 of file coalescecluster.cpp.
References Sleipnir::CPCL::Get(), GetGenes(), and Sleipnir::CMeta::IsNaN().
Referenced by Sleipnir::CCoalesce::Cluster().
void Sleipnir::CCoalesceCluster::Subtract | ( | CCoalesceGeneScores & | GeneScores | ) | const |
Subtract the average frequency of each motif in the cluster from the score for that motif in each gene in the cluster.
GeneScores | Per-gene motif frequency scores from which the cluster averages are subtracted. |
Definition at line 330 of file coalescecluster.cpp.
References GetGenes().