Sleipnir
Sleipnir::CHierarchy Class Reference

Represents a simple node in a binary tree.

#include <clusthierarchical.h>

Inheritance diagram for Sleipnir::CHierarchy:
Sleipnir::CHierarchyImpl

Public Member Functions

 CHierarchy (size_t iID, float dSimilarity, const CHierarchy *pLeft, const CHierarchy *pRight)
 Constructs a new hierarchy node.
void GetGenes (std::vector< size_t > &veciGenes) const
 Retrieve the IDs of the hierarchy's genes by inorder traversal.
float SortChildren (const std::vector< float > &vecdScores)
 Performs node-flipping in the hierarchy according to a given set of leaf node scores.
void Save (std::ostream &ostm, size_t iGenes, const std::vector< std::string > *pvecstrGenes=NULL) const
 Save the hierarchy to the given stream in GTR format.
void Destroy ()
 Safety method to delete a hierarchy.
float GetSimilarity () const
 Returns this node's height within the hierarchy.
bool IsGene () const
 Returns true if the current node is a leaf node (i.e. represents a gene in the hierarchy).
size_t GetID () const
 Returns the current node's unique ID within the hierarchy.
const CHierarchy & Get (bool fRight) const
 Returns the current node's left or right child.
size_t GetWeight () const
 Returns the number of leaves under the current node.

Detailed Description

Represents a simple node in a binary tree.

Generated by CClustHierarchical::Cluster, a CHierarchy is an extremely rudimentary representation of a binary tree intended to be serialized to disk as the GTR file in a CDT/GTR pair. Each node has either zero or two children, a unique integer identifier within the tree, and a similarity score indicating its height within the tree.

Definition at line 41 of file clusthierarchical.h.


Constructor & Destructor Documentation

Sleipnir::CHierarchy::CHierarchy ( size_t  iID,
float  dSimilarity,
const CHierarchy *  pLeft,
const CHierarchy *  pRight 
)

Constructs a new hierarchy node.

Parameters:
iID  Unique ID of the new node.
dSimilarity  Height of the new node within the tree.
pLeft  First child of the new node, possibly null.
pRight  Second child of the new node, possibly null.
Remarks:
If either of pLeft or pRight is null, they should both be null (i.e. a node should have exactly zero or exactly two children).
See also:
CClustHierarchical::Cluster

Definition at line 52 of file clusthierarchical.cpp.
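
The zero-or-two-children rule from the remarks above can be illustrated with a small sketch. The Node struct below is a hypothetical stand-in written for this example, not the Sleipnir class; its member names, the assertion in the constructor, and the leaf similarity of 0 are all assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a hierarchy node; not the Sleipnir class itself.
struct Node {
    std::size_t id;
    float       similarity;
    const Node* left;
    const Node* right;

    Node(std::size_t id_, float similarity_, const Node* left_, const Node* right_)
        : id(id_), similarity(similarity_), left(left_), right(right_) {
        // Enforce the documented rule: exactly zero or exactly two children.
        assert((left == nullptr) == (right == nullptr));
    }

    bool isLeaf() const { return left == nullptr; }

    // Number of leaves under this node (a leaf counts itself).
    std::size_t weight() const {
        return isLeaf() ? 1 : left->weight() + right->weight();
    }
};

// Builds the tree ((0,1),2) on the stack and returns the root's leaf count.
// Leaf similarities are set to 0 here purely for illustration.
std::size_t demoWeight() {
    Node g0(0, 0.0f, nullptr, nullptr);
    Node g1(1, 0.0f, nullptr, nullptr);
    Node g2(2, 0.0f, nullptr, nullptr);
    Node inner(3, 0.75f, &g0, &g1);
    Node root(4, 0.5f, &inner, &g2);
    return root.weight();
}
```

In this sketch the nodes live on the stack, so no cleanup is needed; a hierarchy returned by the library is instead released through Destroy.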


Member Function Documentation

void Sleipnir::CHierarchy::Destroy ( ) [inline]

Safety method to delete a hierarchy.

Remarks:
Included to avoid the necessity of directly deleting something allocated within a library method.

Definition at line 80 of file clusthierarchical.h.

const CHierarchy& Sleipnir::CHierarchy::Get ( bool  fRight) const [inline]

Returns the current node's left or right child.

Parameters:
fRight  If true, return the right (second) child; otherwise, return the left (first).
Returns:
One of the current node's two children.
Remarks:
Do not call for leaf nodes.
See also:
IsGene

Definition at line 137 of file clusthierarchical.h.

void Sleipnir::CHierarchy::GetGenes ( std::vector< size_t > &  veciGenes) const

Retrieve the IDs of the hierarchy's genes by inorder traversal.

Parameters:
veciGenes  Output vector into which gene IDs are placed.
Remarks:
Genes are leaf nodes with IDs generally corresponding to their original PCL indices before clustering. This method will return the PCL indices as they are currently ordered in the hierarchy.
See also:
SortChildren

Definition at line 111 of file clusthierarchical.cpp.

References IsGene().
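
As a rough illustration of the traversal described above, the following sketch collects leaf IDs left to right. Node and collectLeaves are hypothetical names for this example only; whether the real method appends to or overwrites the output vector is not stated here, and this sketch appends.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in node: a leaf has two null children.
struct Node {
    std::size_t id;
    const Node* left;
    const Node* right;
};

// Visits the left subtree, then the right subtree, recording only leaf IDs.
void collectLeaves(const Node& node, std::vector<std::size_t>& out) {
    if (node.left == nullptr) {
        out.push_back(node.id);
        return;
    }
    collectLeaves(*node.left, out);
    collectLeaves(*node.right, out);
}

// Builds the tree (1,(2,0)) and returns its leaves in traversal order.
std::vector<std::size_t> demoLeafOrder() {
    Node g0{0, nullptr, nullptr};
    Node g1{1, nullptr, nullptr};
    Node g2{2, nullptr, nullptr};
    Node inner{3, &g2, &g0};
    Node root{4, &g1, &inner};
    std::vector<std::size_t> order;
    collectLeaves(root, order);
    return order;
}
```

The returned order (1, 2, 0) reflects the current left/right arrangement of the tree rather than the numeric order of the IDs, which is why the result changes after node flipping.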

size_t Sleipnir::CHierarchy::GetID ( ) const [inline]

Returns the current node's unique ID within the hierarchy.

Returns:
The current node's ID.
Remarks:
Leaf node IDs generally correspond to gene indices within the pre-clustered PCL; internal node IDs are arbitrary unique values.

Definition at line 117 of file clusthierarchical.h.

float Sleipnir::CHierarchy::GetSimilarity ( ) const [inline]

Returns this node's height within the hierarchy.

Returns:
The current node's height within the hierarchy.

Definition at line 91 of file clusthierarchical.h.

size_t Sleipnir::CHierarchy::GetWeight ( ) const [inline]

Returns the number of leaves under the current node.

Returns:
Number of leaves under the current node.

Definition at line 148 of file clusthierarchical.h.

bool Sleipnir::CHierarchy::IsGene ( ) const [inline]

Returns true if the current node is a leaf node (i.e. represents a gene in the hierarchy).

Returns:
True if the current node is a leaf (has no children).

Reimplemented from Sleipnir::CHierarchyImpl.

Definition at line 102 of file clusthierarchical.h.

Referenced by GetGenes() and SortChildren().

void Sleipnir::CHierarchy::Save ( std::ostream &  ostm,
size_t  iGenes,
const std::vector< std::string > *  pvecstrGenes = NULL 
) const [inline]

Save the hierarchy to the given stream in GTR format.

Parameters:
ostm  Output stream into which the hierarchy is saved.
iGenes  Total number of leaf nodes in the hierarchy.
pvecstrGenes  If non-NULL, vector of gene names to be emitted in place of GENE IDs.
Remarks:
iGenes can be calculated from the hierarchy; it is included as an input solely for convenience, since the genes must be output in original order (not traversal order) to satisfy GTR file formatting requirements.

Reimplemented from Sleipnir::CHierarchyImpl.

Definition at line 66 of file clusthierarchical.h.
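
For orientation, a GTR file as read by Cluster/TreeView is tab-delimited, with one line per internal node giving the node label, its two child labels, and the similarity. The sketch below writes such lines for a hypothetical stand-in tree. The Node struct, saveGtr, and the labelling by raw ID are assumptions for this example; how Sleipnir numbers its NODE labels and uses iGenes is not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in node: a leaf has two null children.
struct Node {
    std::size_t id;
    float       similarity;
    const Node* left;
    const Node* right;
};

// Leaves are labelled GENE<id>X (or by name, if names are supplied);
// internal nodes are labelled NODE<id>X.
std::string label(const Node& n, const std::vector<std::string>* names) {
    if (n.left == nullptr) {
        if (names != nullptr) return (*names)[n.id];
        return "GENE" + std::to_string(n.id) + "X";
    }
    return "NODE" + std::to_string(n.id) + "X";
}

// Post-order walk, so each internal node is written after its children.
void saveGtr(const Node& n, std::ostream& out,
             const std::vector<std::string>* names = nullptr) {
    if (n.left == nullptr) return;  // leaves produce no line of their own
    saveGtr(*n.left, out, names);
    saveGtr(*n.right, out, names);
    out << label(n, names) << '\t' << label(*n.left, names) << '\t'
        << label(*n.right, names) << '\t' << n.similarity << '\n';
}

// Serializes the tree ((0,1),2) and returns the text that was written.
std::string demoGtr(const std::vector<std::string>* names) {
    Node g0{0, 0.0f, nullptr, nullptr};
    Node g1{1, 0.0f, nullptr, nullptr};
    Node g2{2, 0.0f, nullptr, nullptr};
    Node inner{3, 0.75f, &g0, &g1};
    Node root{4, 0.5f, &inner, &g2};
    std::ostringstream out;
    saveGtr(root, out, names);
    return out.str();
}
```

Passing a name vector replaces only the leaf labels, mirroring the role described for pvecstrGenes; internal node labels are unaffected.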

float Sleipnir::CHierarchy::SortChildren ( const std::vector< float > &  vecdScores)

Performs node-flipping in the hierarchy according to a given set of leaf node scores.

Parameters:
vecdScores  Scores for the hierarchy's leaf nodes indexed by ID.
Returns:
Score of the current node.

SortChildren can be used to node-flip a hierarchy so that an inorder traversal of the leaf nodes yields, as far as the tree topology allows, increasing values of some precomputed score. Since optimal node ordering is NP-hard, this is often used to heuristically order microarray vectors, e.g. from most green to most red.

Remarks:
vecdScores must be of size equal to the total number of leaves in the hierarchy, and elements of the vector are indexed by the IDs of the leaf nodes (generally the original index of the leaf genes within the pre-clustered PCL file).

Definition at line 138 of file clusthierarchical.cpp.

References IsGene().
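
One way node flipping of this kind could work is sketched below. This is not taken from the Sleipnir source: it assumes that an internal node's score is the leaf-weighted mean of its children's scores and that the lower-scoring child is placed first. Node, sortChildren, and collectLeaves are hypothetical names for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in node; weight is the number of leaves beneath it.
struct Node {
    std::size_t id;
    std::size_t weight;
    Node*       left;
    Node*       right;
};

// Recursively scores each subtree and swaps children so that the
// lower-scoring subtree comes first. Returns the subtree's score.
float sortChildren(Node& node, const std::vector<float>& scores) {
    if (node.left == nullptr) return scores[node.id];  // leaf: look up by ID
    const float l  = sortChildren(*node.left, scores);
    const float r  = sortChildren(*node.right, scores);
    const float wl = static_cast<float>(node.left->weight);
    const float wr = static_cast<float>(node.right->weight);
    // Assumed scoring rule: leaf-weighted mean of the two child scores.
    const float mean = (l * wl + r * wr) / (wl + wr);
    if (l > r) std::swap(node.left, node.right);
    return mean;
}

// Left-to-right leaf IDs, used to inspect the result of flipping.
void collectLeaves(const Node& node, std::vector<std::size_t>& out) {
    if (node.left == nullptr) { out.push_back(node.id); return; }
    collectLeaves(*node.left, out);
    collectLeaves(*node.right, out);
}

// Flips the tree ((0,1),2) using leaf scores 3.0, 1.0, 0.5.
std::vector<std::size_t> demoFlip() {
    Node g0{0, 1, nullptr, nullptr};
    Node g1{1, 1, nullptr, nullptr};
    Node g2{2, 1, nullptr, nullptr};
    Node inner{3, 2, &g0, &g1};
    Node root{4, 3, &inner, &g2};
    const std::vector<float> scores{3.0f, 1.0f, 0.5f};
    sortChildren(root, scores);
    std::vector<std::size_t> order;
    collectLeaves(root, order);
    return order;
}
```

Here the flipped leaf order is 2, 1, 0, whose scores (0.5, 1.0, 3.0) increase. Because flipping can only swap siblings and never moves a leaf out of its subtree, a fully increasing order is not reachable for every tree and score vector.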


The documentation for this class was generated from the following files: