Sleipnir
Utility class containing static k-means clustering methods.
#include <clustkmeans.h>
Static Public Member Functions
static bool Cluster (const CDataMatrix &MatData, const IMeasure *pMeasure, size_t iK, std::vector< uint16_t > &vecsClusters, const CDataMatrix *pMatWeights=NULL)
Cluster a set of elements into k groups using the given data and pairwise similarity score.
static bool Cluster (const CDistanceMatrix &MatSimilarities, size_t iK, std::vector< uint16_t > &vecsClusters)
Cluster a set of elements into k groups using the given pairwise similarities.
Definition at line 36 of file clustkmeans.h.
bool Sleipnir::CClustKMeans::Cluster (const CDataMatrix &MatData, const IMeasure *pMeasure, size_t iK, std::vector< uint16_t > &vecsClusters, const CDataMatrix *pMatWeights=NULL) [static]
Cluster a set of elements into k groups using the given data and pairwise similarity score.
MatData | Data vectors for each element, generally microarray values from a PCL file. |
pMeasure | Similarity measure to use for clustering. |
iK | Number of clusters to generate. |
vecsClusters | Output cluster IDs for each gene. |
pMatWeights | If non-null, weights to use for each gene/condition value. These can be used to up/downweight aneuploidies present under only certain conditions, for example. Default assumes all ones. |
Performs k-means clustering on the given data using the specified similarity measure and number of clusters. The index of each element's final cluster is written to the output vector. If weights are given, individual gene/condition scores are weighted accordingly (e.g. to up/downweight aneuploidies present only under certain conditions).
During k-means clustering, k centers are initially chosen at random. Each gene is assigned to the center most similar to it, and each center is then moved to the mean of its assigned genes. These two steps are repeated until no gene assignment changes, which places each gene in exactly one cluster.
Definition at line 67 of file clustkmeans.cpp.
References Sleipnir::CFullMatrix< tType >::Clear(), Sleipnir::IMeasure::EMapNone, Sleipnir::CFullMatrix< tType >::Get(), Sleipnir::CFullMatrix< tType >::GetColumns(), Sleipnir::CFullMatrix< tType >::GetRows(), Sleipnir::CFullMatrix< tType >::Initialize(), Sleipnir::CMeta::IsNaN(), and Sleipnir::IMeasure::Measure().
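The iteration described for this overload (random initial centers, assignment to the most similar center, centers moved to the mean of their members, repeat until nothing changes) can be sketched as follows. This is a minimal, self-contained illustration and not Sleipnir's implementation: `KMeansSketch`, `NegEuclidean`, the `Vec` and `Similarity` aliases, the explicit random generator argument, and the `maxIter` safety cap are all assumptions made for the example, and per-value weights are left out.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

using Vec = std::vector<double>;
using Similarity = std::function<double(const Vec&, const Vec&)>;

// Example measure (an assumption): negated Euclidean distance,
// so that a larger value means "more similar".
double NegEuclidean(const Vec& a, const Vec& b) {
    double sum = 0.0;
    for (std::size_t d = 0; d < a.size(); ++d) sum += (a[d] - b[d]) * (a[d] - b[d]);
    return -std::sqrt(sum);
}

// Hypothetical helper, not Sleipnir code. All rows of `data` must have the
// same length. Returns false only for an unusable k.
bool KMeansSketch(const std::vector<Vec>& data, const Similarity& sim, std::size_t k,
                  std::vector<std::uint16_t>& clusters, std::mt19937& rng,
                  std::size_t maxIter = 1000) {
    const std::size_t n = data.size();
    // Cluster IDs are stored as uint16_t, so keep k within that range.
    if (k == 0 || k > n || k > std::numeric_limits<std::uint16_t>::max()) return false;

    // Pick k distinct elements at random as the initial centers.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);
    std::vector<Vec> centers(k);
    for (std::size_t c = 0; c < k; ++c) centers[c] = data[order[c]];

    const std::size_t dim = data[0].size();
    std::vector<std::size_t> assigned(n, k);  // k marks "not yet assigned"
    for (std::size_t iter = 0; iter < maxIter; ++iter) {
        // Assign each element to the center most similar to it.
        bool changed = false;
        for (std::size_t i = 0; i < n; ++i) {
            std::size_t best = 0;
            double bestSim = -std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < k; ++c) {
                const double s = sim(data[i], centers[c]);
                if (s > bestSim) { bestSim = s; best = c; }
            }
            if (assigned[i] != best) { assigned[i] = best; changed = true; }
        }
        if (!changed) break;  // no assignment changed: converged

        // Move each center to the mean of its assigned elements.
        std::vector<Vec> sums(k, Vec(dim, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < n; ++i) {
            ++counts[assigned[i]];
            for (std::size_t d = 0; d < dim; ++d) sums[assigned[i]][d] += data[i][d];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // empty cluster: keep the old center
            for (std::size_t d = 0; d < dim; ++d)
                centers[c][d] = sums[c][d] / static_cast<double>(counts[c]);
        }
    }

    clusters.resize(n);
    for (std::size_t i = 0; i < n; ++i) clusters[i] = static_cast<std::uint16_t>(assigned[i]);
    return true;
}
```

Calling `KMeansSketch(data, NegEuclidean, 2, clusters, rng)` with a seeded `std::mt19937` fills `clusters` with one ID per row of `data`. Because the starting centers are random, which ID a group receives can differ between runs even when the grouping itself is the same.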
bool Sleipnir::CClustKMeans::Cluster (const CDistanceMatrix &MatSimilarities, size_t iK, std::vector< uint16_t > &vecsClusters) [static]
Cluster a set of elements into k groups using the given pairwise similarities.
MatSimilarities | Matrix of precalculated pairwise similarities between elements to be clustered. |
iK | Number of clusters to generate. |
vecsClusters | Output cluster IDs for each gene. |
Performs k-means clustering on the given elements using the specified pairwise similarities and number of clusters. The index of each element's final cluster is written to the output vector.
During k-means clustering, k centers are initially chosen at random. Each gene is assigned to the center most similar to it, and each center is then moved to the mean of its assigned genes. These two steps are repeated until no gene assignment changes, which places each gene in exactly one cluster.
Definition at line 152 of file clustkmeans.cpp.
References Sleipnir::CFullMatrix< tType >::Clear(), Sleipnir::CFullMatrix< tType >::Get(), Sleipnir::CHalfMatrix< tType >::Get(), Sleipnir::CFullMatrix< tType >::GetColumns(), Sleipnir::CFullMatrix< tType >::GetRows(), Sleipnir::CHalfMatrix< tType >::GetSize(), Sleipnir::CFullMatrix< tType >::Initialize(), Sleipnir::CMeta::IsNaN(), and Sleipnir::CFullMatrix< tType >::Set().
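When only pairwise similarities are available there are no data vectors to average, so a center has to be represented some other way. The sketch below shows one plausible reading, not confirmed by this page: each center is a profile of similarities to every element, initialized from the row of a randomly chosen element and updated to the mean of its members' rows. `KMeansFromSimilarities`, the `Matrix` alias, the full (rather than half) matrix layout, the presence of a diagonal holding self-similarities, and the `maxIter` cap are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper, not Sleipnir code. `sims` is assumed to be a full,
// symmetric n x n matrix whose diagonal holds each element's self-similarity.
bool KMeansFromSimilarities(const Matrix& sims, std::size_t k,
                            std::vector<std::uint16_t>& clusters, std::mt19937& rng,
                            std::size_t maxIter = 1000) {
    const std::size_t n = sims.size();
    // Cluster IDs are stored as uint16_t, so keep k within that range.
    if (k == 0 || k > n || k > std::numeric_limits<std::uint16_t>::max()) return false;

    // A center is a profile: entry j is the center's similarity to element j.
    // Initial profiles are the rows of k distinct, randomly chosen elements.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);
    Matrix centers(k);
    for (std::size_t c = 0; c < k; ++c) centers[c] = sims[order[c]];

    std::vector<std::size_t> assigned(n, k);  // k marks "not yet assigned"
    for (std::size_t iter = 0; iter < maxIter; ++iter) {
        // Assign each element to the center whose profile rates it highest.
        bool changed = false;
        for (std::size_t i = 0; i < n; ++i) {
            std::size_t best = 0;
            double bestSim = -std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < k; ++c) {
                if (centers[c][i] > bestSim) { bestSim = centers[c][i]; best = c; }
            }
            if (assigned[i] != best) { assigned[i] = best; changed = true; }
        }
        if (!changed) break;  // no assignment changed: converged

        // New profile of a center = mean of its members' similarity rows.
        Matrix sums(k, std::vector<double>(n, 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t i = 0; i < n; ++i) {
            ++counts[assigned[i]];
            for (std::size_t j = 0; j < n; ++j) sums[assigned[i]][j] += sims[i][j];
        }
        for (std::size_t c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // empty cluster: keep the old profile
            for (std::size_t j = 0; j < n; ++j)
                centers[c][j] = sums[c][j] / static_cast<double>(counts[c]);
        }
    }

    clusters.resize(n);
    for (std::size_t i = 0; i < n; ++i) clusters[i] = static_cast<std::uint16_t>(assigned[i]);
    return true;
}
```

Under this representation, an element's similarity to a cluster is its average similarity to that cluster's current members, so no coordinates are ever needed.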