Sleipnir: MCluster

MCluster performs hierarchical clustering of a given microarray dataset based on one of a variety of similarity measures or on a precomputed gene pair similarity matrix (e.g. predicted functional relationships). Output files are compatible with visualization tools such as Java TreeView.

Usage

Basic Usage

 MCluster -i <data.pcl> -o <clustered.gtr> > <clustered.cdt>

Cluster the microarray data in data.pcl using the default similarity measure (Pearson correlation); save the clustered expression data in clustered.cdt and the gene tree in clustered.gtr.
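
As a rough illustration of the default measure, the following Python sketch computes the Pearson correlation between two expression profiles. It is not MCluster's implementation; the function name `pearson` and the sample profiles are hypothetical, and missing values and weights are ignored.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Center each profile on its own mean.
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant profile
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression profiles across four experiments.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]  # rises with gene_a
gene_c = [4.0, 3.0, 2.0, 1.0]  # falls as gene_a rises

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
```

Here `r_ab` is 1 (perfectly correlated profiles) and `r_ac` is -1 (perfectly anti-correlated), so a clustering based on this measure would place `gene_a` next to `gene_b` and far from `gene_c`.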

 MCluster -i <data.dab> -o <clustered.gtr> < <data.pcl> > <clustered.cdt>

Cluster the microarray data in data.pcl using the pre-calculated similarity scores (e.g. probabilities of functional relationship) in data.dab; save the clustered expression data in clustered.cdt and the gene tree in clustered.gtr.
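
To show how a precomputed similarity matrix can drive hierarchical clustering, here is a minimal Python sketch of agglomerative clustering over pairwise scores. Average linkage is an assumption made for illustration (this page does not state which linkage MCluster uses), and the `NODE1X`-style node names only imitate the rows of a GTR file. The function `cluster` and the sample genes and scores are hypothetical.

```python
def cluster(genes, sim):
    """Agglomerative clustering on precomputed pairwise similarities.

    genes: list of gene names
    sim:   dict mapping frozenset({a, b}) -> similarity score
    Returns a list of merges (node, left, right, similarity) in merge order.
    """
    clusters = {g: [g] for g in genes}  # label -> member genes
    merges = []
    next_id = 1
    while len(clusters) > 1:
        best = None
        labels = sorted(clusters)
        for i, a in enumerate(labels):
            for b in labels[i + 1:]:
                # Average linkage (an assumption): mean similarity over all
                # cross-cluster gene pairs.
                pairs = [sim[frozenset((x, y))]
                         for x in clusters[a] for y in clusters[b]]
                score = sum(pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        score, a, b = best
        node = f"NODE{next_id}X"
        next_id += 1
        merges.append((node, a, b, score))
        clusters[node] = clusters.pop(a) + clusters.pop(b)
    return merges


# Hypothetical pairwise scores, e.g. probabilities of functional relationship.
genes = ["A", "B", "C", "D"]
sim = {
    frozenset(("A", "B")): 0.9,
    frozenset(("C", "D")): 0.8,
    frozenset(("A", "C")): 0.2,
    frozenset(("A", "D")): 0.1,
    frozenset(("B", "C")): 0.3,
    frozenset(("B", "D")): 0.2,
}
merges = cluster(genes, sim)
for node, left, right, score in merges:
    print(f"{node}\t{left}\t{right}\t{score:.3f}")
```

The most similar pair (A, B) merges first, then (C, D), and the two resulting nodes join last at the mean of the four cross-pair scores.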

Detailed Usage

package "MCluster"
version "1.0"
purpose "Hierarchical clustering tool."

section "Main"
option  "input"     i   "Input PCL/DAT/DAB file"
                        string  typestr="filename"
option  "output"    o   "Output GTR file"
                        string  typestr="filename"  yes
option  "distance"  d   "Similarity measure"
                        values="pearson","euclidean","kendalls","kolm-smir","spearman"
                        default="pearson"

section "Miscellaneous"
option  "weights"   w   "Input weights file"
                        string  typestr="filename"

section "Preprocessing"
option  "normalize" n   "Normalize similarities before clustering"
                        flag    on
option  "flip"      f   "Invert similarities before clustering"
                        flag    off
option  "epsilon"   e   "Remove genes with no correlations above epsilon"
                        double
option  "power"     p   "Power transform similarities"
                        double  default="1"

option  "skip"      s   "Columns to skip in input PCL"
                        int default="2"
option  "verbosity" v   "Message verbosity"
                        int default="5"
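
The effect of the `skip` option can be pictured with a short Python sketch that reads PCL text. It assumes a tab-delimited layout with an ID column, `skip` annotation columns (commonly NAME and GWEIGHT), and an optional EWEIGHT row that is ignored here; the function `parse_pcl` and the sample data are hypothetical and are not part of Sleipnir.

```python
def parse_pcl(text, skip=2):
    """Parse tab-delimited PCL text.

    Returns (experiment_names, {gene_id: [values]}), with None for empty cells.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    first_data = 1 + skip  # the ID column plus the skipped annotation columns
    experiments = header[first_data:]
    genes = {}
    for line in lines[1:]:
        cells = line.split("\t")
        if cells[0] == "EWEIGHT":
            continue  # experiment weights row, ignored in this sketch
        genes[cells[0]] = [float(c) if c.strip() else None
                           for c in cells[first_data:]]
    return experiments, genes


# Hypothetical PCL content with two annotation columns (NAME, GWEIGHT).
sample = (
    "ID\tNAME\tGWEIGHT\tcond1\tcond2\tcond3\n"
    "EWEIGHT\t\t\t1\t1\t1\n"
    "YAL001C\tTFC3\t1\t0.5\t-0.2\t\n"
    "YAL002W\tVPS8\t1\t1.1\t0.3\t0.7\n"
)
experiments, genes = parse_pcl(sample, skip=2)
```

With the default `skip=2`, the NAME and GWEIGHT columns are passed over and `cond1` is treated as the first experimental column. A file with no annotation columns would need `skip=0`.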

Flag	Default	Type	Description
-i	stdin	PCL or DAT/DAB file	Input PCL microarray data or DAT/DAB pairwise similarity data. If a DAT/DAB is given here, a PCL file should be provided on standard input.
-o	None	GTR text file	Output GTR file for use with visualization tools. Output CDT is printed to standard output.
-d	pearson	pearson, euclidean, kendalls, kolm-smir, or spearman	Similarity measure to be used for clustering.
-w	None	PCL text file	If given, a PCL file with the same dimensions as the data given with `-i`, in which each cell holds the relative weight of the corresponding gene/experiment pair. If no weights file is given, all weights default to 1.
-n	on	Flag	If on, normalize input edges to the range [0,1] before processing.
-f	off	Flag	If on, invert the similarities before clustering, replacing each input value with one minus that value.
-e	None	Double	If given, remove genes that have no similarity score above the given cutoff (after optional normalization).
-p	1	Double	Raise all input similarity scores to the given power.
-s	2	Integer	Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL.
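
The preprocessing options can be summarized with a small Python sketch that applies them to a set of pairwise scores. Several details are assumptions rather than documented behavior: min-max scaling is used for normalization, flip and power are applied in that order, and epsilon follows the option description "Remove genes with no correlations above epsilon". The function `preprocess` and the sample edges are hypothetical.

```python
def preprocess(edges, normalize=True, flip=False, epsilon=None, power=1.0):
    """Apply MCluster-style preprocessing to pairwise similarity scores.

    edges: dict mapping (gene_a, gene_b) -> similarity score
    """
    scores = dict(edges)
    if normalize and scores:
        # Assumed min-max scaling onto [0, 1].
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        scores = {k: (v - lo) / span if span else 0.0
                  for k, v in scores.items()}
    if flip:
        scores = {k: 1.0 - v for k, v in scores.items()}
    if power != 1.0:
        scores = {k: v ** power for k, v in scores.items()}
    if epsilon is not None:
        # Keep only genes that have at least one score above epsilon.
        keep = {g for (a, b), v in scores.items() if v > epsilon
                for g in (a, b)}
        scores = {(a, b): v for (a, b), v in scores.items()
                  if a in keep and b in keep}
    return scores


# Hypothetical correlations in [-1, 1].
edges = {("A", "B"): 1.0, ("A", "C"): 0.0, ("B", "C"): 0.5, ("C", "D"): -1.0}
result = preprocess(edges, normalize=True, power=2.0, epsilon=0.2)
```

In this example the scores are first scaled to 1.0, 0.5, 0.75, and 0.0, then squared. Gene D has no remaining score above 0.2, so its only edge is dropped.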