Sleipnir
MCluster performs hierarchical clustering of a given microarray dataset based on one of a variety of similarity measures or on a precomputed gene pair similarity matrix (e.g. predicted functional relationships). Output files are compatible with visualization tools such as Java TreeView.
MCluster -i <data.pcl> -o <clustered.gtr> > <clustered.cdt>
Cluster the microarray data in data.pcl using the default similarity measure (Pearson correlation); save the clustered expression data in clustered.cdt and the gene tree in clustered.gtr.
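To make the default similarity measure concrete, here is a minimal Python sketch of Pearson correlation between two genes' expression vectors. It is an illustration only, not MCluster's implementation: the function name `pearson`, the use of `None` for missing values, and the choice to skip positions where either value is missing are all assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors.

    Missing measurements are represented as None (an assumption of this
    sketch) and are skipped pairwise.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Keep only the conditions where both genes have a measurement.
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        return float("nan")

    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n

    # Sums of centered products and centered squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)

    # A constant vector has no defined correlation.
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)
```

Genes whose profiles rise and fall together score near 1, and genes with opposite profiles score near -1.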
MCluster -i <data.dab> -o <clustered.gtr> < <data.pcl> > <clustered.cdt>
Cluster the microarray data in data.pcl using the pre-calculated similarity scores (e.g. probabilities of functional relationship) in data.dab; save the clustered expression data in clustered.cdt and the gene tree in clustered.gtr.
package "MCluster"
version "1.0"
purpose "Hierarchical clustering tool."

section "Main"
option "input" i "Input PCL/DAT/DAB file"
    string typestr="filename"
option "output" o "Output GTR file"
    string typestr="filename" yes
option "distance" d "Similarity measure"
    values="pearson","euclidean","kendalls","kolm-smir","spearman"
    default="pearson"

section "Miscellaneous"
option "weights" w "Input weights file"
    string typestr="filename"

section "Preprocessing"
option "normalize" n "Normalize similarities before clustering"
    flag on
option "flip" f "Invert similarities before clustering"
    flag off
option "epsilon" e "Remove genes with no correlations above epsilon"
    double
option "power" p "Power transform similarities"
    double default="1"
option "skip" s "Columns to skip in input PCL"
    int default="2"
option "verbosity" v "Message verbosity"
    int default="5"
Flag | Default | Type | Description |
---|---|---|---|
-i | stdin | PCL or DAT/DAB file | Input PCL microarray data or DAT/DAB pairwise similarity data. If a DAT/DAB is given here, a PCL file should be provided on standard input. |
-o | None | GTR text file | Output GTR file for use with visualization tools. Output CDT is printed to standard output. |
-d | pearson | pearson, euclidean, kendalls, kolm-smir, or spearman | Similarity measure to be used for clustering. |
-w | None | PCL text file | If given, a PCL file with the same dimensions as the data given with -i, in which each cell holds the relative weight of the corresponding gene/experiment pair. If no weights file is given, all weights default to 1. |
-n | on | Flag | If on, normalize input edges to the range [0,1] before processing. |
-f | off | Flag | If on, invert the input similarities before clustering, replacing each value with one minus that value. |
-e | None | Double | If given, remove all input edges below the given cutoff (after optional normalization). |
-p | 1 | Double | Raise all input similarity scores to the given power. |
-s | 2 | Integer | Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL. |
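
The preprocessing flags in the table (-n, -f, -e, -p) can be pictured as successive transforms of a set of pairwise scores. The Python sketch below is a rough illustration under stated assumptions, not MCluster's code: the function name `preprocess`, the dictionary representation of edges, min-max scaling as the normalization, and the order of the flip and power steps are all hypothetical. Only the placement of the cutoff after normalization comes from the table.

```python
def preprocess(scores, *, normalize, flip=False, epsilon=None, power=1.0):
    """Apply -n/-f/-e/-p style transforms to pairwise similarity scores.

    scores maps a (gene_a, gene_b) tuple to a similarity value.
    Returns a new dict; the input is left unchanged.
    """
    result = dict(scores)

    # -n: rescale to [0, 1]. Min-max scaling is an assumption of this sketch.
    if normalize and result:
        lo = min(result.values())
        span = max(result.values()) - lo
        result = {k: (v - lo) / span if span else 0.0 for k, v in result.items()}

    # -f: replace each value with one minus itself.
    if flip:
        result = {k: 1.0 - v for k, v in result.items()}

    # -e: drop edges below the cutoff (after optional normalization).
    if epsilon is not None:
        result = {k: v for k, v in result.items() if v >= epsilon}

    # -p: raise each remaining score to the given power.
    # Note: in Python, a negative score with a fractional power is complex.
    if power != 1.0:
        result = {k: v ** power for k, v in result.items()}

    return result


if __name__ == "__main__":
    edges = {("a", "b"): 2.0, ("a", "c"): 4.0, ("b", "c"): 6.0}
    print(preprocess(edges, normalize=True, epsilon=0.5, power=2.0))
```

With these inputs, normalization maps the scores to 0, 0.5 and 1, the cutoff of 0.5 drops the first edge, and squaring leaves 0.25 and 1.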