Sleipnir
Clusters2Dab

Clusters2Dab converts the output of many common non-hierarchical clustering algorithms into pairwise scores based on the frequency or confidence of coclustering. For example, if each gene pair is scored by the number of times the two genes cocluster across many runs with different random seeds, a high score indicates a stronger pairwise relationship than a low score.
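
The frequency idea can be illustrated with a minimal Python sketch. It is not the Clusters2Dab implementation: the function name cocluster_counts, the in-memory representation of each run as a dictionary, and the gene IDs are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cocluster_counts(runs):
    """Count, for each gene pair, the number of runs in which both genes share a cluster.

    `runs` is a list of dicts mapping gene ID -> cluster label, one dict per
    clustering run (a hypothetical representation, not a Sleipnir format).
    """
    counts = Counter()
    for assignment in runs:
        # Group the genes of this run by cluster label.
        clusters = {}
        for gene, label in assignment.items():
            clusters.setdefault(label, []).append(gene)
        # Every unordered pair inside a cluster coclusters once in this run.
        for members in clusters.values():
            for pair in combinations(sorted(members), 2):
                counts[pair] += 1
    return counts


# Three hypothetical runs of a clustering algorithm with different seeds.
runs = [
    {"YFG1": 0, "YFG2": 0, "YFG3": 1},
    {"YFG1": 1, "YFG2": 1, "YFG3": 1},
    {"YFG1": 0, "YFG2": 1, "YFG3": 1},
]
counts = cocluster_counts(runs)
```

Here YFG1 and YFG2 cocluster in two of the three runs, so their count is higher than that of YFG1 and YFG3, which cocluster only once.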

Usage

Basic Usage

 Clusters2Dab -i <clusters.txt> -o <coclusters.dab>

Create a new DAT/DAB file coclusters.dab in which each gene pair is given a score of one if they cluster together in the hard clustering clusters.txt and a score of zero if they do not.
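
As a rough illustration of this scoring rule, the following Python sketch reads the default list format (one gene ID and one cluster index per line, separated by a tab) and assigns each pair a score of one or zero. The function names and the example gene IDs are hypothetical, and the sketch returns a plain dictionary rather than writing a DAT/DAB file.

```python
from itertools import combinations


def parse_list_clusters(lines):
    """Parse 'gene<TAB>cluster' lines into a dict mapping gene ID -> cluster index."""
    assignment = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        assignment[fields[0]] = fields[1]
    return assignment


def hard_pair_scores(assignment):
    """Score each unordered gene pair 1.0 if both genes share a cluster, else 0.0."""
    genes = sorted(assignment)
    return {
        (a, b): 1.0 if assignment[a] == assignment[b] else 0.0
        for a, b in combinations(genes, 2)
    }


# Hypothetical contents of a small clusters.txt.
example = ["G1\t0", "G2\t0", "G3\t1"]
scores = hard_pair_scores(parse_list_clusters(example))
```

With this input, only the pair (G1, G2) receives a score of one; the two pairs involving G3 receive zero.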

 Clusters2Dab -t samba -i <clusters.txt> -o <coclusters.dab>

Create a new DAT/DAB file coclusters.dab in which each gene pair is given a score equal to the confidence of their strongest shared SAMBA bicluster in clusters.txt, or zero if they share no bicluster.
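
The "strongest shared bicluster" rule can be sketched in Python as follows. This does not parse EXPANDER's SAMBA output, whose file layout is not described here; the biclusters are assumed to be already loaded as (confidence, gene set) tuples, confidences are assumed to be positive, and the function name is hypothetical.

```python
from itertools import combinations


def bicluster_pair_scores(biclusters):
    """Map each gene pair to the highest confidence among biclusters containing both genes.

    `biclusters` is a list of (confidence, set_of_gene_ids) tuples, an assumed
    in-memory form. Pairs that share no bicluster are absent from the result
    and should be treated as zero.
    """
    scores = {}
    for confidence, genes in biclusters:
        for pair in combinations(sorted(genes), 2):
            # Keep only the strongest bicluster seen so far for this pair.
            if confidence > scores.get(pair, 0.0):
                scores[pair] = confidence
    return scores


# Two hypothetical biclusters with made-up confidences.
biclusters = [
    (0.9, {"G1", "G2"}),
    (0.4, {"G1", "G2", "G3"}),
]
scores = bicluster_pair_scores(biclusters)
unshared = scores.get(("G1", "G4"), 0.0)  # G4 appears in no bicluster
```

G1 and G2 appear together in both biclusters, so they keep the larger confidence of 0.9, while pairs involving G3 receive 0.4.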

 Clusters2Dab -t param -i <clusters.txt> -o <coclusters.dab>

Create a new DAT/DAB file coclusters.dab in which each gene pair is given a score equal to the maximum parameter value in clusters.txt at which they cocluster.
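
A minimal Python sketch of the "maximum parameter value" rule is shown below. It assumes the clusterings have already been loaded as a dictionary from parameter value to gene assignments; this structure, the function name, and the parameter values are illustrative assumptions and do not reflect the EXPANDER file format.

```python
from itertools import combinations


def param_pair_scores(clusterings):
    """Map each gene pair to the largest parameter value at which the pair coclusters.

    `clusterings` maps a numeric parameter value to a dict of
    gene ID -> cluster label (an assumed in-memory form).
    """
    scores = {}
    for param, assignment in clusterings.items():
        # Group genes by cluster label at this parameter value.
        clusters = {}
        for gene, label in assignment.items():
            clusters.setdefault(label, []).append(gene)
        for members in clusters.values():
            for pair in combinations(sorted(members), 2):
                # Record the parameter only if it exceeds the best seen so far.
                if pair not in scores or param > scores[pair]:
                    scores[pair] = param
    return scores


# Hypothetical hard clusterings at two parameter settings.
clusterings = {
    0.5: {"G1": 0, "G2": 0, "G3": 0},
    0.8: {"G1": 0, "G2": 0, "G3": 1},
}
scores = param_pair_scores(clusterings)
```

G1 and G2 still cocluster at the stricter setting and score 0.8, whereas G3 joins them only at 0.5.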

Detailed Usage

package "Clusters2Dab"
version "1.0"
purpose "Generate pairwise scores from preclustered output."

section "Main"
option  "input"     i   "Input cluster file"
                        string  typestr="filename"
option  "output"    o   "Output DAT/DAB file"
                        string  typestr="filename"
option  "type"      t   "Type of input cluster"
                        values="samba","list","param","fuzzy"   default="list"

section "Optional"
option  "counts"    c   "Calculate pair weight by cocluster frequency"
                        flag    off
option  "size"      z   "Calculate pair weight by cluster size"
                        flag    off
option  "skip"      s   "Columns to skip in input PCL"
                        int default="2"
option  "verbosity" v   "Message verbosity"
                        int default="5"

| Flag | Default | Type | Description |
|------|---------|------|-------------|
| -i | stdin | Text file | Input cluster file in one of the text formats supported by -t. |
| -o | stdout | DAT/DAB file | Output DAT/DAB file containing pairwise scores appropriate to the input cluster type. |
| -t | list | list, samba, param, or fuzzy | Type of cluster file provided to -i. list assumes that each line contains a gene ID and a cluster index separated by a tab, samba reads biclustering output from the EXPANDER program by Sharan, Shamir, et al., param reads hard clustering output from EXPANDER, and fuzzy reads output from the Aerie fuzzy k-means program by Gasch, Eisen, et al. |
| -c | off | Flag | If on, calculate pairwise scores solely by cocluster frequency (counts); otherwise, pairwise scores are weighted by cluster confidence. |
| -s | 2 | Integer | Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL. |