Sleipnir
Combiner

Combiner combines things! Specifically, it combines PCLs into one large PCL, DAT/DABs into one averaged DAT/DAB, or DAT/DABs into a DAD dataset file. Multiple PCLs are combined into a single wide PCL by aligning genes' expression vectors and inserting missing values for genes not present in some dataset(s). DAT/DABs are combined into a single DAT/DAB by averaging pairwise scores in a configurable manner. DAT/DABs are combined into a DAD by ordering their individual scores as detailed in Sleipnir::CDatasetCompact::Save.

Usage

Basic Usage

 Combiner -t pcl -o <combined.pcl> <data.pcl>*

Create a new PCL file combined.pcl containing all genes in the microarray PCL files data.pcl, with new expression vectors consisting of the concatenation of all data from these input files. In other words, take the input PCLs, line up each gene's values, smoosh them all together, and plop the result into the output file.
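
The concatenation described above can be sketched in Python. This is an illustration only, not Sleipnir's implementation: the combine_pcls helper, the in-memory (width, rows) representation of a PCL, and the use of None for missing values are all assumptions made for the example.

```python
def combine_pcls(pcls):
    """Concatenate per-gene expression vectors from several datasets.

    Each dataset is a (width, rows) pair (a hypothetical representation):
    `width` is its number of experimental columns and `rows` maps a gene ID
    to that gene's list of values.
    """
    # Collect every gene seen in any input, preserving first-seen order.
    genes = []
    seen = set()
    for _, rows in pcls:
        for gene in rows:
            if gene not in seen:
                seen.add(gene)
                genes.append(gene)

    combined = {}
    for gene in genes:
        vector = []
        for width, rows in pcls:
            # Pad with missing values when a dataset lacks this gene.
            vector.extend(rows.get(gene, [None] * width))
        combined[gene] = vector
    return combined


first = (2, {"YAL001C": [0.1, 0.2], "YAL002W": [0.3, 0.4]})
second = (1, {"YAL002W": [0.5], "YAL003W": [0.6]})
combined = combine_pcls([first, second])
```

Every gene ends up with a vector as wide as the sum of the input widths, so a gene missing from one dataset still lines up with the others.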

 Combiner -t dat -o <combined.dab> -n <data.dab>*

Create a new DAT/DAB file combined.dab in which each gene pair's score is the average of the normalized (by z-scoring) scores from all input DAT/DAB files data.dab. The combination method can be modified using -m.
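
A minimal Python sketch of this averaging follows. It is not Sleipnir's code: the dict-of-pairs representation, the helper names, the use of the population standard deviation, and the choice to average each pair only over the inputs that contain it are assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore(scores):
    """Z-score one dataset's pairwise scores (population standard deviation)."""
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        # A constant dataset carries no information after centering.
        return {pair: 0.0 for pair in scores}
    return {pair: (value - mu) / sigma for pair, value in scores.items()}


def average_dats(dats, normalize=True):
    """Average each gene pair's score over the datasets that contain it."""
    if normalize:
        dats = [zscore(dat) for dat in dats]
    totals, counts = {}, {}
    for dat in dats:
        for pair, value in dat.items():
            totals[pair] = totals.get(pair, 0.0) + value
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: totals[pair] / counts[pair] for pair in totals}


# Gene pairs are keyed by alphabetically sorted tuples so (A, B) == (B, A).
first = {("A", "B"): 1.0, ("A", "C"): 3.0}
second = {("A", "B"): 2.0, ("B", "C"): 4.0}
averaged = average_dats([first, second])
raw_average = average_dats([first, second], normalize=False)
```

Normalizing first puts datasets with different score scales on a comparable footing before they are averaged.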

 Combiner -t dad -o <combined.dad> <data.dab>*

Create a new DAD file combined.dad containing the discretized gene pair scores from all input DAT/DAB files data.dab, which must be accompanied by appropriate QUANT files. This is equivalent to Dab2Dad.
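
The discretization step can be sketched in Python as below. The sketch assumes that a QUANT file supplies an increasing list of bin edges, that each edge is the inclusive upper bound of its bin, and that values above the last edge fall into the last bin; the discretize helper and the example edges are hypothetical and not taken from Sleipnir.

```python
from bisect import bisect_left


def discretize(value, edges):
    """Map a continuous score to a bin index given increasing bin edges."""
    # bisect_left returns the index of the first edge >= value, which is the
    # first bin whose inclusive upper bound admits the value.
    index = bisect_left(edges, value)
    # Values beyond the final edge are clamped into the last bin.
    return min(index, len(edges) - 1)


quant_edges = [-0.1, 0.3, 0.6]  # hypothetical contents of a QUANT file
bins = [discretize(v, quant_edges) for v in (-0.5, 0.3, 0.31, 0.9)]
```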

Detailed Usage

package "Combiner"
version "1.0"
purpose "PCL and data file combination tool"

section "Main"
option  "type"      t   "Type of combination to perform: pcl combines PCLs into a PCL, dat combines DAT/DABs into a DAT/DAB, and dad concatenates DAT/DABs into a DAD (This is equivalent to Dab2Dad)."
                        values="pcl","dat","dad","module","revdat"  default="pcl"
option  "method"    m   "Combination method"
                        values="min","max","mean","gmean","hmean","sum","diff","meta","qmeta"   default="mean"
option  "output"    o   "Output file"
                        string  typestr="filename"
option  "weights"   w   "Weights file"
                        string  typestr="filename"

section "Modules"
option  "jaccard"   j   "Minimum Jaccard index for module equivalence"
                        float   default="0.5"
option  "intersection"  r   "Minimum intersection fraction for module inheritance"
                        double  default="0.666"

section "Filtering"
option  "genes"     g   "Process only genes from the given set"
                        string  typestr="filename"
option  "terms"     e   "Produce DAT/DABs averaging within the provided terms"
                        string  typestr="filename"

section "Miscellaneous"
option  "reweight"  W   "Treat weights as absolute"
                        flag    off
option  "subset"    s   "Subset size (none if zero)"
                        int default="0"
option  "normalize" n   "Normalize inputs before combining"
                        flag    off
option  "quantiles" q   "Replace values with quantiles"
                        int default="0"
option  "zscore"    z   "Z-score output after combining (applies to dat combination type)"
                        flag    off
option  "zero"      Z   "Default missing values to zero"
                        flag    off

section "Optional"
option  "skip"      k   "Columns to skip in input PCLs"
                        int default="2"
option  "memmap"    p   "Memory map input files"
                        flag    off
option  "verbosity" v   "Message verbosity"
                        int default="5"
option  "directory" d   "input directory (must only contain input files)"
                        string  typestr="directory"

| Flag | Default | Type | Description |
|------|---------|------|-------------|
| None | None | PCL or DAT/DAB files | Input files to be combined; must all be of an appropriate type for the requested output. |
| -t | pcl | pcl, dat, or dad | Type of combination to perform: pcl combines PCLs into a PCL, dat combines DAT/DABs into a DAT/DAB, and dad combines DAT/DABs into a DAD. |
| -m | mean | mean, min, max, gmean, or hmean | Type of DAT/DAB combination to perform when combining pairwise scores. Options are to calculate the arithmetic, geometric, or harmonic mean, or to retain only the minimum or maximum value for each gene pair. |
| -o | stdout | PCL, DAT/DAB, or DAD file | Output file of the type specified by -t. |
| -k | 2 | Integer | Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL. |
| -p | off | Flag | If given, memory map the input files when possible. DAT and PCL inputs cannot be memmapped. |
| -n | off | Flag | If on, normalize input edges to the range [0,1] before processing. |
| -s | 0 | Integer | If nonzero, process input DAT/DABs in subsets of the requested size as described in Sleipnir::CDataSubset. |
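
The five -m options documented above (mean, gmean, hmean, min, max) can be illustrated for a single gene pair with a short Python sketch. The combine_scores helper and the METHODS table are hypothetical names for this example, and the standard-library functions stand in for whatever Sleipnir does internally; note that statistics.geometric_mean requires positive inputs.

```python
from statistics import geometric_mean, harmonic_mean, mean

# Map each documented -m value to a function over one pair's scores.
METHODS = {
    "mean": mean,             # arithmetic mean
    "gmean": geometric_mean,  # geometric mean (positive scores only)
    "hmean": harmonic_mean,   # harmonic mean
    "min": min,               # keep only the smallest score
    "max": max,               # keep only the largest score
}


def combine_scores(scores, method="mean"):
    """Combine one gene pair's scores from several datasets."""
    return METHODS[method](scores)


pair_scores = [0.2, 0.8]
results = {name: combine_scores(pair_scores, name) for name in METHODS}
```

For the scores 0.2 and 0.8 the arithmetic, geometric, and harmonic means are about 0.5, 0.4, and 0.32 respectively, which shows how the latter two pull the combined score toward the smaller input.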