Sleipnir
Combiner combines things! It can merge PCLs into one large PCL, DAT/DABs into one averaged DAT/DAB, or DAT/DABs into a DAD dataset file. Multiple PCLs are combined into a single wide PCL by aligning genes' expression vectors and inserting missing values for genes not present in some dataset(s). DAT/DABs are combined into a single DAT/DAB by averaging pairwise scores in a configurable manner. DAT/DABs are combined into a DAD by ordering their individual scores as detailed in Sleipnir::CDatasetCompact::Save.
Combiner -t pcl -o <combined.pcl> <data.pcl>*
Create a new PCL file combined.pcl containing all genes in the microarray PCL files data.pcl, with new expression vectors consisting of the concatenation of all data from these input files. In other words, take the input PCLs, line up each gene's values, smoosh them all together, and plop the result into the output file.
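The alignment described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Combiner's actual code: the function name `combine_pcls`, the in-memory `(column_names, {gene: values})` representation, and the use of NaN as the missing-value marker are all assumptions made for the example.

```python
import math

def combine_pcls(datasets):
    """Concatenate per-gene expression vectors across datasets.

    datasets: list of (column_names, {gene: [values]}) tuples
    (a hypothetical in-memory stand-in for parsed PCL files).
    """
    # Collect every gene seen in any input, preserving first-seen order.
    genes, seen = [], set()
    for _, rows in datasets:
        for gene in rows:
            if gene not in seen:
                seen.add(gene)
                genes.append(gene)

    # The combined header is simply all input columns in order.
    header = [col for cols, _ in datasets for col in cols]

    combined = {}
    for gene in genes:
        vector = []
        for cols, rows in datasets:
            # Genes absent from a dataset get one missing value per column.
            vector.extend(rows.get(gene, [math.nan] * len(cols)))
        combined[gene] = vector
    return header, combined

a = (["a1", "a2"], {"G1": [1.0, 2.0], "G2": [3.0, 4.0]})
b = (["b1"], {"G2": [5.0], "G3": [6.0]})
header, combined = combine_pcls([a, b])
```

Here `G2` appears in both inputs and ends up with three values, while `G1` and `G3` are padded with missing values for the dataset that lacks them.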
Combiner -t dat -o <combined.dab> -n <data.dab>*
Create a new DAT/DAB file combined.dab in which each gene pair's score is the average of the normalized (by z-scoring) scores from all input DAT/DAB files data.dab. The combination method can be modified using -m.
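As a rough illustration of averaging z-scored pairwise scores, consider the following Python sketch. It is not Sleipnir's implementation: the dictionary representation keyed by gene pair, the helper names `zscore` and `average_dats`, the use of the population standard deviation, and the choice to average only over the inputs that contain a given pair are all assumptions.

```python
from statistics import mean, pstdev

def zscore(scores):
    """Z-score all pairwise scores within one dataset."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # Constant dataset: every score sits exactly at the mean.
        return {pair: 0.0 for pair in scores}
    return {pair: (v - mu) / sigma for pair, v in scores.items()}

def average_dats(dats, normalize=True):
    """Average each gene pair's score over the datasets that contain it."""
    if normalize:
        dats = [zscore(d) for d in dats]
    sums, counts = {}, {}
    for d in dats:
        for pair, v in d.items():
            sums[pair] = sums.get(pair, 0.0) + v
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: sums[pair] / counts[pair] for pair in sums}

# Gene pairs are stored as sorted tuples so (A, B) and (B, A) coincide.
d1 = {("A", "B"): 1.0, ("A", "C"): 3.0}
d2 = {("A", "B"): 10.0, ("B", "C"): 20.0}
combined = average_dats([d1, d2])
```

Normalizing first keeps a dataset with large raw scores (like `d2`) from dominating the average.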
Combiner -t dad -o <combined.dad> <data.dab>*
Create a new DAD file combined.dad containing the discretized gene pair scores from all input DAT/DAB files data.dab, which must be accompanied by appropriate QUANT files. This is equivalent to Dab2Dad.
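To make "discretized" concrete, here is a small Python sketch of binning a continuous score against a list of bin edges. It assumes that a QUANT file supplies a sorted list of upper bin edges and that values above the last edge fall into the last bin; that convention, the `discretize` helper, and the example edges are assumptions for illustration and are not specified on this page.

```python
from bisect import bisect_left

def discretize(value, edges):
    """Return the index of the first bin whose upper edge is >= value.

    edges: sorted upper bin edges (assumed QUANT-style contents).
    Values beyond the last edge are clamped into the last bin.
    """
    return min(bisect_left(edges, value), len(edges) - 1)

edges = [0.0, 0.5, 1.0]  # hypothetical edges for one dataset
bins = [discretize(v, edges) for v in (-1.0, 0.3, 0.5, 0.9, 2.0)]
```

Each input DAT/DAB would use its own edges, so the same raw score can land in different bins in different datasets.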
package "Combiner"
version "1.0"
purpose "PCL and data file combination tool"
section "Main"
option "type" t "Type of combination to perform: pcl combines PCLs into a PCL, dat combines DAT/DABs into a DAT/DAB, and dad concatenates DAT/DABs into a DAD (This is equivalent to Dab2Dad)."
values="pcl","dat","dad","module","revdat" default="pcl"
option "method" m "Combination method"
values="min","max","mean","gmean","hmean","sum","diff","meta","qmeta" default="mean"
option "output" o "Output file"
string typestr="filename"
option "weights" w "Weights file"
string typestr="filename"
section "Modules"
option "jaccard" j "Minimum Jaccard index for module equivalence"
float default="0.5"
option "intersection" r "Minimum intersection fraction for module inheritance"
double default="0.666"
section "Filtering"
option "genes" g "Process only genes from the given set"
string typestr="filename"
option "terms" e "Produce DAT/DABs averaging within the provided terms"
string typestr="filename"
section "Miscellaneous"
option "reweight" W "Treat weights as absolute"
flag off
option "subset" s "Subset size (none if zero)"
int default="0"
option "normalize" n "Normalize inputs before combining"
flag off
option "quantiles" q "Replace values with quantiles"
int default="0"
option "zscore" z "Z-score output after combining (applies to dat combination type)"
flag off
option "zero" Z "Default missing values to zero"
flag off
section "Optional"
option "skip" k "Columns to skip in input PCLs"
int default="2"
option "memmap" p "Memory map input files"
flag off
option "verbosity" v "Message verbosity"
int default="5"
option "directory" d "input directory (must only contain input files)"
string typestr="directory"
Flag | Default | Type | Description |
---|---|---|---|
None | None | PCL or DAT/DAB files | Input files to be combined; must all be of an appropriate type for the requested output. |
-t | pcl | pcl, dat, or dad | Type of combination to perform: pcl combines PCLs into a PCL, dat combines DAT/DABs into a DAT/DAB, and dad combines DAT/DABs into a DAD. |
-m | mean | mean, min, max, gmean, or hmean | Type of DAT/DAB combination to perform when combining pairwise scores. Options are to calculate the arithmetic, geometric, or harmonic mean, or to retain only the minimum or maximum value for each gene pair. |
-o | stdout | PCL, DAT/DAB, or DAD file | Output file of the type specified by -t. |
-k | 2 | Integer | Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL. |
-p | off | Flag | If given, memory map the input files when possible. DAT and PCL inputs cannot be memmapped. |
-n | off | Flag | If on, normalize input edges to the range [0,1] before processing. |
-s | 0 | Integer | If nonzero, process input DAT/DABs in subsets of the requested size as described in Sleipnir::CDataSubset. |
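
The -m choices listed in the table (arithmetic, geometric, or harmonic mean, minimum, or maximum) can be illustrated for a single gene pair with Python's standard library. This is a sketch of the arithmetic only, not Combiner's code; the `METHODS` table and `combine_pair` helper are hypothetical names, and the geometric and harmonic means as computed here require positive scores.

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Map each -m style name to a function over one gene pair's scores.
METHODS = {
    "mean": fmean,
    "gmean": geometric_mean,   # requires positive inputs
    "hmean": harmonic_mean,    # requires non-negative inputs
    "min": min,
    "max": max,
}

def combine_pair(scores, method="mean"):
    """Combine one gene pair's scores from several datasets."""
    return METHODS[method](scores)

scores = [1.0, 4.0]  # the same gene pair's score in two datasets
results = {name: combine_pair(scores, name) for name in METHODS}
```

For these two scores the arithmetic mean is the largest of the three means and the harmonic mean the smallest, which is why the choice of method matters when inputs disagree.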