Sleipnir
Dab2Dad

Dab2Dad combines multiple DAT/DAB files, possibly including a gold standard answer file in addition to multiple datasets, into a DAD file as described by Sleipnir::CDatasetCompact.

Usage

Basic Usage

 Dab2Dad -o <data.dad> <data.dab>*

Create an output DAD file data.dad containing the gene pairs and values from all of the DAT/DAB files data.dab, each of which must be accompanied by a corresponding QUANT file.
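Because every input DAT/DAB needs a QUANT file, it can help to check for them before building the DAD. The following is only a sketch under two assumptions that are not stated on this page: that a QUANT file shares its data file's base name with a .quant extension, and that it holds a single tab-delimited line of increasing bin edges. The file names (expr.dab, ppi.dab) and the edge values are made up for illustration.

```shell
#!/bin/sh
# Work in a scratch directory so no real data is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Empty stand-ins for real binary DAB files (hypothetical names).
: > expr.dab
: > ppi.dab

# ASSUMPTION: a QUANT file is one line of tab-delimited, increasing bin edges.
printf '%s\t%s\t%s\n' -0.5 0.5 1.5 > expr.quant

# Print the expected QUANT path for every data file that lacks one.
# ASSUMPTION: the companion file is named <basename>.quant.
missing_quants() {
    for f in "$@"; do
        q="${f%.*}.quant"
        [ -f "$q" ] || printf '%s\n' "$q"
    done
}

missing=$(missing_quants expr.dab ppi.dab)
echo "missing QUANT files: $missing"
```

Here only expr.dab has a companion file, so the helper reports ppi.quant as missing.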

 Dab2Dad -o <data.dad> -n <network.xdsl> -w <answers.dab> <data.dab>*

Create an output dataset DAD file data.dad appropriate for use in training or evaluating Bayesian networks with the structure network.xdsl, using the gold standard answers in answers.dab and the data in the DAT/DAB files data.dab.
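When a network is supplied, the option listing further down says that -d names the directory holding the DAT/DAB files that correspond to the network's node IDs. A quick pre-flight check along those lines might look like the sketch below. It assumes the files are named after the node ID with a .dab extension, and it takes the node IDs as hand-typed arguments rather than parsing them from the (X)DSL file; the IDs shown are hypothetical.

```shell
#!/bin/sh
# Print each node ID that has no matching data file in the given directory.
# ASSUMPTION: data files are named <node id>.dab.
missing_nodes() {
    dir=$1
    shift
    for id in "$@"; do
        [ -f "$dir/$id.dab" ] || printf '%s\n' "$id"
    done
}

# Scratch directory with two empty stand-in data files (hypothetical IDs).
datadir=$(mktemp -d)
: > "$datadir/coexpression.dab"
: > "$datadir/interaction.dab"

absent=$(missing_nodes "$datadir" coexpression interaction localization)
echo "no data file for: $absent"
```

Running such a check first turns a missing-input problem into a short list of node IDs to fix before Dab2Dad is invoked.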

 Dab2Dad -i <data.dad>

Output (to standard output) the contents of data.dad as a tab-delimited text table.

 Dab2Dad -i <data.dad> -m -l <gene1> -L <gene2>

Open data.dad using memory mapping and output all values stored for the gene pair gene1 and gene2.

Detailed Usage

package "Dab2Dad"
version "1.0"
purpose "Single/multiple data file interconversion"

defgroup "Input-Output"
groupoption "input"     i   "Input DAD file"
                            string  typestr="filename"  group="Input-Output"
groupoption "load"      a   "Persistent load DAD file"
                            string  typestr="filename"  group="Input-Output"
groupoption "network"   n   "Input Bayesian network (X)DSL"
                            string  typestr="filename"  group="Input-Output"

section "Main"
option  "output"        o   "Output DAD file"
                            string  typestr="filename"
option  "answers"       w   "Answer DAT/DAB file"
                            string  typestr="filename"
option  "directory"     d   "Directory with DAB files"
                            string  typestr="directory" default="."

section "Miscellaneous"
option  "everything"    e   "Include pairs without answers"
                            flag    off
option  "continuous"    c   "Output continuous values"
                            flag    off

section "Learning/Evaluation"
option  "genes"         g   "Gene inclusion file"
                            string  typestr="filename"
option  "genex"         G   "Gene exclusion file"
                            string  typestr="filename"

section "Lookups"
option  "lookup1"       l   "First lookup gene"
                            string
option  "lookup2"       L   "Second lookup gene"
                            string
option  "lookups"       t   "Lookup gene set"
                            string  typestr="filename"
option  "lookupp"       T   "Lookup pair set"
                            string  typestr="filename"
option  "quantize"      q   "Discretize lookups"
                            flag    off
option  "paircount"     P   "Only count pairs above cutoff"
                            int default="-1"

section "Optional"
option  "mask"          k   "Mask DAT/DAB file"
                            string  typestr="filename"
option  "memmap"        m   "Memory map input/output"
                            flag    off
option  "verbosity"     v   "Message verbosity"
                            int default="5"

Flag | Default | Type | Description
None | None | DAT/DAB files | Input DAT/DAB files to be combined into a DAD.
-i | None | DAD file | If given, load this binary DAD file and save it as text to standard output or to -o.
-a | None | DAD file | If given, load this binary DAD file and retain a shared memory map (can theoretically be used to force a DAD into virtual memory for rapid access by other processes).
-n | None | (X)DSL file | If given, combine DAT/DABs from -w and -d into a DAD file appropriate for use with the given Bayes net. Discretization and node order information will be read from the given network.
-o | stdout | DAD file | If given, output DAD is written as binary to the requested file.
-w | None | DAT/DAB file | Must be given with -n; indicates an answer file DAT/DAB to be included in the DAD along with datasets from the command line.
-d | None | Directory | Must be given with -n; indicates the directory in which DAT/DAB files corresponding to the input network's node IDs will be found.
-e | off | Flag | If on, output DAD will include data for all gene pairs; if off, it will only contain data for gene pairs with a value present in the answer file.
-g | None | Text gene list | If given, use only gene pairs for which both genes are in the list. For details, see Sleipnir::CDat::FilterGenes.
-G | None | Text gene list | If given, use only gene pairs for which neither gene is in the list. For details, see Sleipnir::CDat::FilterGenes.
-l | None | String | If given, look up all values for pairs involving the requested gene.
-L | None | String | If given with -l, look up all values for the requested gene pair.
-t | None | Gene text file | If given with -l, look up all pairs between -l and the given gene set. If given alone, look up all pairs between genes in the given set.
-T | None | Gene pair text file | Tab-delimited text file containing one pair of gene IDs per row. If given, look up values for all requested pairs.
-q | off | Flag | If on, quantize data from input DAT/DABs before outputting lookup results.
-k | None | DAT/DAB file | If given, process only gene pairs present in the given DAT/DAB with a score greater than zero.
-m | off | Flag | If given, memory map the input files when possible. DAT and PCL inputs cannot be memory mapped.
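
The lookup options -t and -T read plain text files, which are easy to generate from a script. The sketch below writes one of each. The -T layout (one tab-delimited pair per row) follows the description above; the one-ID-per-line layout for the -t gene set is an assumption, the gene IDs are placeholders, and combining -i with -T is inferred from the -l/-L example rather than confirmed. The Dab2Dad call is guarded so the script still runs where the tool or data.dad is absent.

```shell
#!/bin/sh
workdir=$(mktemp -d)

# Gene set for -t. ASSUMPTION: one gene ID per line.
printf '%s\n' YAL001C YBR002W YCR003W > "$workdir/genes.txt"

# Pair set for -T: one tab-delimited pair of gene IDs per row.
# printf reuses the format for each pair of arguments, giving two rows.
printf '%s\t%s\n' YAL001C YBR002W YAL001C YCR003W > "$workdir/pairs.txt"

# Run the lookup only if both the tool and a DAD file are available.
if command -v Dab2Dad >/dev/null 2>&1 && [ -f data.dad ]; then
    Dab2Dad -i data.dad -T "$workdir/pairs.txt"
else
    echo "Dab2Dad or data.dad not found; wrote lookup files only"
fi

pair_rows=$(wc -l < "$workdir/pairs.txt" | tr -d ' ')
echo "pair rows written: $pair_rows"
```

Keeping the pair list in a file rather than issuing one -l/-L call per pair means the DAD only has to be opened once for the whole batch.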