Sleipnir
BNUnraveler is a multithreaded tool for performing inference on many context-specific Bayesian classifiers to produce predicted functional relationship networks. It is generally paired with BNWeaver to learn classifiers from data and then infer context-specific functional relationships.
BNUnraveler -i <contexts_dir> -d <data_dir> -o <networks_dir> [-t <threads>] <contexts.txt>*
For each biological context contexts.txt, load a context-specific Bayesian classifier (of the same name) from contexts_dir and use Bayesian inference with the data (named identically to the node IDs in the classifier) from data_dir to create a context-specific functional relationship network in networks_dir; optionally, use threads parallel threads.
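The invocation above can be assembled programmatically. The following is a minimal Python sketch, not part of Sleipnir: the helper name `build_bnunraveler_command` is hypothetical, and it assumes the default XDSL input mode, so each context file is expected to have a matching `.xdsl` classifier in the contexts directory. It only builds the argument list and checks the filename pairing; it does not run the tool.

```python
from pathlib import Path


def build_bnunraveler_command(contexts_dir, data_dir, networks_dir, context_files,
                              threads=None, executable="BNUnraveler"):
    """Return the argv list for one BNUnraveler run (hypothetical helper).

    Raises FileNotFoundError if a context file has no matching .xdsl
    classifier in contexts_dir (assumes the default XDSL input mode).
    """
    contexts_dir = Path(contexts_dir)
    missing = []
    for context_file in context_files:
        # "mitotic_cell_cycle.txt" must pair with "mitotic_cell_cycle.xdsl".
        classifier = contexts_dir / (Path(context_file).stem + ".xdsl")
        if not classifier.is_file():
            missing.append(str(classifier))
    if missing:
        raise FileNotFoundError("missing classifiers: " + ", ".join(missing))

    argv = [executable, "-i", str(contexts_dir), "-d", str(data_dir),
            "-o", str(networks_dir)]
    if threads is not None:
        argv += ["-t", str(threads)]
    argv += [str(f) for f in context_files]
    return argv
```

The returned list can be passed to `subprocess.run(argv, check=True)` on a machine where the BNUnraveler executable is on the PATH.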
package "BNUnraveler"
version "1.0"
purpose "Bayes net evaluation from data"
section "Main"
option "input" i "Input (X)DSL file directory"
string typestr="directory" default="."
option "directory" d "Data directory"
string typestr="directory" default="."
option "output" o "Output directory"
string typestr="directory" default="."
section "Miscellaneous"
option "everything" e "Evaluate non-term pairs"
flag on
option "answers" w "Answer file"
string typestr="filename"
section "Learning/Evaluation"
option "genes" g "Gene inclusion file"
string typestr="filename"
option "genome" G "Gene list of interest"
string typestr="filename"
section "Network Features"
option "zero" z "Zero missing values"
flag off
option "zeros" Z "Read zeroed node IDs/outputs from the given file"
string typestr="filename"
section "Optional"
option "memmap" m "Memory map input files"
flag off
option "threads" t "Maximum number of threads to spawn"
int default="-1"
option "xdsl" x "Assume XDSL input rather than DSL"
flag on
option "group" u "Group identical inputs"
flag on
option "verbosity" v "Message verbosity"
int default="5"
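The file read by the "zeros" option is a tab-delimited text file with two columns: node ID and zero-indexed bin number. As an illustration, the Python sketch below writes and reads such a file. The function names and the node IDs in the usage note are hypothetical, and the sketch assumes one pair per line with no header row.

```python
import csv


def write_zeros_file(path, bins_by_node):
    """Write node ID / bin number pairs as two tab-separated columns."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        for node_id, bin_number in bins_by_node.items():
            if bin_number < 0:
                raise ValueError(f"bin numbers are zero-indexed, got {bin_number}")
            writer.writerow([node_id, bin_number])


def read_zeros_file(path):
    """Read the same format back into a {node_id: bin_number} dict."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        # Skip blank lines; each remaining row is [node_id, bin_number].
        return {row[0]: int(row[1]) for row in reader if row}
```

For example, `write_zeros_file("zeros.txt", {"dataset_a": 0, "dataset_b": 1})` produces a file that could then be given to `-Z`, provided the node IDs match those in the classifiers.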
Flag | Default | Type | Description |
---|---|---|---|
None | None | Gene text files | Gene sets representing biological contexts (sets of related genes) for which Bayesian classifiers have been learned. Must have filenames corresponding to the (X)DSL files to be loaded, e.g. if "mitotic_cell_cycle.txt" is given on the command line, "mitotic_cell_cycle.xdsl" must exist in the -i directory. |
-i | . | Directory | Directory from which (X)DSL Bayesian classifier files are read. Must be naive classifiers with identical structure (but presumably different parameters). |
-o | . | Directory | Directory into which inferred functional relationship networks (DAB files) are placed. |
-d | . | Directory | Directory from which data files are read. Must be DAT/DAB files with names identical to the nodes of the Bayesian classifiers. |
-e | on | Flag | If on, predict relationship probabilities for all genes with any data, regardless of context. |
-w | None | DAT/DAB file | If given, predict relationship probabilities only for gene pairs in the given answer file. |
-g | None | Text gene list | If given, predict relationship probabilities for all gene pairs for which both genes are in the list (regardless of context). |
-G | None | Text gene list | If given, predict relationship probabilities only for gene pairs for which both genes are in the list (in addition to context filtering). |
-z | off | Flag | If on, assume that all missing gene pairs in all datasets have a value of 0 (i.e. the first bin). |
-Z | None | Tab-delimited text file | If given, argument must be a tab-delimited text file containing two columns, the first node IDs (see BNCreator) and the second bin numbers (zero indexed). For each node ID present in this file, missing values will be substituted with the given bin number. |
-m | off | Flag | If given, memory map the input files when possible. DAT and PCL inputs cannot be memmapped. |
-t | 1 | Integer | Number of simultaneous threads to use for individual CPT inferences. Threads are per classifier node (dataset), so the number of threads actually used is the minimum of -t and the number of datasets. |
-x | on | Flag | If on, assume XDSL files will be used instead of DSL files. |
-u | on | Flag | If on, group identical examples into one heavily weighted example. This greatly improves efficiency, and there's essentially never a reason to deactivate it. |
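
To illustrate why grouping identical inputs (`-u`) helps, the Python sketch below collapses identical discretized examples and evaluates a binary naive Bayes posterior once per unique example. This is a simplified illustration under stated assumptions, not Sleipnir's implementation: all function names are hypothetical, each example is a tuple of bin indices (one per dataset), and a missing value is represented as `None` and simply skipped.

```python
from collections import Counter


def group_examples(examples):
    """Collapse identical examples into (example, weight) pairs."""
    return list(Counter(examples).items())


def naive_bayes_posterior(example, prior, cpts):
    """P(related | example) for a binary-class naive Bayes classifier.

    prior: P(related)
    cpts:  per dataset, a pair (P(bin | unrelated), P(bin | related)),
           each a list indexed by bin number.
    """
    p_rel, p_unrel = prior, 1.0 - prior
    for value, (given_unrel, given_rel) in zip(example, cpts):
        if value is None:  # missing data contributes no evidence
            continue
        p_rel *= given_rel[value]
        p_unrel *= given_unrel[value]
    return p_rel / (p_rel + p_unrel)


def infer_grouped(examples, prior, cpts):
    """Run inference once per unique example and fan results back out."""
    cache = {ex: naive_bayes_posterior(ex, prior, cpts)
             for ex, _weight in group_examples(examples)}
    return [cache[ex] for ex in examples]
```

Because data values are discretized into a small number of bins, many gene pairs share the same input vector, so the number of unique examples is typically far smaller than the number of pairs.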