Sleipnir
DataDumper opens an answer file, multiple datasets, and a Bayesian network and displays the data exactly as it would be provided to the network for Bayesian learning or evaluation. The output shows which genes, gene pairs, and data values are used for learning under different circumstances.
DataDumper -w <answers.dab> <data.dab>*
Output (to standard output) the discretized answers for all gene pairs in answers.dab with data from the DAT/DAB files data.dab (and associated QUANT files) exactly as it would be used in Bayesian learning. In combination with the other command line arguments, this allows a user to see exactly what data is being used for learning/evaluation under specific circumstances, e.g. context-specific learning, a holdout test set, etc.
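To make "discretized" concrete, here is a minimal Python sketch of how a continuous pair score could be mapped to a bin using the edges from a QUANT file. It assumes that a QUANT file holds one line of tab-separated, ascending bin upper edges, and that a value falls into the first bin whose edge is greater than or equal to it, with larger values clamped to the last bin. Neither detail is stated on this page, and the function names are illustrative rather than part of Sleipnir.

```python
import math
from bisect import bisect_left


def parse_quant_line(line):
    """Parse one line of tab-separated bin edges (assumed QUANT layout)."""
    return [float(field) for field in line.strip().split("\t")]


def discretize(value, edges):
    """Return the zero-indexed bin for value, or None if it is missing."""
    if value is None or math.isnan(value):
        return None  # missing pairs stay missing unless -z or -Z is used
    # Index of the first edge >= value; values past the last edge are clamped.
    return min(bisect_left(edges, value), len(edges) - 1)


# Hypothetical edges and scores, chosen only for illustration.
edges = parse_quant_line("-0.5\t0.5\t1.5\n")
for score in (-1.0, 0.2, 0.5, 3.0, float("nan")):
    print(score, "->", discretize(score, edges))
```

Under these assumptions a score equal to an edge lands in that edge's bin, and a missing score (NaN) is reported as missing rather than assigned to bin 0.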
package "DataDumper"
version "1.0"
purpose "Examination of data used for large scale Bayes net learning"
section "Main"
option "answers" w "Answer file"
string typestr="filename" yes
section "Learning/Evaluation"
option "genes" g "Gene inclusion file"
string typestr="filename"
option "genex" G "Gene exclusion file"
string typestr="filename"
option "genet" c "Term inclusion file"
string typestr="filename"
section "Network Features"
option "zero" z "Zero missing values"
flag off
option "zeros" Z "Read zeroed node IDs/outputs from the given file"
string typestr="filename"
section "Optional"
option "verbosity" v "Message verbosity"
int default="5"
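As a reading aid, the option specification above can be mirrored with Python's `argparse`. This is only a sketch of the same flags, defaults, and required status; DataDumper itself is not written this way, and the positional name `datasets` and the example file names are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build an argparse parser that mirrors the option spec above."""
    parser = argparse.ArgumentParser(prog="DataDumper")
    # Section "Main": the answer file is the only required option.
    parser.add_argument("-w", "--answers", metavar="filename", required=True,
                        help="Answer file")
    # Section "Learning/Evaluation": optional gene and term filters.
    parser.add_argument("-g", "--genes", metavar="filename",
                        help="Gene inclusion file")
    parser.add_argument("-G", "--genex", metavar="filename",
                        help="Gene exclusion file")
    parser.add_argument("-c", "--genet", metavar="filename",
                        help="Term inclusion file")
    # Section "Network Features": handling of missing values.
    parser.add_argument("-z", "--zero", action="store_true",
                        help="Zero missing values")
    parser.add_argument("-Z", "--zeros", metavar="filename",
                        help="Read zeroed node IDs/outputs from the given file")
    # Section "Optional".
    parser.add_argument("-v", "--verbosity", type=int, default=5,
                        help="Message verbosity")
    # Unflagged arguments: the DAT/DAB dataset files (name is an assumption).
    parser.add_argument("datasets", nargs="*", metavar="data.dab",
                        help="Input DAT/DAB files")
    return parser


args = build_parser().parse_args(
    ["-w", "answers.dab", "-z", "data1.dab", "data2.dab"])
print(args.answers, args.zero, args.verbosity, args.datasets)
```

Parsing the sample command line leaves the verbosity at its default of 5 and collects the two unflagged file names as the datasets.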
Flag | Default | Type | Description |
---|---|---|---|
None | None | DAT/DAB files | Input DAT/DAB files from which data is drawn for display in the output. |
-w | None | DAT/DAB file | Functional gold standard for learning. Should consist of gene pairs with scores of 0 (unrelated), 1 (related), or missing (NaN). |
-g | None | Text gene list | If given, use only gene pairs for which both genes are in the list. For details, see Sleipnir::CDat::FilterGenes. |
-G | None | Text gene list | If given, use only gene pairs for which neither gene is in the list. For details, see Sleipnir::CDat::FilterGenes. |
-c | None | Text gene list | If given, use only gene pairs passing a "term" filter against the list. For details, see Sleipnir::CDat::FilterGenes. |
-z | off | Flag | If on, assume that all missing gene pairs in all datasets have a value of 0 (i.e. the first bin). |
-Z | None | Tab-delimited text file | If given, the argument must be a tab-delimited text file with two columns: node IDs in the first and zero-indexed bin numbers in the second. For each node ID present in this file, missing values will be substituted with the given bin number. |
-v | 5 | Integer | Message verbosity. |
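
The gene filters and missing-value options in the table can be summarized in a short Python sketch. It follows the descriptions of `-g`, `-G`, `-z`, and `-Z` given above, but the function names, the gene and node IDs, and the choice to let a per-node `-Z` entry take precedence over `-z` are assumptions made for illustration, not documented DataDumper behavior. The `-c` term filter is omitted because its rule is defined in Sleipnir::CDat::FilterGenes rather than on this page.

```python
def keep_pair(gene_a, gene_b, include=None, exclude=None):
    """Apply the -g (both genes listed) and -G (neither gene listed) rules."""
    if include is not None and not (gene_a in include and gene_b in include):
        return False
    if exclude is not None and (gene_a in exclude or gene_b in exclude):
        return False
    return True


def parse_zeros(lines):
    """Read 'node ID<TAB>bin number' lines (the -Z layout) into a dict."""
    defaults = {}
    for line in lines:
        if line.strip():
            node_id, bin_number = line.rstrip("\n").split("\t")
            defaults[node_id] = int(bin_number)
    return defaults


def fill_missing(node_id, bin_value, zero_all=False, zeros=None):
    """Substitute a bin for a missing value (None) following -z and -Z."""
    if bin_value is not None:
        return bin_value  # observed values are never overwritten
    if zeros and node_id in zeros:
        return zeros[node_id]  # assumed: a -Z entry wins over -z
    return 0 if zero_all else None  # -z maps missing values to the first bin


# Hypothetical gene names and node IDs.
zeros = parse_zeros(["coexpression\t1\n"])
print(keep_pair("YAL001C", "YAL002W", include={"YAL001C", "YAL002W"}))
print(keep_pair("YAL001C", "YAL002W", exclude={"YAL002W"}))
print(fill_missing("coexpression", None, zeros=zeros))
print(fill_missing("binding", None, zero_all=True))
print(fill_missing("binding", None))
```

With neither `-z` nor a matching `-Z` entry, a missing value stays missing, which is the case the last call illustrates.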