Sleipnir
Explainer

Explainer displays information on the best- and/or worst-predicted gene pairs in a functional relationship network or, equivalently, the most and least similar gene pairs in two experimental datasets. This can be used to analyze predictions (to understand where errors are coming from) or to analyze experimental data (to see where and why laboratory results disagree with each other or with a gold standard).

Usage

Basic Usage

 Explainer -i <predictions.dab> -w <answers.dab> [-f SGD_features.tab]

Output (to standard output) all gene pairs in predictions.dab as compared to answers.dab, sorted from greatest to least difference; the optional SGD_features.tab file can make this output more informative for networks containing yeast genes.
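
The ranking described above can be illustrated with a minimal Python sketch. It does not read DAB files or call Sleipnir. The dictionaries, the gene names, the function name rank_pairs, and the use of absolute difference as the error measure are all assumptions made for illustration, since the exact scoring is not specified here.

```python
def rank_pairs(predictions, answers, reverse=False, count=-1):
    """Rank gene pairs by |prediction - answer| (an assumed error measure).

    predictions, answers: dicts mapping a frozenset of two gene names to a value.
    Only pairs present in both dicts are ranked, mirroring the default
    behaviour when -e is off.
    """
    rows = []
    for pair, answer in answers.items():
        if pair in predictions:
            diff = abs(predictions[pair] - answer)
            rows.append((diff, tuple(sorted(pair)), predictions[pair], answer))
    # Greatest difference first by default; reverse=True mimics -r.
    sign = 1 if reverse else -1
    rows.sort(key=lambda row: (sign * row[0], row[1]))
    # A negative count keeps every pair, like the default -k -1.
    return rows if count < 0 else rows[:count]


# Hypothetical gene names and values, for illustration only.
predictions = {
    frozenset(("GENE_A", "GENE_B")): 0.875,
    frozenset(("GENE_A", "GENE_C")): 0.75,
    frozenset(("GENE_B", "GENE_C")): 0.5,
    frozenset(("GENE_C", "GENE_D")): 0.25,  # no answer, so it is skipped
}
answers = {
    frozenset(("GENE_A", "GENE_B")): 1,
    frozenset(("GENE_A", "GENE_C")): 0,
    frozenset(("GENE_B", "GENE_C")): 1,
}

for diff, pair, predicted, answer in rank_pairs(predictions, answers):
    print(pair[0], pair[1], predicted, answer, diff, sep="\t")
```

In this toy data the confidently predicted negative pair (GENE_A, GENE_C) ranks first, which is the kind of pair one would inspect when looking for the source of prediction errors.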

Detailed Usage

package "Explainer"
version "1.0"
purpose "Allows evaluation of genes contributing to a good prediction score"

section "Main"
option  "input"         i   "Similarity DAT/DAB file"
                            string  typestr="filename"  yes
option  "answers"       w   "Answer DAT/DAB file"
                            string  typestr="filename"  yes
option  "mode"          d   "Sort mode"
                            values="diff","data","answer"   default="diff"

section "Miscellaneous"
option  "count"         k   "Number of pairs to display"
                            int default="-1"
option  "positives"     p   "Include only positive pairs"
                            flag    off
option  "negatives"     P   "Include only negative pairs"
                            flag    off
option  "everything"    e   "Include pairs without answers"
                            flag    off
option  "unknowns"      u   "Treatment of unknown genes"
                            values="exclude","include","only"   default="exclude"
option  "fraction"      x   "Random fraction of results to calculate"
                            double  default="1"

section "Learning/Evaluation"
option  "genes"         g   "Gene inclusion file"
                            string  typestr="filename"
option  "genex"         G   "Gene exclusion file"
                            string  typestr="filename"
option  "genet"         R   "Term inclusion file"
                            string  typestr="filename"
option  "genee"         C   "Edge inclusion file"
                            string  typestr="filename"

section "Preprocessing"
option  "normalize"     n   "Normalize to the range [0,1]"
                            flag    off
option  "invert"        t   "Invert correlations to distances"
                            flag    off
option  "reverse"       r   "Reverse sort order"
                            flag    off

section "Function Catalogs"
option  "go_onto"       o   "GO ontology"
                            string  typestr="filename"
option  "go_anno"       a   "GO annotations"
                            string  typestr="filename"
option  "features"      f   "SGD gene features"
                            string  typestr="filename"

section "Optional"
option  "memmap"        m   "Memory map input files"
                            flag    off
option  "config"        c   "Command line config file"
                            string  typestr="filename"  default="Explainer.ini"
option  "verbosity"     v   "Message verbosity"
                            int default="5"

Flag  Default  Type                       Description
-i    stdin    DAT/DAB file               Input DAT/DAB file to be compared against a gold standard answer file.
-w    None     DAT/DAB file               Gold standard answer DAT/DAB file against which input predictions/data are compared.
-k    -1       Integer                    Number of gene pair comparisons to output; -1 displays all pairs.
-p    off      Flag                       If on, output only gene pairs marked as positive (1) in the gold standard.
-e    off      Flag                       If on, output all gene pairs with any data; if off, only output gene pairs present in the gold standard.
-u    exclude  exclude, include, or only  If exclude, do not output any gene pairs containing an uncharacterized gene. If include, allow gene pairs containing uncharacterized genes. If only, output only gene pairs containing uncharacterized genes.
-x    1        Double                     Randomly subsample the requested fraction of the possible output data.
-g    None     Text gene list             If given, use only gene pairs for which both genes are in the list. For details, see Sleipnir::CDat::FilterGenes.
-G    None     Text gene list             If given, use only gene pairs for which neither gene is in the list. For details, see Sleipnir::CDat::FilterGenes.
-R    None     Text gene list             If given, use only gene pairs passing a "term" filter against the list. For details, see Sleipnir::CDat::FilterGenes.
-n    off      Flag                       If on, normalize input edges to the range [0,1] before processing.
-t    off      Flag                       If on, output one minus the input's values.
-r    off      Flag                       If on, sort output from least to greatest error; otherwise, sort from greatest to least error.
-o    None     OBO text file              OBO file containing the structure of the Gene Ontology.
-a    None     Annotation text file       Gene Ontology annotation file for the desired organism.
-f    None     SGD features text file     If given, use gene names from the given SGD_features.tab file to label graph nodes.
-m    off      Flag                       If on, memory map the input files when possible. DAT and PCL inputs cannot be memmapped.
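
To make the gene filters and preprocessing flags concrete, here is a small Python sketch of what -g, -G, -n, and -t are described as doing. It works on plain dictionaries rather than DAT/DAB files and does not call Sleipnir. The function names and gene names are hypothetical, and min-max scaling is an assumption, since the description of -n only says that values are normalized to [0,1].

```python
def filter_pairs(scores, include=None, exclude=None):
    """Keep pairs whose genes are both in `include` (like -g) and
    neither of which is in `exclude` (like -G)."""
    kept = {}
    for pair, value in scores.items():
        if include is not None and not pair <= include:
            continue  # at least one gene is missing from the inclusion list
        if exclude is not None and pair & exclude:
            continue  # at least one gene is on the exclusion list
        kept[pair] = value
    return kept


def normalize(scores):
    """Min-max scale values to [0, 1] (assumed behaviour for -n)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pair: 0.0 for pair in scores}
    return {pair: (value - lo) / (hi - lo) for pair, value in scores.items()}


def invert(scores):
    """Replace each value with one minus itself (like -t)."""
    return {pair: 1.0 - value for pair, value in scores.items()}


# Hypothetical gene pairs and scores, for illustration only.
ab = frozenset(("GENE_A", "GENE_B"))
ac = frozenset(("GENE_A", "GENE_C"))
bc = frozenset(("GENE_B", "GENE_C"))
scores = {ab: 2.0, ac: 4.0, bc: 3.0}

scaled = normalize(scores)
distances = invert(scaled)
without_c = filter_pairs(scores, exclude={"GENE_C"})
```

Applying normalize before invert here is a choice made for the example; the order in which Explainer itself applies these steps is not stated in this page.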