Sleipnir
Dat2Graph

Dat2Graph converts an input DAT/DAB into a graph representation in one of several output formats. Optionally, the DAT/DAB can also be normalized, filtered, or queried using the bioPIXIE or HEFalMp algorithms to inspect the neighborhood around a given gene set.

Usage

Basic Usage

 Dat2Graph -i <data.dab> -e <cutoff>

Output (to standard output) a DOT file containing all edges in data.dab with weights greater than cutoff, along with the nodes incident to those edges.
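The effect of -e can be pictured with a short Python sketch. It is not Dat2Graph's implementation. It assumes the text DAT form of the input is tab-delimited with one gene1, gene2, weight triple per line, and that an edge exactly at the cutoff is dropped. The function names filter_edges and to_dot are hypothetical, and the DOT written by Dat2Graph itself may carry more attributes than this minimal rendering.

```python
def filter_edges(dat_lines, cutoff):
    """Keep (gene1, gene2, weight) triples whose weight exceeds cutoff."""
    edges = []
    for line in dat_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        gene_a, gene_b, weight = fields[0], fields[1], float(fields[2])
        if weight > cutoff:
            edges.append((gene_a, gene_b, weight))
    return edges


def to_dot(edges):
    """Render the surviving edges as a minimal undirected DOT graph.

    Only nodes that still have an incident edge appear, because DOT
    declares nodes implicitly through the edges that mention them.
    """
    lines = ["graph G {"]
    for gene_a, gene_b, weight in edges:
        lines.append(f'  "{gene_a}" -- "{gene_b}" [weight={weight}];')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = ["YAL001C\tYAL002W\t0.9\n", "YAL001C\tYAL003W\t0.2\n"]
    print(to_dot(filter_edges(sample, 0.5)))
```

With a cutoff of 0.5, only the first sample edge survives, so YAL003W does not appear in the output at all.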

 Dat2Graph -i <data.dab> -q <genes.txt> [-l <colors.txt>] [-b <borders.txt>]

Output a DOT file containing the subgraph of data.dab produced by using the gene list in genes.txt as a HEFalMp query. Nodes can optionally be colored using the [0,1] values from colors.txt, and can optionally be given border widths using the pixel counts in borders.txt.
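Both colors.txt and borders.txt are plain text files with one floating point value per line and exactly one line per node in the output graph. A minimal Python sketch for producing such files is below. The helper name write_values is hypothetical, and the sketch assumes the caller already knows how many nodes the output graph has and supplies the values in the node order Dat2Graph expects, which this page does not specify.

```python
import os
import tempfile


def write_values(path, values, low=None, high=None):
    """Write one floating point value per line, after an optional range check."""
    checked = [float(v) for v in values]
    for v in checked:
        if (low is not None and v < low) or (high is not None and v > high):
            raise ValueError(f"value {v} outside [{low}, {high}]")
    # Validate everything first so a bad value never leaves a partial file.
    with open(path, "w") as handle:
        for v in checked:
            handle.write(f"{v}\n")


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    # Colors must lie in [0, 1]; border widths are pixel counts.
    write_values(os.path.join(out_dir, "colors.txt"), [0.0, 0.5, 1.0], low=0.0, high=1.0)
    write_values(os.path.join(out_dir, "borders.txt"), [1, 1, 3], low=0.0)
    print(out_dir)
```

The resulting files can then be passed with -l and -b respectively.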

 Dat2Graph -i <data.dab> -t dat -q <genes.txt> -a

Output a DAT file containing the subgraph of data.dab produced by using the gene list in genes.txt as a bioPIXIE query. The -a flag is on by default (HEFalMp), so supplying it toggles the query type to bioPIXIE.
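When several of these invocations are scripted, the argument list can be assembled programmatically. The Python sketch below uses only flags listed on this page (-i, -t, -q, -e, -a). The helper names build_command and run_to_file are hypothetical, and run_to_file assumes that a Dat2Graph binary is on the PATH and that the result is written to standard output, as in the first example above.

```python
import subprocess


def build_command(dab, fmt=None, query=None, cutoff=None, toggle_hefalmp=False):
    """Assemble a Dat2Graph argument list from the options documented here."""
    cmd = ["Dat2Graph", "-i", dab]
    if fmt is not None:
        cmd += ["-t", fmt]
    if query is not None:
        cmd += ["-q", query]
    if cutoff is not None:
        cmd += ["-e", str(cutoff)]
    if toggle_hefalmp:
        # -a is on by default, so passing it requests a bioPIXIE query.
        cmd.append("-a")
    return cmd


def run_to_file(cmd, out_path):
    """Run the command and capture its standard output in out_path."""
    with open(out_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


if __name__ == "__main__":
    # Mirrors the third basic usage example; printing avoids needing the binary.
    print(build_command("data.dab", fmt="dat", query="genes.txt", toggle_hefalmp=True))
```

Passing the command as a list rather than a single string avoids shell quoting problems with unusual file names.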

Detailed Usage

package "Dat2Graph"
version "1.0"
purpose "Text/binary data file graph output"

section "Main"
option  "input"     i   "Input DAT/DAB file"
                        string  typestr="filename"
option  "format"    t   "Output graph format"
                        values="dot","gdf","net","matisse","list","dat","correl"    default="dot"

section "Graph Queries"
option  "geneq"     q   "Query inclusion file"
                        string  typestr="filename"
option  "genew"     Q   "Query weights file"
                        string  typestr="filename"
option  "neighbors" k   "Size of query neighborhood"
                        int default="-1"
option  "hefalmp"   a   "Perform HEFalMp query instead of bioPIXIE query"
                        flag    on
option  "edges"     d   "Aggressiveness of edge trimming after query"
                        double  default="1"
option  "hubs"      H   "Number of neighbors to query hubs"
                        int default="-1"

section "Filtering"
option  "cutoff"    e   "Minimum edge weight for output"
                        double
option  "genes"     g   "Gene inclusion file"
                        string  typestr="filename"
option  "genex"     G   "Gene exclusion file"
                        string  typestr="filename"
option  "knowns"    w   "Known interactions (DAT/DAB) to ignore"
                        string  typestr="filename"

section "Annotation"
option  "features"  f   "SGD gene features"
                        string  typestr="filename"
option  "colors"    l   "Colors for graph nodes"
                        string  typestr="filename"
option  "borders"   b   "Borders for graph nodes"
                        string  typestr="filename"

section "Optional"
option  "normalize" n   "Normalize edge weights before processing"
                        flag    off
option  "absolute"  A   "Use absolute value of edge weights"
                        flag    off
option  "memmap"    m   "Memory map input file"
                        flag    off
option  "config"    c   "Command line config file"
                        string  typestr="filename"  default="Dat2Graph.ini"
option  "verbosity" v   "Message verbosity"
                        int default="5"

| Flag | Default | Type | Description |
|------|---------|------|-------------|
| -i | stdin | DAT/DAB file | Input DAT/DAB file. |
| -t | dot | dot, gdf, net, matisse, list, dat, or correl | Output format. dot produces a Graphviz DOT, gdf a GUESS GDF, net a NET file, matisse a MATISSE file, list a list of genes that would be in the output graph, dat a DAT of edges that would be in the output graph, and correl outputs the correlation between each gene's average edge vector to the query and that gene's edge vector to all genes. |
| -q | None | Gene text file | If given, output graph is generated by performing a bioPIXIE or HEFalMp query against the input DAT/DAB using the requested gene set. |
| -k | -1 | Integer | Number of neighbor genes to be included with the query in a bioPIXIE or HEFalMp result. If -1, the algorithm's default value is used. |
| -a | on | Flag | If on, perform a HEFalMp ratio query; if off, perform a bioPIXIE maximum sum query. |
| -d | 1 | Double | Dictates how aggressively edges are removed from the results of a bioPIXIE or HEFalMp query. Larger values remove more edges, smaller values (can be negative) remove fewer. |
| -e | None | Double | If given, remove all input edges below the given cutoff (after optional normalization). |
| -g | None | Text gene list | If given, use only gene pairs for which both genes are in the list. For details, see Sleipnir::CDat::FilterGenes. |
| -w | None | DAT/DAB file | If given, ignore all edges present in the given DAT/DAB file during graph generation. |
| -f | None | SGD features text file | If given, use gene names from the given SGD_features.tab file to label graph nodes. |
| -l | None | Text file | If given, text file containing one floating point value per line; must contain exactly one line per node in the output graph. These values are used to scale between cyan (0), white (0.5), and yellow (1) node colors. |
| -b | None | Text file | If given, text file containing one floating point value per line; must contain exactly one line per node in the output graph. These values are used to determine the border width (in pixels) of each node. |
| -n | off | Flag | If on, normalize input edges to the range [0,1] before processing. |
| -m | off | Flag | If given, memory map the input files when possible. DAT and PCL inputs cannot be memmapped. |
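
Two of the entries above, -n and -l, describe simple numeric mappings that a short Python sketch can make concrete. Both functions are assumptions rather than Dat2Graph's actual code. min_max_normalize takes "normalize to the range [0,1]" to mean min-max scaling, and value_to_color takes the cyan/white/yellow scale to be a linear interpolation in RGB between the three anchor colors. The function names are hypothetical.

```python
def min_max_normalize(weights):
    """Rescale weights linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(weights), max(weights)
    if hi == lo:
        return [0.0 for _ in weights]  # degenerate case: all weights equal
    return [(w - lo) / (hi - lo) for w in weights]


def value_to_color(value):
    """Map a value in [0, 1] to a hex color: cyan (0), white (0.5), yellow (1)."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("color values must lie in [0, 1]")
    if value <= 0.5:
        # Cyan (0, 255, 255) to white: only the red channel rises.
        red, blue = round(255 * value / 0.5), 255
    else:
        # White to yellow (255, 255, 0): only the blue channel falls.
        red, blue = 255, round(255 * (1.0 - (value - 0.5) / 0.5))
    return f"#{red:02x}ff{blue:02x}"


if __name__ == "__main__":
    for v in min_max_normalize([2.0, 4.0, 6.0]):
        print(v, value_to_color(v))
```

Under these assumptions the green channel stays at full intensity across the whole scale, which is why the midpoint is white rather than a muddy green.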