Sleipnir
Clinician

Clinician performs multiple correlation tests of a clinical (or any non-expression) variable against genomewide expression values. This can be used to determine transcript correlates of a molecular or clinical phenotype, and HEFalMp/bioPIXIE queries can be used as a pre-screen to mitigate the effects of multiple hypothesis testing.
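The correlation step described above can be illustrated with a small standard-library sketch. It is not Clinician's implementation: the function names (pearson, ranks, spearman, rank_correlates) and the toy data are hypothetical. It only shows what correlating one clinical variable against every expression row looks like, using either Pearson or Spearman correlation.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def rank_correlates(expression, clinical, use_spearman=False):
    """Return (gene, correlation) pairs sorted by absolute correlation."""
    corr = spearman if use_spearman else pearson
    scores = {gene: corr(vals, clinical) for gene, vals in expression.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data (hypothetical): three genes measured in four samples.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 3.0, 2.0, 2.5],
}
clinical = [10.0, 20.0, 30.0, 40.0]
ranked = rank_correlates(expression, clinical)
```

Each gene tested this way is one hypothesis, so a genomewide run performs thousands of tests per clinical variable. That is the burden the optional pre-screen is meant to reduce.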

Usage

Basic Usage

 Clinician -i <data.pcl>

Produce a list of clinical correlates from the specially formatted data.pcl, which should have exactly one non-data column after the initial ID column (in place of the standard NAME/GWEIGHT columns). This column should contain 0 in rows holding standard expression values and 1 in rows holding the clinical variables with which they are to be correlated.
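As a rough illustration of the layout described above, the following sketch splits a tab-delimited file into expression rows (flag 0) and clinical rows (flag 1). Several details are assumptions rather than documented behavior: the header label TYPE, the absence of an EWEIGHT row, the absence of missing values, and the helper name split_clinician_pcl are all hypothetical.

```python
import csv
import io


def split_clinician_pcl(text):
    """Split rows into expression and clinical dicts using the flag column.

    Assumed layout (not confirmed by the documentation beyond the column
    order): ID column, one 0/1 flag column, then one column per sample.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    samples = header[2:]  # everything after the ID and flag columns
    expression, clinical = {}, {}
    for row in reader:
        if not row:
            continue  # tolerate blank lines
        row_id, flag = row[0], row[1]
        values = [float(v) for v in row[2:]]
        if flag == "0":
            expression[row_id] = values
        elif flag == "1":
            clinical[row_id] = values
        else:
            raise ValueError(f"unexpected flag {flag!r} in row {row_id!r}")
    return samples, expression, clinical


# Hypothetical toy file: two genes and one clinical variable, three samples.
rows = [
    ["ID", "TYPE", "S1", "S2", "S3"],
    ["GENE_A", "0", "0.5", "1.2", "-0.3"],
    ["GENE_B", "0", "2.0", "0.1", "0.7"],
    ["AGE", "1", "61", "47", "55"],
]
pcl_text = "\n".join("\t".join(r) for r in rows) + "\n"
samples, expression, clinical = split_clinician_pcl(pcl_text)
```

With a single flag column, the default of 1 for the -s option (columns to skip between the ID column and the first data column) matches this layout.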

 Clinician -i <data.pcl> -I <network.dab>

Produce a list of clinical correlates from data.pcl, formatted as described above, but pre-screen the candidate correlates using the interaction network network.dab. For each clinical variable, a small number of top correlates is pre-selected and used as a HEFalMp/bioPIXIE query into the given interaction network. Only the nearest neighbors returned by this query are then tested for significant clinical correlation, which reduces the number of hypothesis tests that must be corrected for.
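The pre-screening flow can be sketched as follows. This is a simplified stand-in, not the HEFalMp or bioPIXIE algorithm: here a gene's closeness to the query is just the sum of its edge weights to query genes, and excluding the query genes from the returned neighbors is an assumption. The function name prescreen, the toy network, and the Bonferroni-style thresholds are all hypothetical illustrations. The documentation does not state which correction Clinician applies.

```python
def prescreen(correlations, network, initial=100, final=1000):
    """Pick top correlates as a query, then return their nearest neighbors.

    correlations: gene -> correlation with one clinical variable
    network:      (gene_a, gene_b) -> undirected edge weight
    initial:      number of top correlates used as the query (cf. -n)
    final:        number of neighbors kept for testing (cf. -N)
    """
    # Step 1: the strongest correlates (by absolute value) form the query.
    query = sorted(correlations, key=lambda g: abs(correlations[g]),
                   reverse=True)[:initial]
    query_set = set(query)

    # Step 2: score every non-query gene by its summed edge weight to the
    # query (a deliberately simple proxy for a network query).
    scores = {}
    for (a, b), weight in network.items():
        for gene, partner in ((a, b), (b, a)):
            if gene not in query_set and partner in query_set:
                scores[gene] = scores.get(gene, 0.0) + weight

    # Step 3: keep only the best-connected neighbors for significance testing.
    neighbors = sorted(scores, key=lambda g: (-scores[g], g))[:final]
    return query, neighbors


# Hypothetical toy inputs.
correlations = {"G1": 0.9, "G2": -0.8, "G3": 0.1, "G4": 0.05, "G5": 0.2}
network = {
    ("G1", "G3"): 0.7,
    ("G2", "G3"): 0.6,
    ("G1", "G4"): 0.2,
    ("G4", "G5"): 0.9,
}
query, neighbors = prescreen(correlations, network, initial=2, final=1)

# Fewer tests means a less stringent per-test threshold under a
# Bonferroni-style correction (used here only as an illustration).
alpha = 0.05
threshold_all = alpha / len(correlations)
threshold_screened = alpha / len(neighbors)
```

In the toy run, G1 and G2 form the query and G3 is the only neighbor kept, so one test is performed instead of five.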

Detailed Usage

package "Clinician"
version "1.0"
purpose "Calculates significance of clinical variables associated with genomewide expression."

section "Main"
option  "input"             i   "Input PCL file"
                                string  typestr="filename"
option  "global"            I   "Input DAT/DAB file"
                                string  typestr="filename"

section "Miscellaneous"
option  "initial"           n   "Initial correlated neighbor count"
                                int default="100"
option  "final"             N   "Final query result count"
                                int default="1000"
option  "hefalmp"           a   "Perform HEFalMp query instead of bioPIXIE query"
                                flag    on
option  "spearman"          p   "Use Spearman in place of Pearson correlation"
                                flag    off

section "Optional"
option  "skip"              s   "Columns to skip in input PCL"
                                int default="1"
option  "memmap"            m   "Memory map input file"
                                flag    off
option  "verbosity"         v   "Message verbosity"
                                int default="5"

Flag  Default  Type          Description
-i    stdin    PCL file      Input PCL file from which expression and clinical variables are read. Must be formatted with exactly one, rather than the standard two, non-ID columns. Rows containing a 0 in this column are treated as gene expression; rows containing a 1 are treated as clinical variables.
-I    None     DAT/DAB file  If given, input DAT/DAB file used to pre-screen potential clinical correlates.
-n    100      Integer       Number of top correlates used as the HEFalMp/bioPIXIE query during pre-screening.
-N    1000     Integer       Number of nearest neighbors retrieved from the HEFalMp/bioPIXIE query; only these are tested, which reduces multiple hypothesis testing.
-a    On       Flag          Perform a HEFalMp rather than bioPIXIE query (enabled by default).
-p    Off      Flag          If given, use Spearman in place of Pearson correlation.
-s    1        Integer       Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL.
-m    Off      Flag          If given, memory map the input files when possible. DAT and PCL inputs cannot be memmapped.
-v    5        Integer       Message verbosity.
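
For scripted runs, the options above can be assembled into a command line. The sketch below only builds the argument list. It assumes the executable is named Clinician (after the package name in the option listing) and is available on the PATH. The helper name clinician_command and its keyword arguments are hypothetical. The -a flag is left out because the listing only states its default.

```python
import shlex


def clinician_command(pcl, network=None, initial=None, final=None,
                      spearman=False, skip=None, memmap=False):
    """Build an argument list using only flags from the option listing."""
    args = ["Clinician", "-i", pcl]
    if network is not None:
        args += ["-I", network]        # DAT/DAB file used for pre-screening
    if initial is not None:
        args += ["-n", str(initial)]   # size of the initial query
    if final is not None:
        args += ["-N", str(final)]     # number of neighbors to test
    if spearman:
        args.append("-p")              # Spearman instead of Pearson
    if skip is not None:
        args += ["-s", str(skip)]      # columns between ID and data
    if memmap:
        args.append("-m")              # memory map inputs when possible
    return args


cmd = clinician_command("data.pcl", network="network.dab",
                        initial=50, final=500, spearman=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the binary is installed.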