Sleipnir: PCLPlotter

PCLPlotter produces summary statistics from a PCL file and is optimized to show the mean expression values for subsets (biclusters) of genes and conditions. It can also provide accompanying bicluster statistics for associated FASTA files containing gene sequences.

Usage

Basic Usage

 PCLPlotter -i <data.pcl>

Produce a summary of the mean and standard deviation of expression for each condition in data.pcl. If a bicluster is present, mark its member genes with an initial * in their NAME (not ID) column and its member conditions with an initial * in their labels.
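As an illustration of the marking convention, the following Python sketch reads a small PCL file and reports which genes and conditions carry the `*` marker. It is not part of PCLPlotter. The sample data, the function name `find_bicluster`, and the assumed layout (tab-delimited, with ID, NAME, and GWEIGHT columns before the data) are all hypothetical.

```python
import io

# Hypothetical three-gene, three-condition PCL file. The tab-delimited layout
# with ID, NAME, and GWEIGHT columns before the data is an assumption.
PCL_TEXT = (
    "ID\tNAME\tGWEIGHT\t*heat\tcold\t*salt\n"
    "G1\t*gene one\t1\t1.0\t0.1\t2.0\n"
    "G2\tgene two\t1\t0.5\t0.2\t0.4\n"
    "G3\t*gene three\t1\t3.0\t0.3\t4.0\n"
)

def find_bicluster(handle, skip=2):
    """Return (member gene IDs, member condition labels) marked with '*'."""
    header = handle.readline().rstrip("\n").split("\t")
    first_data = 1 + skip  # the ID column plus the skipped columns
    conditions = [c[1:] for c in header[first_data:] if c.startswith("*")]
    genes = []
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # The marker is read from the NAME column (index 1), not the ID column.
        if len(fields) > 1 and fields[1].startswith("*"):
            genes.append(fields[0])
    return genes, conditions

genes, conditions = find_bicluster(io.StringIO(PCL_TEXT))
print(genes, conditions)
```

Here G1 and G3 are bicluster members because their NAME entries start with `*`, and heat and salt are member conditions because their labels do.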

 PCLPlotter -i <cluster.pcl> -b <genome.pcl>

Produce a summary of the mean and standard deviation of expression for each condition in cluster.pcl and genome.pcl, which must contain the same conditions. All genes in cluster.pcl are considered members of the bicluster; mark member conditions with an initial * in their labels.
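Because the two files must contain the same conditions, it can help to compare their header rows before running the tool. The Python sketch below does this under two assumptions: headers are tab-delimited with two columns between ID and the first data column, and a leading `*` is only a bicluster marker and should be ignored when comparing. The helper names are hypothetical, and the sketch also requires identical column order, which may be stricter than PCLPlotter itself.

```python
def condition_labels(header_line, skip=2):
    """Return the condition labels of a PCL header row without '*' markers."""
    fields = header_line.rstrip("\n").split("\t")
    labels = fields[1 + skip:]  # drop the ID column and the skipped columns
    return [label[1:] if label.startswith("*") else label for label in labels]

def same_conditions(cluster_header, genome_header, skip=2):
    """True if both header rows list the same conditions in the same order."""
    return condition_labels(cluster_header, skip) == condition_labels(genome_header, skip)

# Hypothetical header rows for cluster.pcl and genome.pcl.
cluster_header = "ID\tNAME\tGWEIGHT\t*heat\tcold\n"
genome_header = "ID\tNAME\tGWEIGHT\theat\tcold\n"
print(same_conditions(cluster_header, genome_header))
```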

 PCLPlotter -i <data.pcl> -g <genes.txt>

Produce a summary of the mean and standard deviation of expression for each condition in data.pcl. Genes listed in genes.txt are considered members of the bicluster, and all other genes in data.pcl are considered outside it; mark member conditions with an initial * in their labels.
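The kind of summary described here can be sketched in Python as follows. This is an illustration rather than PCLPlotter's implementation: the function `summarize`, the sample values, and the one-ID-per-line gene list format are assumptions, missing values are not handled, and the sample standard deviation is used without knowing which estimator the tool applies.

```python
import statistics

def summarize(rows, members):
    """Per-condition (mean, stdev) for bicluster members and for other genes.

    rows maps gene ID to a list of expression values, one per condition;
    members is the set of gene IDs in the bicluster.
    """
    def column_stats(ids):
        columns = zip(*(rows[g] for g in ids))  # transpose genes to conditions
        return [(statistics.mean(c), statistics.stdev(c)) for c in columns]

    inside = [g for g in rows if g in members]
    outside = [g for g in rows if g not in members]
    return column_stats(inside), column_stats(outside)

# Hypothetical gene list contents, assuming one gene ID per line.
GENES_TXT = "G1\nG2\n"
members = {line.strip() for line in GENES_TXT.splitlines() if line.strip()}

# Hypothetical expression values for four genes under two conditions.
rows = {
    "G1": [1.0, 2.0],
    "G2": [3.0, 4.0],
    "G3": [0.0, 0.0],
    "G4": [2.0, 4.0],
}
in_stats, out_stats = summarize(rows, members)
print(in_stats)
print(out_stats)
```

Each group needs at least two genes, since the sample standard deviation is undefined for a single value.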

 PCLPlotter -i <data.pcl> -f <data.fasta>

Produce a summary of the mean and standard deviation of expression for each condition in data.pcl, as well as an HMM summarizing the characteristics of the sequences in data.fasta. If a bicluster is present, mark its member genes with an initial * in their NAME (not ID) column and its member conditions with an initial * in their labels.

Detailed Usage

package "PCLPlotter"
version "1.0"
purpose "Plots summary information for PCL files/clusters."

section "Main"
option  "input"             i   "Input PCL file"
                                string  typestr="filename"
option  "fasta"             f   "Gene sequence file"
                                string  typestr="filename"

defgroup "Foreground_Background"
groupoption "background"    b   "Background PCL file"
                                string  typestr="filename"  group="Foreground_Background"
groupoption "genes"         g   "Foreground gene list"
                                string  typestr="filename"  group="Foreground_Background"
groupoption "motifs"        m   "Known motif list"
                                string  typestr="filename"  group="Foreground_Background"

section "Optional"
option  "k"                 k   "Length of motif words"
                                int default="7"
option  "degree"            d   "Degree of HMM for sequence summary"
                                int default="0"
option  "skip"              s   "Columns to skip in input PCL"
                                int default="2"
option  "verbosity"         v   "Message verbosity"
                                int default="5"

Flag	Default	Type	Description
-i	stdin	PCL file	Input PCL file from which bicluster summary information is extracted. In the absence of `-b` or `-g` options, genes and conditions in the bicluster should be marked with a `*` at the beginning of their NAME and label, respectively.
-f	None	FASTA file	If given, input FASTA sequence file from which cluster sequence summary information is extracted.
-b	None	PCL file	If given, input PCL file from which non-bicluster summary information is extracted; all genes in `-i` are considered to be in the bicluster, and conditions should be marked with a `*`. PCL files for `-i` and `-b` should contain exactly the same conditions.
-g	None	Text gene list	If given, input text file from which biclustered genes are read; other genes in `-i` are considered to be out of the bicluster.
-m	None	Text motif list	If given, input text file from which known motifs are read. In conjunction with `-f`, frequencies of each motif in gene sequences in and out of the bicluster will be provided.
-k	7	Integer (base pairs)	Default number of base pairs per motif word; largely unrelated to the contents of `-m`.
-d	0	Integer	Degree of HMM used to provide summary statistics of sequences given in `-f`.
-s	2	Integer	Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL.
-v	5	Integer	Message verbosity.
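
To make the in-cluster versus out-of-cluster motif comparison concrete, the Python sketch below counts motif occurrences in sequences inside and outside a bicluster. It is only an illustration: PCLPlotter's exact definition of motif frequency is not stated here, so the sketch assumes "mean number of (possibly overlapping) exact matches per sequence". The function names and sample sequences are hypothetical.

```python
def count_overlapping(sequence, motif):
    """Count exact occurrences of motif in sequence, allowing overlaps."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        start = sequence.find(motif, start + 1)
    return count

def motif_frequencies(sequences, members, motifs):
    """Map each motif to (mean count per member sequence, mean count per other sequence)."""
    def mean_count(ids, motif):
        counts = [count_overlapping(sequences[g], motif) for g in ids]
        return sum(counts) / len(counts) if counts else 0.0

    inside = [g for g in sequences if g in members]
    outside = [g for g in sequences if g not in members]
    return {m: (mean_count(inside, m), mean_count(outside, m)) for m in motifs}

# Hypothetical sequences keyed by gene ID, with G1 as the only bicluster member.
sequences = {"G1": "AAAA", "G2": "ACGT", "G3": "TTTT"}
freqs = motif_frequencies(sequences, {"G1"}, ["AA", "CG"])
print(freqs)
```

A real analysis would also need to decide how to treat reverse complements and degenerate motif symbols, which this sketch ignores.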