Sleipnir
Normalizer

Normalizer performs simple normalization of data files. For PCL files (Sleipnir::CPCL::Normalize), it transforms each gene's expression vector to have mean zero and standard deviation one. For DAT/DAB files (Sleipnir::CDat::Normalize), it transforms all values to either z-scores or the range [0, 1].

Usage

Basic Usage

 Normalizer -i <data.dab> -o <normalized.dab> -t dat

Normalize the given data.dab file by z-scoring all values (subtracting the mean and dividing by the standard deviation) and store the result in normalized.dab.
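The two DAT/DAB transformations described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Sleipnir's implementation: the function names `zscore_all` and `scale_0to1` are hypothetical, the data is treated as a flat list of values, and the population standard deviation is assumed (the tool may use a different estimator and handles missing values, which this sketch does not).

```python
from statistics import mean, pstdev


def zscore_all(values):
    """Z-score every value against the global mean and standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All values are identical, so there is no spread to scale by.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def scale_0to1(values):
    """Linearly rescale all values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


if __name__ == "__main__":
    print(zscore_all([1.0, 2.0, 3.0]))
    print(scale_0to1([2.0, 4.0, 6.0]))
```

After `zscore_all`, the full set of values has mean zero and standard deviation one; after `scale_0to1`, the smallest value is 0 and the largest is 1.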

 Normalizer -i <data.pcl> -o <normalized.pcl> -t pcl

Normalize the given data.pcl file by z-scoring each row independently (so that each row has mean zero and standard deviation one) and store the result in normalized.pcl.
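Row-wise normalization differs from the DAT/DAB case only in that the mean and standard deviation are computed per gene rather than over the whole file. A minimal Python sketch follows, assuming each row is a list of floats with no missing values and assuming the population standard deviation; `zscore_rows` is a hypothetical name, not part of Sleipnir.

```python
from statistics import mean, pstdev


def zscore_rows(rows):
    """Z-score each row (gene) independently of the others."""
    normalized = []
    for row in rows:
        mu = mean(row)
        sigma = pstdev(row)  # assumption: population standard deviation
        if sigma == 0:
            # A constant row has no spread; map it to all zeros.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([(v - mu) / sigma for v in row])
    return normalized


if __name__ == "__main__":
    # Two genes on very different scales end up directly comparable.
    for row in zscore_rows([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]):
        print(row)
```

Because each row is scaled by its own statistics, genes with very different expression ranges end up on a common scale.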

Detailed Usage

package "Normalizer"
version "1.0"
purpose "Data file normalizer."

section "Main"
option  "input"     i   "Input/output PCL/DAT/DAB file"
                        string  typestr="filename"
option  "output"    o   "Output PCL/DAB file"
                        string  typestr="filename"
option  "itype"     t   "Data file type"
                        values="pcl","dat"  default="dat"
option  "otype"     T   "Normalization type"
                        values="columnz","rowz","globalz","column0","0to1","colcenter","medmult","colfrac","sigmoid","normcdf","pcc"    default="globalz"

section "Optional"
option  "flip"      f   "Flip high/low scores"
                        flag    off
option  "skip"      s   "Columns to skip in input PCL"
                        int default="2"
option  "verbosity" v   "Message verbosity"
                        int default="5"
| Flag | Default | Type | Description |
|------|---------|------|-------------|
| -i | stdin | PCL or DAT/DAB file | Input data file to be normalized, PCLs by row (mean zero/stdev one) and DAT/DABs to [0,1] or z-scores. |
| -o | stdout | PCL or DAT/DAB file | Output normalized data file. |
| -t | dat | dat or pcl | Type of data file to be normalized. |
| -f | off | Flag | If on, output one minus the input's values. |
| -z | off | Flag | If on, normalize input edges to z-scores (subtract mean, divide by standard deviation) before processing; otherwise, normalize to the range [0,1]. |
| -s | 2 | Integer | Number of columns to skip between the initial ID column and the first experimental (data) column in the input PCL. |
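To make the -s and -f descriptions concrete, the Python sketch below shows how a skip count of 2 separates a PCL row into its ID, its skipped annotation columns, and its data columns, and what "one minus the input's values" means. It assumes tab-delimited PCL rows; the helper names `split_pcl_row` and `flip` and the example row are hypothetical and do not come from Sleipnir.

```python
def split_pcl_row(line, skip=2):
    """Split one tab-delimited PCL row into (ID, skipped columns, data values)."""
    fields = line.rstrip("\n").split("\t")
    gene_id = fields[0]
    skipped = fields[1:1 + skip]  # e.g. name and weight columns when skip=2
    # Empty cells are treated as missing values (assumption for this sketch).
    data = [float(x) if x else float("nan") for x in fields[1 + skip:]]
    return gene_id, skipped, data


def flip(values):
    """Return one minus each input value, as described for the -f flag."""
    return [1.0 - v for v in values]


if __name__ == "__main__":
    gene_id, skipped, data = split_pcl_row("YAL001C\tTFC3\t1\t0.25\t0.75")
    print(gene_id, skipped, data)
    print(flip(data))
```

With skip set to 0, every column after the ID would be read as data, which is why the value must match the layout of the input file.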