Sleipnir
SVMer learns and evaluates support vector machine (SVM) models for gene pairs from PCL or DAT/DAB datasets. Given PCL inputs, SVMer constructs one example per gene pair by concatenating the two genes' expression vectors to form the feature vector. Given DAT/DAB inputs, SVMer constructs one example per gene pair, using each dataset as one feature. In genewise mode, SVMer learns one model per gene, with one example constructed for each pair in which that gene participates.
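As a rough illustration of the feature layouts described above, the following sketch builds examples for gene pairs from toy data. It is not SVMer's implementation: the function names, the in-memory dictionaries, the order in which the two vectors are concatenated, and the treatment of missing values (here passed through as `None`) are all assumptions made for illustration.

```python
from itertools import combinations


def pcl_example(expression, gene_a, gene_b):
    """PCL mode: concatenate the two genes' expression vectors."""
    return list(expression[gene_a]) + list(expression[gene_b])


def dab_example(datasets, gene_a, gene_b):
    """DAT/DAB mode: one feature per dataset for the gene pair."""
    pair = frozenset((gene_a, gene_b))
    # A pair absent from a dataset yields None (assumed placeholder).
    return [dataset.get(pair) for dataset in datasets]


def genewise_examples(genes, gene):
    """Genewise mode: the pairs in which `gene` participates."""
    return [(gene, other) for other in genes if other != gene]


# Toy data with hypothetical gene names and values.
expression = {"G1": [0.1, 0.2], "G2": [1.0, -1.0], "G3": [0.5, 0.5]}
datasets = [
    {frozenset(("G1", "G2")): 0.9, frozenset(("G1", "G3")): 0.1},
    {frozenset(("G1", "G2")): 0.4},
]

pairs = list(combinations(sorted(expression), 2))
pcl_features = {p: pcl_example(expression, *p) for p in pairs}
dab_features = {p: dab_example(datasets, *p) for p in pairs}
```

In this toy layout, a PCL example has twice as many features as there are conditions, while a DAT/DAB example has exactly as many features as there are input datasets.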
SVMer -i <answers.dab> -m <learned.svm> <data.pcl/dab>*
Learn an SVM model learned.svm for gene pairs using labels from answers.dab and data from data.pcl (for features built from PCL conditions) or data.dab (for feature values drawn from DAT/DAB files).
SVMer -m <learned.svm> -o <predictions.dab> <data.pcl/dab>*
Using the SVM model in learned.svm, predict labels for gene pairs using data from data.pcl or data.dab and store the resulting predicted functional interaction network in predictions.dab.
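The two invocations above differ only in whether `-i` or `-o` is supplied. A minimal sketch of assembling them from a script is shown below; the helper names and the data file names are hypothetical, and only the flags documented on this page are used.

```python
import shlex


def learn_command(answers, model, data_files):
    """Argument list for learning: -i (answers) and -m (model output)."""
    return ["SVMer", "-i", answers, "-m", model, *data_files]


def predict_command(model, predictions, data_files):
    """Argument list for evaluation: -m (model input) and -o (predictions)."""
    return ["SVMer", "-m", model, "-o", predictions, *data_files]


# Hypothetical file names, for illustration only.
data = ["data1.pcl", "data2.pcl"]
learn = learn_command("answers.dab", "learned.svm", data)
predict = predict_command("learned.svm", "predictions.dab", data)

# Print the shell-quoted command lines without running them.
print(shlex.join(learn))
print(shlex.join(predict))
```

Either list could be passed to `subprocess.run(..., check=True)` if the SVMer binary is on the path; the sketch only prints the commands so that it runs without Sleipnir installed.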
package "SVMer"
version "1.0"
purpose "SVM training and evaluation"
section "Main"
option "input" i "Input answer DAT/DAB file"
string typestr="filename"
option "output" o "Output prediction DAT/DAB file"
string typestr="filename"
option "model" m "SVM model file or directory"
string typestr="filename/directory"
section "Feature Mode"
option "pcl" p "PCL input mode"
flag on
option "binary" b "Input binary training file"
string typestr="filename"
option "genewise" w "Learn per-gene SVMs for pairwise predictions"
flag off
option "genel" l "Gene skip file for per-gene SVMs"
string typestr="filename"
section "Learning/Evaluation"
option "genes" g "Gene inclusion file"
string typestr="filename"
option "genex" G "Gene exclusion file"
string typestr="filename"
option "genet" c "Term inclusion file"
string typestr="filename"
section "SVM"
option "kernel" k "SVM kernel function"
values="linear","poly","rbf" default="linear"
option "cache" e "SVM cache size"
int default="40"
option "tradeoff" C "Classification tradeoff"
float
option "gamma" M "RBF gamma"
float default="1"
option "degree" d "Polynomial degree"
int default="3"
option "alphas" a "SVM alphas file"
string typestr="filename"
option "iterations" t "SVM iterations"
int default="100000"
section "Optional"
option "skip" s "Columns to skip in input PCLs"
int default="2"
option "random" r "Seed random generator"
int default="0"
option "verbosity" v "Message verbosity"
int default="5"
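For readers unfamiliar with the three kernel choices, the sketch below writes out textbook forms of the linear, polynomial, and RBF kernels, with `degree` and `gamma` playing the roles of the options listed above. This reflects the general definitions only and is an assumption about SVMer's behavior: the polynomial kernel here uses an offset of 1 (the hypothetical `coef0` parameter), and the scaling and offset terms applied by the underlying SVM library may differ.

```python
import math


def linear_kernel(a, b):
    """Dot product of two feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def poly_kernel(a, b, degree=3, coef0=1.0):
    """Polynomial kernel (a.b + coef0)^degree; coef0 is an assumed offset."""
    return (linear_kernel(a, b) + coef0) ** degree


def rbf_kernel(a, b, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)
```

The default arguments mirror the defaults in the listing (degree 3, gamma 1); a larger gamma makes the RBF kernel fall off more quickly with distance.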
| Flag | Default | Type | Description |
|---|---|---|---|
| None | None | PCL or DAT/DAB files | Input data files from which features are constructed, either PCLs from which expression vectors are concatenated or DAT/DABs from which pairwise values are read. |
| -i | stdin | DAT/DAB file | If given, functional gold standard for learning. Should consist of gene pairs with scores of 0 (unrelated), 1 (related), or missing (NaN). If not given, evaluation is assumed and SVM model(s) is/are read from -m. |
| -o | stdout | DAT/DAB file | Output predictions from the SVM model(s) for each available gene pair. |
| -m | None | SVM model file or directory | In standard mode, output learned SVM model file (if -i is given) or input SVM model file to be evaluated (if it is not). In genewise mode, directory containing output learned or input evaluated SVM model files. |
| -p | on | Flag | If on, assume input files are PCLs from which features are constructed by concatenation of expression vectors. If off, assume input files are DAT/DABs from which one feature is drawn per dataset for each gene pair example. |
| -b | None | Binary feature file | If given, ignore other inputs and assume the given binary file is to be used for model evaluation (if -o is specified) or learning (if it is not). |
| -w | off | Flag | If on, learn/evaluate one SVM model per gene, using only the gene pairs including that gene (and thus each example represents one other gene). If off, learn/evaluate one global SVM model in which each example represents a gene pair. |
| -l | None | Gene text file | If given, in genewise mode, learn/evaluate models only for genes in the given gene set. |
| -g | None | Text gene list | If given, use only gene pairs for which both genes are in the list. For details, see Sleipnir::CDat::FilterGenes. |
| -G | None | Text gene list | If given, use only gene pairs for which neither gene is in the list. For details, see Sleipnir::CDat::FilterGenes. |
| -c | None | Text gene list | If given, use only gene pairs passing a "term" filter against the list. For details, see Sleipnir::CDat::FilterGenes. |
| -k | linear | linear, poly, or rbf | SVM kernel type: linear, polynomial, or radial basis function. |
| -e | 40 | Integer (MB) | SVM cache size in megabytes. |
| -C | None | Float | SVM tradeoff between misclassification and margin; an appropriate default is calculated if no value is given. |
| -M | 1 | Float | Gamma parameter for RBF kernel. |
| -d | 3 | Integer | Degree parameter for polynomial kernel. |
| -a | None | Alphas file | If given, SVM Light alphas file used to initialize the SVM model. |
| -t | 100000 | Integer | Maximum number of iterations to run per SVM learning epoch. |
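| -s | 2 | Integer | Number of columns to skip in any PCL data files between the initial ID column and the experimental data columns. Must be the same number for all PCL files. |
| -r | 0 | Integer | Seed for the random number generator. |
| -v | 5 | Integer | Message verbosity level. |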
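The gene inclusion (-g) and exclusion (-G) filters in the table can be pictured with the following sketch over a toy gold standard. It is an illustration only, not the behavior of Sleipnir::CDat::FilterGenes itself: the function name and the dictionary representation are hypothetical, dropping pairs with missing (NaN) labels is an assumption, and the term filter (-c) is omitted because its semantics are not described here.

```python
import math


def filter_pairs(answers, include=None, exclude=None):
    """Keep labeled pairs that pass the inclusion and exclusion lists."""
    kept = {}
    for (gene_a, gene_b), score in answers.items():
        if math.isnan(score):
            continue  # missing label (assumed unusable for learning)
        if include is not None and not (gene_a in include and gene_b in include):
            continue  # -g: both genes must be in the list
        if exclude is not None and (gene_a in exclude or gene_b in exclude):
            continue  # -G: neither gene may be in the list
        kept[(gene_a, gene_b)] = score
    return kept


# Toy gold standard: 1 = related, 0 = unrelated, NaN = missing.
answers = {
    ("G1", "G2"): 1.0,
    ("G1", "G3"): 0.0,
    ("G2", "G3"): float("nan"),
    ("G3", "G4"): 1.0,
}

included = filter_pairs(answers, include={"G1", "G2", "G3"})
excluded = filter_pairs(answers, exclude={"G1"})
```

Here `included` keeps only the labeled pairs drawn entirely from G1, G2, and G3, while `excluded` keeps only labeled pairs that do not involve G1.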