Sleipnir
LibSVMer performs SVM learning using the LibSVM library. It supports cross validation and reading from binary PCL files created by PCL2Bin.
LibSVMer -l <labels_file> -p <params_file> -i <data.bin> -o <output_directory> -a
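The invocation above can also be assembled programmatically. The following is a minimal sketch that only builds the argument list from the flags shown in the usage line; the helper name `libsvmer_argv` and all file names are hypothetical placeholders, not part of Sleipnir.

```python
import shlex


def libsvmer_argv(labels, params, data, out_dir, classify_all=True):
    """Build the LibSVMer argument list using the flags from the usage line."""
    argv = ["LibSVMer", "-l", labels, "-p", params, "-i", data, "-o", out_dir]
    if classify_all:
        # -a asks for predictions on all genes in the input.
        argv.append("-a")
    return argv


# Hypothetical file names, for illustration only.
argv = libsvmer_argv("labels.txt", "params.txt", "data.bin", "results")
print(shlex.join(argv))
```

The list can be handed to `subprocess.run(argv, check=True)` once LibSVMer is on the `PATH`; passing a list rather than a shell string avoids quoting problems with unusual file names.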
The labels file has the following format. Note that in all of the formats below the delimiters are tabs; doxygen converts them to spaces automatically.
ACTA2	-1
ACTN4	1
ADAM10	-1
AGRN	1
AGTR1	-1
ALDOB	-1
ALOX12	1
ANGPT2	1
APOA4	1
AQP1	1
where -1 indicates negative and 1 indicates positive. The examples must be separated with tabs.
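As a rough illustration of the labels format, the sketch below parses tab-delimited name/label pairs and rejects anything other than -1 or 1. It assumes one example per line; the function name `read_labels` is hypothetical and this is not the parser LibSVMer itself uses.

```python
import io


def read_labels(handle):
    """Parse 'name<TAB>label' lines into a dict mapping name -> -1 or 1."""
    labels = {}
    for line_number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(
                f"line {line_number}: expected 2 tab-separated fields, got {len(fields)}"
            )
        name, value = fields[0], int(fields[1])
        if value not in (-1, 1):
            raise ValueError(f"line {line_number}: label must be -1 or 1, got {value}")
        labels[name] = value
    return labels


# First three examples from the sample above.
sample = "ACTA2\t-1\nACTN4\t1\nADAM10\t-1\n"
labels = read_labels(io.StringIO(sample))
```

Splitting on the tab character only (rather than on any whitespace) catches files where an editor has silently replaced tabs with spaces.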
Output is of the format
IGHV1-69	0	1.94073
DAG1	1	1.9401
FNDC3B	0	1.93543
HPGD	-1	1.93181
TPSAB1	0	1.92928
CLIC5	1	1.92759
where the first column is the example name, the second column is the gold standard status (matching the labels file), and the third column is the prediction from the SVM.
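To show how the three output columns might be consumed, here is a minimal sketch that reads prediction rows and computes a pairwise AUC over the labeled examples. Treating a gold standard status of 0 as "not in the labels file" is an assumption made for this example, and the names `Prediction`, `read_predictions`, and `auc` are hypothetical.

```python
import io
from typing import NamedTuple


class Prediction(NamedTuple):
    name: str
    gold: int      # gold standard status from the second column
    score: float   # SVM prediction from the third column


def read_predictions(handle):
    """Parse 'name<TAB>gold<TAB>score' lines into Prediction rows."""
    rows = []
    for raw in handle:
        line = raw.rstrip("\n")
        if not line.strip():
            continue
        name, gold, score = line.split("\t")
        rows.append(Prediction(name, int(gold), float(score)))
    return rows


def auc(rows):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [r.score for r in rows if r.gold == 1]
    neg = [r.score for r in rows if r.gold == -1]
    if not pos or not neg:
        return None  # undefined without both classes
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


sample = (
    "IGHV1-69\t0\t1.94073\nDAG1\t1\t1.9401\nFNDC3B\t0\t1.93543\n"
    "HPGD\t-1\t1.93181\nTPSAB1\t0\t1.92928\nCLIC5\t1\t1.92759\n"
)
rows = read_predictions(io.StringIO(sample))
```

The pairwise definition is quadratic in the number of labeled examples, which is fine for a quick check but slow for large outputs; a rank-based computation would scale better.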
The params_file is of the format
10	0.1	0.5
10	0.01	0.5
10	0.001	0.5
10	0.0001	0.5
10	0.00001	0.5
10	0.000001	0.5
where the first column is the error function, the second column is the tradeoff constant, and the third column is k_value (the k used for precision or recall at k; it is unused by the AUC error function in the example above).
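A parameters file like the one above can be generated rather than typed by hand. The sketch below reproduces the sample grid (error function 10, tradeoff constants from 0.1 down to 0.000001, k_value 0.5); the function name `params_rows` and its defaults are illustrative assumptions.

```python
def params_rows(error_function=10, exponents=range(1, 7), k_value=0.5):
    """Return tab-delimited 'error_function<TAB>tradeoff<TAB>k_value' rows."""
    rows = []
    for e in exponents:
        # Fixed-point formatting avoids scientific notation such as 1e-05.
        tradeoff = f"{10.0 ** -e:.{e}f}"
        rows.append(f"{error_function}\t{tradeoff}\t{k_value}")
    return rows


# Text ready to be written to a parameters file.
text = "\n".join(params_rows()) + "\n"
```

Whether LibSVMer accepts scientific notation in this file is not stated here, so the sketch sticks to the plain decimal form used in the sample.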
LibSVMer can also be used to output a model or learn a network, although currently those features are undocumented.
package "LibSVMer"
version "1.0"
purpose "Wrapper for LibSVM"
section "Main"
option "labels" l "Labels file"
string typestr="filename" no
option "output" o "Output file "
string typestr="filename" no
option "input" i "Input PCL file "
string typestr="filename" yes
option "model" m "Model file"
string typestr="filename" no
option "all" a "Always classify all genes in PCLs"
flag off
section "Options"
option "skip" s "Number of columns to skip in input pcls"
int default="2" no
option "normalize" n "Normalize PCLS to 0 mean 1 variance"
flag off
option "cross_validation" c "Number of cross-validation sets ( arg of 1 will turn off cross-validation )"
int default="5" no
option "num_cv_runs" r "Number of cross-validation runs"
int default="1" no
option "svm_type" v "Sets type of SVM (default 0)
0\tC-SVC
1\tnu-SVC
2\tone-class SVM\n"
int default="0" no
option "balance" b "weight classes such that C_P * n_P = C_N * n_N"
flag off
option "tradeoff" t "SVM tradeoff constant C of C-SVC"
float default="1" no
option "nu" u "nu parameter of nu-SVC, one-class SVM"
float default="0.5" no
option "mmap" M "Memory map binary input"
flag off
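The balance option above states only the constraint C_P * n_P = C_N * n_N. One way to satisfy it, sketched here, is to keep C_N equal to the tradeoff constant and scale C_P by n_N / n_P. This particular normalization is an assumption chosen for illustration; the option text does not say which scaling LibSVMer applies, and `balanced_costs` is a hypothetical name.

```python
import math


def balanced_costs(labels, tradeoff=1.0):
    """Return (C_P, C_N) such that C_P * n_P == C_N * n_N.

    Assumes C_N is held at the tradeoff constant; other scalings satisfy
    the same constraint.
    """
    n_pos = sum(1 for v in labels if v == 1)
    n_neg = sum(1 for v in labels if v == -1)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    c_neg = tradeoff
    c_pos = tradeoff * n_neg / n_pos
    return c_pos, c_neg


# Labels from the sample labels file: 6 positives, 4 negatives.
sample_labels = [-1, 1, -1, 1, -1, -1, 1, 1, 1, 1]
c_pos, c_neg = balanced_costs(sample_labels, tradeoff=1.0)
print(c_pos, c_neg)
```

With more positives than negatives, the positive class receives the smaller per-example cost, so both classes contribute the same total weight to the objective.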
Flag | Default | Type | Description |
---|---|---|---|
-i | None | PCL/BIN file | Input PCL file |
-o | None | Directory | Output directory. |
-l | None | Labels file | The file with examples formatted as noted above. |
-m | None | Model file | If present, output the learned model to this file. |
-a | off | Flag | If on, output predictions for all genes in the PCL. |
-S | off | Flag | If on, use slack rescaling. |
-s | 2 | int | Number of columns to skip from PCL file. |
-n | off | Flag | Normalize PCL to 0 mean, 1 variance. |
-c | 5 | int | Number of cross validation intervals. |
-e | 10 | int | Loss function to use (options: 0, 1, 2, 3, 4, 5, 10). |
-k | 0.5 | float | Value of k for precision or recall. |
-t | 1 | float | SVM tradeoff constant C (note that this differs from the version in SVM light by a constant factor; check the LibSVM docs for details). |
-p | None | Filename | Parameters file (to test with multiple parameters). |
-M | off | Flag | Memory map binary input PCLs (BIN files). |