Sleipnir
SeekServer runs the coexpression mining algorithm behind a multithreaded TCP/IP interface. While it is running, SeekServer services search requests from multiple clients over the network. A server instance is started with a command of the form:
SeekServer -t <port> -x <dset_platform_map> -i <gene_map> -d <db_dir> -p <prep_dir> -P <platform_dir> -Q <quant> -n <num_db> -u <sinfo_dir>
This starts an instance of SeekServer on the indicated port and begins accepting client requests.
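As an illustration of the invocation above, the following Python sketch assembles the argument list from the documented flags. The helper name `build_seekserver_command` and every path in the example are hypothetical placeholders, not part of Sleipnir; only the flag letters come from the usage line.

```python
import shlex


def build_seekserver_command(port, dset_platform_map, gene_map, db_dir,
                             prep_dir, platform_dir, quant, num_db, sinfo_dir):
    """Assemble a SeekServer argument list using the flags from the usage line."""
    return [
        "SeekServer",
        "-t", str(port),            # port to listen on
        "-x", dset_platform_map,    # dataset-to-platform mapping file
        "-i", gene_map,             # gene mapping file
        "-d", db_dir,               # database directory
        "-p", prep_dir,             # prep directory
        "-P", platform_dir,         # platform directory
        "-Q", quant,                # quant file
        "-n", str(num_db),          # number of databaselets
        "-u", sinfo_dir,            # sinfo directory
    ]


# All paths below are made-up placeholders for illustration only.
command = build_seekserver_command(
    port=9000,
    dset_platform_map="/path/to/dset_platform_map",
    gene_map="/path/to/gene_map",
    db_dir="/path/to/db",
    prep_dir="/path/to/prep",
    platform_dir="/path/to/platform",
    quant="/path/to/quant",
    num_db=1000,
    sinfo_dir="/path/to/sinfo",
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(command))
```

The resulting list could be handed to `subprocess.Popen(command)` to launch the server, assuming the SeekServer binary is on the `PATH`.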
When a client request comes in, SeekServer looks for the following sequence of 4 strings in the request message:
1. strSearchDataset. Dataset names, as referred to in the dset_platform_map, to be used for the search. Delimited by " ".
2. strQuery. Query gene names, as referred to in the gene_map, separated by " ".
3. strOutputDir. Output directory where intermediate results are generated. Must be a directory that the user running SeekServer has access to. /tmp is recommended.
4. strSearchParameter. A string of the form "1_2_3_4" where each number denotes the following:
   - RBP, OrderStatistics, EqualWeighting. Recommended: RBP (also known as the CV weighting).
   - float (0.90 - 0.99). Recommended: 0.99.
   - Correlation, Zscore, ZscoreHubbinessCorrected. Recommended: ZscoreHubbinessCorrected.
See Sleipnir::CSeekNetwork for the specification of an incoming string message.
Once SeekServer correctly receives the above 4 strings, a search instance using the provided search parameters will be initiated on the server side.
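The four-string request described above can be sketched on the client side as follows. This is a minimal sketch, not the actual protocol: the framing used here (a 4-byte little-endian length followed by the UTF-8 bytes) is an assumption made purely for illustration, and the real wire format is whatever Sleipnir::CSeekNetwork specifies. The helper names, dataset names, and gene names are hypothetical.

```python
import struct


def encode_string(text):
    """Frame one string as <4-byte little-endian length><UTF-8 bytes>.

    ASSUMPTION: illustrative framing only. Consult Sleipnir::CSeekNetwork
    for the real specification of an incoming string message.
    """
    payload = text.encode("utf-8")
    return struct.pack("<i", len(payload)) + payload


def build_request(datasets, query_genes, output_dir, search_parameter):
    """Concatenate the four request strings in the documented order."""
    # The search parameter must have four underscore-separated fields.
    if len(search_parameter.split("_")) != 4:
        raise ValueError("search parameter must have the form 1_2_3_4")
    fields = [
        " ".join(datasets),     # strSearchDataset, space-delimited
        " ".join(query_genes),  # strQuery, space-separated
        output_dir,             # strOutputDir
        search_parameter,       # strSearchParameter
    ]
    return b"".join(encode_string(field) for field in fields)


# Hypothetical names; "1_2_3_4" is the placeholder form from the text and
# must be replaced with real settings before use.
request = build_request(
    datasets=["DATASET_A", "DATASET_B"],
    query_genes=["GENE_1", "GENE_2"],
    output_dir="/tmp",
    search_parameter="1_2_3_4",
)
```

With a server listening, the bytes could be sent using `socket.create_connection((host, port))` followed by `sendall(request)`, provided the framing is first adjusted to match Sleipnir::CSeekNetwork.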
Each outgoing message is generated upon finishing searching the client's query. In general, if the search is successful, SeekServer will send to the client these two arrays in sequence:
1. A float array of dataset weights, indicating how datasets are related to the query.
2. A float array of gene scores, indicating how genes are coexpressed with the query.
See Sleipnir::CSeekNetwork for the specification of an outgoing float array.
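To illustrate how a client might consume the two float arrays, the sketch below decodes them from a byte buffer and ranks genes by score. Several things here are assumptions rather than documented behavior: the framing (a 4-byte little-endian element count followed by 32-bit floats), the idea that gene scores line up index-for-index with a list of gene names, and the convention that a higher score means stronger coexpression. The function names and the demo bytes are made up for the example; the real format is defined by Sleipnir::CSeekNetwork.

```python
import struct


def decode_float_array(buffer, offset=0):
    """Decode one array framed as <4-byte count><count 32-bit floats>.

    ASSUMPTION: illustrative framing only. Returns the values and the
    offset of the first byte after the array.
    """
    (count,) = struct.unpack_from("<i", buffer, offset)
    values = struct.unpack_from("<%df" % count, buffer, offset + 4)
    return list(values), offset + 4 + 4 * count


def rank_genes(gene_names, gene_scores, top=10):
    """Pair names with scores and sort by descending score (assumed order)."""
    ranked = sorted(zip(gene_names, gene_scores),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


# Made-up reply bytes for demonstration: two dataset weights, then three
# gene scores, in the order the text describes.
reply = (struct.pack("<i2f", 2, 0.75, 0.25)
         + struct.pack("<i3f", 3, 0.5, 2.0, -1.0))

dataset_weights, next_offset = decode_float_array(reply)
gene_scores, _ = decode_float_array(reply, next_offset)

# Hypothetical gene names, assumed to be in the same order as the scores.
top_genes = rank_genes(["GENE_1", "GENE_2", "GENE_3"], gene_scores, top=2)
```

Reading the dataset weights first and the gene scores second mirrors the order in which SeekServer is described as sending them.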
The setting files and directories that SeekServer requires include the following: dset_platform_map, gene_map, db_dir, prep_dir, platform_dir, quant, sinfo_dir. For a discussion of these files and directories, please refer to the SeekMiner page, in the section "Query-independent search setting files and directories".
package "SeekServer"
version "1.0"
purpose "Performs cross-platform microarray query-guided search in server mode"
section "Main"
option "port" t "Port"
string default="9000" yes
option "dset" x "Input a set of datasets"
string typestr="filename" yes
option "input" i "Input gene mapping"
string typestr="filename" yes
option "dir_in" d "Database directory"
string typestr="directory" yes
option "dir_prep_in" p "Prep directory (containing .gavg, .gpres files)"
string typestr="directory" yes
option "dir_platform" P "Platform directory (containing .gplatavg, .gplatstdev, .gplatorder files)"
string typestr="directory" yes
option "dir_sinfo" u "Sinfo Directory (containing .sinfo files)"
string typestr="directory" default="NA" yes
option "dir_gvar" U "Gene variance directory (containing .gexpvar files)"
string typestr="directory" default="NA"
option "quant" Q "quant file (assuming all datasets use the same quantization)"
string typestr="filename" yes
option "num_db" n "Number of databaselets in database"
int default="1000" yes
option "num_threads" T "Number of threads"
int default="8"
section "Optional - Parameter tweaking"
option "score_cutoff" c "Cutoff on the gene-gene score before adding, default: no cutoff"
float default="-9999"
option "square_z" e "If using z-score, square-transform z-scores. Usually used in conjunction with --score-cutoff"
flag off
section "MISC"
option "is_nibble" N "If true, the input DB is nibble type"
flag off
option "buffer" b "Number of Databaselets to store in memory"
int default="20"
option "output_text" O "Output results (gene list and dataset weights) as text"
flag off
option "additional_db" B "Utilize a second CDatabase collection. Path to the second CDatabase's setting file."
string default="NA"