License

This software is free only for non-commercial use. It must not be modified
and distributed without prior permission of the author. The author is not
responsible for implications from the use of this software. This program is
distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY.

General Description:

This is an implementation of the Generalized LASSO method for
classification. Details can be found in

  Roth, V. (2002), The Generalized LASSO: a wrapper approach to gene
  selection for microarray data. University of Bonn, Dep. Computer
  Science III, Tech. Rep. IAI-TR-2002-8.

Please cite this reference if you use this program.

The user must provide a file which contains the feature values for each
sample (one feature per line, see below). Standardization to zero mean and
unit variance usually is a very good idea.

The optimal l_1 constraint value kappa is automatically found by averaging
over randomly chosen 80%/20% training/test splits of the dataset. The
prediction performance of the finally selected model is estimated on the
basis of new randomly chosen splits. A demo application for microarray data
(sample classification) is included.

In order to obtain nonlinear variants of the Generalized LASSO, the features
may also be preprocessed by a Mercer kernel, resulting in an (N x N) input
file. RBF kernels usually work best.

Installation:

Download GenLASSO.tar.gz.
Type "tar xfvz GenLASSO.tar.gz". This will install the sources into a
directory called GenLASSO.

The Generalized LASSO classifier uses part of the donlp2 optimization
package by P. Spellucci (to be exact, the "donlp2_ansi_c.tar.gz" version
rewritten in ANSI C by Serge Schoeffert). The latter can be downloaded from
http://plato.la.asu.edu/donlp2.html. Please unzip/untar
"donlp2_ansi_c.tar.gz" and copy "donlp2.c" and "user_eval.c" and all header
files "*.h" into your GenLASSO directory before running the make program.
Edit the Makefile and, if necessary, adjust your compiler settings.
Type "make", close your eyes, stop breathing and hope for the best. If you
are lucky, this should create the executable "GenLASSO_C". Otherwise follow
your intuition. Re-start breathing in any case. To avoid panic, however,
eyes might be kept closed.

Usage:

"GenLASSO_C file_name", where file_name specifies a configuration file.

Syntax of configuration file:

1. line: <number of samples> <number of features>
   -> The problem dimensions.
2. line: <initial constraint value> <constraint increment> <number of increments>
   -> Specifies the grid search for the optimal l_1 constraint value kappa.
3. line: <number of cross-validation iterations>
   -> Number of training/test splits of the dataset used for model
      selection and assessment.
4. line: <data filename>
   -> Name of the file containing both the class labels (first line, either
      0 or 1) and the (pre-processed) expression levels (one line per gene).
      See also the example data file "golub.dat".
5. line: <annotation filename>
   -> Name of the file containing feature (gene) annotations, one per line.
      See also the example file "golub_names.dat".
6. line: <reduce flag>
   -> If set to one, the model is re-trained on a subset of genes,
      consisting of those genes which have stability scores > 0.05 in the
      first run.

See also the included example parameter file "golub_config.dat".

Demo:

Preprocessed data from (Golub et al.)
(www-genome.wi.mit.edu/mpr/data_set_ALL_AML.html):

The original dataset consists of 72 samples, of which 47 are acute
lymphoblastic leukemia (ALL) and 25 are acute myeloid leukemia (AML).
Expression levels of 7129 genes were measured using Affymetrix high-density
oligonucleotide arrays. We preprocessed the data by firstly excluding genes
with mostly negative intensity values, and secondly by excluding genes with
max/min < 2 and max-min < 1000. This leaves us with a reduced set of 1479
genes.
Finally, the data were log-transformed, "squashed" through a tanh function
(for outlier reduction), and standardized to zero mean and unit variance
across samples. Class labels and expression values are summarized in the
file "golub.dat".

To run the demo application, type "GenLASSO_C golub_config.dat".

Results (including stability scores and annotations) will be summarized in
the file "relevant_genes.dat": model parameters, cross-validation error
(standard deviation), average number of "relevance vectors" (RVs), stability
scores and gene annotations.

Graphical output is provided by the file "rg.pgm".

The error rates for thresholding by different confidence levels and the
fraction of samples contained in the doubt class are printed in the file
"doubt.dat". Format per line:
<threshold> <error rate> <fraction of samples in doubt class>.
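
Example (not part of the original package): the six-line configuration
format described under "Usage" can be produced by a small helper script. The
following Python sketch is only an illustration. The function name
format_genlasso_config is hypothetical, the fields on a line are assumed to
be separated by single spaces, and the grid values in the usage note are
made up; they are not taken from "golub_config.dat".

```python
def format_genlasso_config(n_samples, n_features,
                           kappa_start, kappa_step, n_steps,
                           n_cv_iterations,
                           data_file, annotation_file, reduce_flag):
    """Return the text of a six-line GenLASSO_C configuration file.

    Line order follows the README's "Syntax of configuration file".
    Assumption: fields on one line are separated by a single space.
    """
    if reduce_flag not in (0, 1):
        raise ValueError("reduce_flag must be 0 or 1")
    if n_samples <= 0 or n_features <= 0:
        raise ValueError("problem dimensions must be positive")
    lines = [
        f"{n_samples} {n_features}",              # 1. problem dimensions
        f"{kappa_start} {kappa_step} {n_steps}",  # 2. grid search for kappa
        f"{n_cv_iterations}",                     # 3. number of train/test splits
        data_file,                                # 4. labels + expression levels
        annotation_file,                          # 5. one annotation per feature
        f"{reduce_flag}",                         # 6. re-train on stable genes if 1
    ]
    return "\n".join(lines) + "\n"
```

A call such as format_genlasso_config(72, 1479, 0.5, 0.5, 10, 20,
"golub.dat", "golub_names.dat", 1) returns the file contents as a string,
which can then be written to disk and passed to GenLASSO_C. The dimensions
72 and 1479 match the demo dataset; the remaining numbers are placeholders.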

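Example (not part of the original package): the README recommends
standardizing each feature to zero mean and unit variance and expects a data
file with the class labels on the first line followed by one line per
feature. The Python sketch below illustrates only these two steps. The
function names are hypothetical, and several details are assumptions:
whitespace as the field separator, six decimal places in the output, and the
population variance (division by the number of samples). The log transform
and tanh "squashing" used for the demo data are left out because the README
does not give their exact form.

```python
import math


def standardize_rows(rows):
    """Scale each feature (one row per feature) to zero mean, unit variance.

    Assumption: population variance (divide by the number of samples).
    Constant features are mapped to all zeros to avoid division by zero.
    """
    result = []
    for row in rows:
        n = len(row)
        mean = sum(row) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in row) / n)
        if std == 0.0:
            result.append([0.0] * n)
        else:
            result.append([(x - mean) / std for x in row])
    return result


def format_data_file(labels, rows):
    """Return data-file text: labels on line 1, then one line per feature.

    Assumption: values are separated by single spaces.
    """
    if any(label not in (0, 1) for label in labels):
        raise ValueError("class labels must be 0 or 1")
    if any(len(row) != len(labels) for row in rows):
        raise ValueError("every feature needs one value per sample")
    lines = [" ".join(str(label) for label in labels)]
    lines += [" ".join(f"{x:.6f}" for x in row) for row in rows]
    return "\n".join(lines) + "\n"
```

With labels for N samples and a list of per-feature rows,
format_data_file(labels, standardize_rows(rows)) yields text that can be
saved under the name given on line 4 of the configuration file.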