Data Description Matlab toolbox. (version 1.11)

This toolbox is an add-on to the PRTools toolbox. The toolbox contains
algorithms to train, investigate, visualize and evaluate one-class
classifiers (or data descriptions, novelty descriptors, outlier
detectors). Some experience with the PRTools toolbox is recommended.
This toolbox is developed as a research tool, so no guarantees can be
given.

- Requirements:

In order to make this toolbox work, you need:
0. A computer and some enthusiasm
1. Matlab with the - optimization toolbox (for svdd and lpdd)
                   - statistics toolbox (for randsph)
               and - neural network toolbox (for autoenc_dd)
2. PRTools 4.0.0 or higher
3. This toolbox.

- Installation:

The installation of the toolbox is almost trivial. Unzip the file, store the
contents in a directory (name it for instance DD_TOOLS) and add this directory
to your matlab path.
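
The path setup can be sketched as follows. This is a minimal example that
assumes the toolbox was unzipped into a directory named DD_TOOLS below the
current folder; the location and the variable name dd_dir are assumptions,
so adjust them to your own setup.

```matlab
% Hypothetical install location: change this to where you unzipped the files.
dd_dir = fullfile(pwd, 'DD_TOOLS');

if exist(dd_dir, 'dir')
    addpath(dd_dir);    % make the toolbox functions visible to Matlab
else
    warning('Directory %s not found; check where the toolbox was unzipped.', dd_dir);
end
```

To keep the setting between sessions, the same addpath call can be placed
in your startup.m.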

- Information and example code:

For the most basic information, type  help DD_TOOLS (use the name of the
directory where the toolbox is stored). Simple one-class examples are given in
dd_ex1.m, dd_ex2.m, dd_ex3.m and dd_ex4.m. For more background information,
please have a look at the pdf file included in the directory.  Some
examples of the operation of the procedures in the toolbox are given on
the web-pages:
  http://www.ph.tn.tudelft.nl/~davidt/dd_tools.html

* Notes on version 1.11:
- Changed part of the implementation of newsvdd so that it uses the
  standard optimizer.
- Plotsom is now standard in PRTools, so it is removed from Contents.m
- Included an index in the manual.

* Notes on version 1.10:

- Significantly rewrote and rearranged oc_set.m and target_class.m
- Changed dd_error.m and dd_roc.m to mimic testc.m
  Also included the computation of the precision and recall.
- Completely rewrote the ROC computation. Large amounts of complexity
  are just removed (and thus also some features, I'm sorry).
- Support the selection of hyperparameters using the consistency
  criterion.
- Added the robustified Gaussian (rob_gauss_dd) and the minimum
  covariance determinant Gaussian (mcd_gauss_dd).
- Removed a bad, bad, bad bug from gausspdf.m. 
- Made all the Gaussian methods use mahaldist.m for their evaluation.
- Completely rewrote the SVDD. The confusing parameters fracrej and
  fracerr are removed, and all the quadratic optimizers (libsvm, qld,
  quadprog) are integrated.
- Added the SVDD using general kernel definitions: ksvdd.m (although
  it has a very annoying feature that you have to supply the values
  for K(z,z) during the evaluation of object z, when you just supply
  the kernel matrix: have a look at the help)
- Rewrote the LPDD in terms of DLPDD, MYPROXM and DISSIM.
- Implemented the SOM now nicely and removed the most obvious bugs.
- Added the dd_crossval.
- Added the dd_f1, for the computation of the f1-score.
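
The precision, recall and f1-score mentioned in the notes above can be
illustrated without the toolbox. The following is a minimal sketch in plain
Matlab, assuming logical label vectors in which true marks a target object.
The variable names and the example labels are illustrative; this is not the
code used in dd_error.m or dd_f1.m.

```matlab
% Illustrative labels: true = target object, false = outlier.
true_lab = logical([1 1 1 1 0 0 0 0]);   % ground truth
pred_lab = logical([1 1 1 0 1 0 0 0]);   % labels assigned by some classifier

tp = sum( true_lab &  pred_lab);   % target objects accepted as target
fp = sum(~true_lab &  pred_lab);   % outliers accepted as target
fn = sum( true_lab & ~pred_lab);   % target objects rejected

precision = tp / (tp + fp);        % fraction of accepted objects that are targets
recall    = tp / (tp + fn);        % fraction of targets that are accepted
f1 = 2 * precision * recall / (precision + recall);   % harmonic mean of the two
```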


* Notes on version 1.01:

- Changed the order of the mtimes: so w*a is replaced by a*w
- Removed a bug in the creation of a one-class dataset from a
  more-than-two class dataset in oc_set.m

* Notes on version 1.00:

- There is a *significant* change caused by updating from prtools3 to
  prtools4 (prtools3.2.2 or higher). The definitions of the objects
  'dataset' and 'mapping' have been upgraded. This required rewriting
  almost all code! It can therefore happen that new results are not
  identical to results obtained by previous versions of the tools (but
  the differences should not be very large).
- dd_error is totally rewritten
- is_ocset and is_occ are renamed to isocset and isocc to be
  more consistent with the rest of matlab and prtools
- som_dd is added.


* Notes on version 0.99:

- introduced dissim.m

* Notes on version 0.95:

- added a bit of help to each of the m-files.
- programmed my own very basic kmeans clustering, because I needed it
  also for other things. Therefore added  mykmeans.m
- added plotroc.m to plot the classical ROC curve
- made an extra check in dd_roc to see where the outputs of the target
  class are stored (for my OCC's it is always in the first column, but
  for general PRTools classifiers this does not have to be the case).
  Now dd_roc should work for all prtools classifiers (trained on data
  with 'target' and 'outlier' labels of course).
- dd_fp.m added: compute the error on the outliers (fraction false
  positive) of a trained classifier for a given error on the target
  class (fraction false negative).
- made my own version of proxm.m (myproxm.m) which uses the lpdistm.m.
  It is used in kwhiten.m.
- removed some horrible bug in lpdd! (one bloody minus sign...)
- removed another horrible bug from kwhiten, in the case where a fixed
  dimensionality was requested... Furthermore, in case a fraction
  of retained variance was requested, the threshold is now set such
  that *at least* this fraction is retained (could be higher also).
- corrected the nu parameter in svdd and newsvdd in cases when example
  outliers are used in training. Note that it cannot be done
  completely correctly in newsvdd, because there only one single nu
  parameter is allowed for all data.
- included in plotg.m the possibility to just plot the decision
  boundary.
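
The "at least this fraction" rule for the retained variance can be sketched
in plain Matlab. This is an illustration under assumed eigenvalues, not the
code in kwhiten.m; the names lambda, frac and k, and the small rounding
tolerance, are assumptions made for the example.

```matlab
% Illustrative eigenvalues (variance per direction), sorted in descending order.
lambda = [4 3 2 1];
frac   = 0.8;                 % requested fraction of retained variance

% Cumulative fraction of variance retained when keeping the first k directions.
cumfrac = cumsum(lambda) / sum(lambda);

% Smallest k whose retained fraction is at least frac. The tolerance only
% guards against rounding when cumfrac hits frac exactly.
k = find(cumfrac >= frac - 1e-12, 1, 'first');
```

With these numbers two directions retain a fraction 0.7, which is too
little, so three directions are kept and the retained fraction is higher
than requested.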


* Notes on version 0.9:

! in the early versions of the svdd, the support vectors were
  classified as outliers. Now they are forced to be target objects.
  This will therefore change the classification results!
- added gendatout:  generation of spherically distributed outlier objects
- changed the place in which distm(a) was computed in the original
  version of svdd. In previous versions, it was done over and over
  again in f_svs, but now it is moved to the main svdd.m
- removed a bug in range_svdd, where the sqrt of the D has to be taken
  for the range of sigma.
- fixed a bug in dd_roc. Now it is possible to supply 1D datasets for
  computing the roc curve.
- fixed an error in the help of dd_auc
- added the function relabel
- replaced all explicit references of the function name by 'mfilename'
  in all one-class classifiers
- added the random_dd, which randomly assigns labels
- added lpdd.m, the linear programming data description. It works on
  distances, and therefore I also had to add:
  ddistm.m and lpdistm.m
- added kwhiten.m, normalization to unit variance in the kernel space.
  For that also center.m was needed.
