⭐ 欢迎来到虫虫下载站! | 📦 资源下载 📁 资源专辑 ℹ️ 关于我们
⭐ 虫虫下载站

📄 datasets.m

📁 The pattern recognition matlab toolbox
💻 M
字号:
%DATASETS Info on the dataset class construction for PRTools%% This is not a command, just an information file.%% Datasets in PRTools are in the MATLAB language defined as objects of the% class DATASET. Below, the words 'object' and 'class' are used in the pattern % recognition sense.%% A dataset is a set consisting of M objects, each described by K features. % In PRTools, such a dataset is represented by a M x K matrix: M rows, each% containing an object vector of K elements. Usually, a dataset is labeled.% An example of a definition is:%% DATA = [RAND(3,2) ; RAND(3,2)+0.5];% LABS = ['A';'A';'A';'B';'B';'B'];%% A = DATASET(DATA,LABS)% which defines a [6 x 2] dataset with 2 classes.%% The [6 x 2] data matrix (6 objects given by 2 features) is accompanied by% labels, assigning each of the objects to one of the two classes A and B.% Class labels can be numbers or strings and should always be given as rows% in the label list. A lable may also have the value NaN or may be an empty% string, indicating an ulabeled object. If the label list is not given, % all objects are marked as unlabeled.%% Various other types of information can be stored in a dataset. The most% simple way to get an overview is by typing:%% STRUCT(A)%% which for the above example displays the following:%%         DATA: [6x2 double]%      LABLIST: [2x1 double]%         NLAB: [6x1 double]%      LABTYPE: 'crisp'%      TARGETS: []%      FEATLAB: [2x1 double]%      FEATDOM: {1x2 cell  }%        PRIOR: []%         COST: []%      OBJSIZE: 6%     FEATSIZE: 2%        IDENT: {6x1 cell  }%      VERSION: {1x2 cell  }%         NAME: []%         USER: []%% These fields have the following meaning:% % DATA     : an array containing the objects (the rows) represented by  %            features (the columns). In the software and help-files, the number%            of objects is usually denoted by M and the number of features is%            denoted by K. So, DATA has the size of [M,K]. This is also defined %            as the size of the entire dataset.% LABLIST  : The names of the classes, stored row-wise. These class names%            should be integers, strings or cells of strings. Mixtures of%            these are not supported. LABLIST has as many rows as there are %            classes. This number is usually denoted by C. LABLIST is%            constructed from the set of LABELS given in the DATASET command%            by determining the unique names while ordering them alphabetically.% NLAB     : an [M x 1] vector of integers between 1 and C, defining for each%            of the M objects its class.% LABTYPE  : 'CRISP', 'SOFT' or 'TARGETS' are the three possible label types.%            In case of 'CRISP' labels, a unique class, defined by NLAB, is%            assigned to each object, pointing to the class names given in%            LABLIST.%            For 'SOFT' labels, each object has a corresponding vector of C %            numbers between 0 and 1 indicating its membership (or confidence %            or posterior probability) of each of the C classes. These numbers%            are stored in the array TARGETS of the size M x C. They don't%            necessarily sum to one for individual row vectors.%            Labels of type 'TARGETS' are in fact no labels, but merely target%            vectors of length C. The values are again stored in TARGETS and%            are not restricted in value.% TARGETS  : [M,C] array storing the values of the soft labels or targets.% FEATLAB  : A label list (like LABLIST) of K rows storing the names of the%            features.% FEATDOM  : A cell array describing for each feature its domain.% PRIOR    : Vector of length C storing the class prior probabilities. They %            should sum to one. If PRIOR is empty ([]) it is assumed that the%            class prior probabilities correspond to the class frequencies.% COST     : Classification cost matrix. COST(I,J) are the costs%            of classifying an object from class I as class J. Column C+1%            generates an alternative reject class and may be omitted, %            yielding a size of [C,C]. An empty cost matrix, COST = [] %            (default) is interpreted as COST = ONES(C) - EYE(C) (identical%            costs of misclassification).% OBJSIZE  : The number of objects, M. In case the objects are related to a%            n-dimensional structure, OBJSIZE is a vector of length n, storing%            the size of this structure. For instance, if the objects are pixels%            in a [20 x 16] image, then OBJSIZE = [20,16] and M = 320.% FEATSIZE : The number of features, K. In case the features are related to %            an n-dimensional structure, FEATSIZE is a vector of length n, %            storing the size of this structure. For instance, if the features%            are pixels in a [20 x 16] image, then FEATSIZE = [20,16] and %            K = 320.% IDENT    : A cell array of M elements storing indicators of the M objects.%            They are initialized by integers 1:M.% VERSION  : Some information related to the version of PRTools used for%            defining the dataset.% NAME     : A character string naming the dataset, possibly used to annotate%            related graphics.% USER     : Free field for the user, not used by PRTools.%%% The fields can be set in the following ways:%% 1.In the DATASET construction command after DATA and LABELS using the form of%   {field name, value pairs}, e.g.%   A = DATASET(DATA,LABELS,'PRIOR',[0.4 0.6],'FEATLIST',['AA';'BB']);%   Note that the elements in PRIOR refer to classes as they are ordered in%   LABLIST.% 2.For a given dataset A, the fields may be changed similarly by the SET command:%   A = SET(A,'PRIOR',[0.4 0.6],'FEATLIST',['AA';'BB']);% 3.By the commands%   SETDATA, SETFEATLAB, SETFEATDOM, SETFEATSIZE, SETIDENT, SETLABELS,%   SETLABLIST, SETLABTYPE, SETNAME, SETNLAB, SETOBJSIZE, SETPRIOR,%   SETTARGETS, SETUSER.% 4.By using the dot extension as for structures, e.g. %   A.PRIOR = [0.4 0.6];%   A.FEATLIST = ['AA';'BB'];% Note that there is no field LABELS in the DATASET definition. Labels are% converted to NLAB and LABLIST. Commands like SETLABELS and A.LABELS, however,% exist and take care of the conversion.%% The data and information stored in a dataset can be retrieved as follows:%% 1.By DOUBLE(A) and by +A, the content of the A.DATA is returned.%   [N,LABLIST] = CLASSSIZES(A); %   It returns the numbers of objects per class and  the class names stored %   in LABLIST.%   By DISPLAY(A), it writes the size of the dataset, the number of classes %   and the label type on the terminal screen.%   By SIZE(A), it returns the size of A.DATA: numbers of objects and features.%   By SCATTERD(A), it makes a scatter plot of a dataset.%   By SHOW(A), it may be used to display images that are stored as features %   or as objects in a dataset. % 2.By the GET command, e.g: [PRIOR,FEATLIST] = GET(A,'PRIOR','FEATLIST');% 3.By the commands:%   GETDATA, GETFEATLAB, GETFEATSIZE, GETIDENT, GETLABELS, GETLABLIST, %   GETLABTYPE, GETNAME, GETNLAB, GETOBJSIZE, GETPRIOR, GETCOST, GETSIZE, GETTARGETS,%   GETTARGETS, GETUSER, GETVERSION.%   Note that GETSIZE(A) does not refer to a single field, but it returns [M,K,C].%   The following commands do not return the data itself, instead they return %   indices to objects that have specific identifiers, labels or class indices:%   FINDIDENT, FINDLABELS, FINDNLAB.% 4.Using the dot extension as for structures, e.g. %   PRIOR = A.PRIOR;%   FEATLIST = A.FEATLIST;%% Many standard MATLAB operations and a number of general MATLAB commands have % been overloaded for variables of the DATASET type.% Copyright: R.P.W. Duin, r.p.w.duin@prtools.org% Faculty EWI, Delft University of Technology% P.O. Box 5031, 2600 GA Delft, The Netherlands

⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -