softmax.m
function [f, iter, dev, hess] = softmax(X, k, prior, varargin)
%SOFTMAX Multinomial feed-forward neural network
% F = SOFTMAX(X, K, PRIOR) returns a SOFTMAX object containing the
% weights of a feed-forward neural network trained to minimise the
% multinomial log-likelihood deviance based on the feature matrix X,
% class indices in K and the prior probabilities in PRIOR where
% PRIOR is optional. See the help for SOFTMAX's parent object class
% CLASSIFIER for information on the input arguments X, K, and
% PRIOR. Traditional neural networks minimise the sum squared error,
% whereas this model assumes that the outputs are a Poisson process
% conditional on their sum and calculates the error as the residual
% deviance.
%
% In addition to the fields defined by the CLASSIFIER class, F
% contains the following field:
%
% WEIGHTS: a sparse matrix representing the optimised connection
% weights where rows represent connections from units that feed
% other units and columns represent connections to units that are
% fed by other units. Each non-zero value in this matrix represents
% a weight connecting unit i to unit j where i is the row and j is
% the column. There are p+1 input units that are not fed by other
% units (i.e., the first p+1 columns are all zeros). The first unit
% always represents the bias, while the following p units represent
% the inputs to the entire network (i.e., from X). In addition there
% are g-1 output units where g represents the number of different
% classes in k. Because output probabilities are normalised over the
% sum of the exponents, it is assumed that the first class receives
% all zero weights and is therefore not explicitly represented in
% the weight matrix. Output units do not feed other units and
% therefore there are g-1 fewer rows than columns (assuming that the
% missing rows are all zero). All other units are referred to as
% hidden units and both feed and are fed by other units.
%
% Because of argument structure ambiguity, PRIOR is not optional
% when using other options. The default can be assigned by giving
% an empty PRIOR = [].
%
% SOFTMAX(X, K, PRIOR, NUNITS, SKIP) where NUNITS is a scalar
% positive integer and SKIP is either 0 or 1 specifies how many
% hidden units are present in a single hidden layer neural
% network. The model is fully connected between adjacent layers. If
% SKIP is 1, input units are additionally connected to output
% units. SKIP must be specified when there is only a single hidden
% layer. If NUNITS is 0, SKIP must also be 0.
%
% SOFTMAX(X, K, PRIOR, NUNITS) where NUNITS is a vector of positive,
% non-zero integers of length n specifies how many units are present
% in each of n hidden layers. All adjacent layers are fully
% connected; however, it is an error to specify SKIP. If skip
% weights are desired, the weight matrix must be given explicitly
% (see below).
%
% SOFTMAX(X, K, PRIOR, WEIGHTS, MASK) where WEIGHTS is a matrix
% similar to F.WEIGHTS described above, uses the connections and
% starting weights specified in the matrix. MASK is optional. If
% given, MASK is a matrix the same size as WEIGHTS consisting of all
% 1's and 0's indicating which weights are to be optimised by the
% training algorithm. This allows the optimisation of weights that
% are initially 0 as well as the ability to keep some non-zero
% weights fixed.
%
% SOFTMAX(X, K, PRIOR, MASK) is equivalent to SOFTMAX(X, K, PRIOR,
% WEIGHTS, MASK) where WEIGHTS are assigned randomly. If the initial
% random weights used by the training algorithm are needed, the MASK
% argument (or the NUNITS plus SKIP arguments) can be used with a
% value of 0 for MAXITER (see below).
%
% By default, SOFTMAX uses no hidden units with skip weights, which
% is functionally equivalent to a logistic discriminant analysis
% (see LOGDA). However, SOFTMAX will be much slower as the
% algorithm has been generalised for hidden units.
%
% SOFTMAX(X, K, PRIOR, ..., DECAY) where DECAY is a positive scalar
% value less than 1 gives the weight decay for the model. The
% default decay is 0. DECAY forces the estimate of the residual
% deviance to be penalised by the magnitude of the estimated
% weights. Typical values range from .01 for a very large DECAY to a
% moderate value of 10e-6. Because SOFTMAX initially normalises the
% inputs, this value is independent of the range of X. (However,
% SOFTMAX rescales the returned weights so that rescaling of input
% values is not necessary when classifying new data.)
%
% SOFTMAX(X, K, PRIOR, ..., DECAY, MAXITER) where MAXITER is a
% positive integer aborts the algorithm after that many
% iterations. The default value is 200. If a value of 0 is given
% as MAXITER the algorithm terminates before optimising the
% connection weights. This is useful for returning a random
% matrix of weights which can be later manipulated before
% optimisation. However, if MAXITER is 0, a DECAY value must be
% given to avoid ambiguity in the arguments.
%
% SOFTMAX(X, K, PRIOR, ..., MAXITER) is otherwise equivalent to
% supplying a DECAY of 0 (unless MAXITER is also 0---see above).
%
% SOFTMAX(X, K, OPTS) allows optional arguments to be passed in the
% fields of the structure OPTS. Fields that are used by SOFTMAX are
% PRIOR, NUNITS, SKIPFLAG, WEIGHTS, MASK, DECAY, and
% MAXITER. However, neither NUNITS nor SKIP may be specified with
% either WEIGHTS or MASK.
%
% [F, NITER, DEV, HESS] = SOFTMAX(X, K, ...) additionally returns
% the number of iterations required by the algorithm before
% convergence in NITER, the residual deviance for the fit in DEV and
% the Hessian matrix of the weights in HESS. HESS is a square matrix
% where each row and column represents a single weight. The weights
% are ordered according to the vectorised weight matrix
% F.WEIGHTS(:).
%
% SOFTMAX(X, G, ...) where G is an n by g matrix of posterior
% probabilities or counts, models this instead of absolute class
% memberships. If G represents counts, all of its values must be
% positive integers. Otherwise the rows of G represent posterior
% probabilities and must all sum to 1. It is an error to give the
% argument PRIOR in this case. If G represents posterior
% probabilities, F.PRIOR will be calculated as the normalised sum of
% the columns of G and F.COUNTS will be a scalar value representing
% the number of observations. Otherwise, F.COUNTS will be the sum of
% the columns and F.PRIOR will represent the observed prior
% distribution.
%
% SOFTMAX(F) where F is an object of class LOGDA returns the
% SOFTMAX equivalent of the logistic discriminant analysis.
%
% See also CLASSIFIER, LDA, QDA, LOGDA.
%
% Notes:
% The argument structure can be rather complicated. The program
% tries to figure out which argument is which heuristically, but
% it's probably easy to defeat it. Arguments that are passed to
% SOFTMAX must be in the order described above although they may
% be entirely omitted allowing defaults to be used instead.
%
% References:
% B. D. Ripley (1996) Pattern Recognition and Neural
% Networks. Cambridge University Press.

% Copyright (c) 1999 Michael Kiefte.

% $Id: softmax.m,v 1.1 1999/06/04 18:50:50 michael Exp $
% $Log: softmax.m,v $
% Revision 1.1 1999/06/04 18:50:50 michael
% Initial revision
%

if isa(X, 'logda')
  error(nargchk(1, 1, nargin))
  weights = [sparse(X.nvar+1, X.nvar+1) X.coefs'];
  f = class(struct('weights', weights), 'softmax', X.classifier);
  return
end

error(nargchk(2, 7, nargin))

if nargin > 2 & isstruct(prior)
  % using option structure
  if nargin > 3
    error(sprintf(['Cannot have arguments following option struct:\n' ...
        '%s'], nargchk(3, 3, 4)))
  end
  [prior nhid skip weights mask decay maxit] = ...
      parseopt(prior, 'prior', 'nunits', 'skip', 'weights', 'mask', ...
      'decay', 'maxiter');
  if (~isempty(nhid) | ~isempty(skip)) & (~isempty(weights) | ...
        ~isempty(mask))
    error(['May not specify NUNITS or SKIPFLAG with either WEIGHTS' ...
        ' or MASK.'])
  end
elseif nargin < 3
  prior = [];
end

[n p] = size(X);

if prod(size(k)) ~= length(k)
  % Multinomial incidence matrix or posterior probabilities
  if length(varargin) > 4
    error(sprintf(['Assuming second argument is an incidence matrix' ...
        ' of multinomial counts\nor posterior probabilities:' ...
        ' %s'], nargchk(0, 4, 5)))
  end
  [h G w] = classifier(X, k);
  g = size(G, 2);
  logG = G;
  logG(find(G)) = log(G(find(G)));
else
  % Vector of class indices
  [h G] = classifier(X, k, prior);
  nj = h.counts;
  g = length(nj);
  w = (nj./(n*h.prior))';
  w = w(k);
  logG = 0;
end

% Normalise inputs between (0, 1)
range = h.range;
X = (X - repmat(range(1,:), n, 1)) * diag(1./diff(range));

trace = ~strcmp(warning, 'off');

% varargin will be in this order:
weights = [];
mask = [];
nhid = [];
skip = [];
decay = [];
maxit = [];

if length(varargin)
  % all arguments are real doubles
  if ~isempty(varargin{1}) & isa(varargin{1}, 'double') & ...
        isreal(varargin{1})
    if prod(size(varargin{1})) ~= length(varargin{1})
      % specify weights as matrix
      if length(varargin) >= 2 & ...
            all(size(varargin{2}) == size(varargin{1}))
        % with mask matrix
        if length(varargin) > 4
          error(sprintf(['Assuming fifth argument is MASK:' ...
              ' %s'], nargchk(2, 4, 5)))
        end
        varargin = [varargin(1:2), repmat({[]}, 1, 2), varargin(3:end)];
      elseif all(nonzeros(varargin{1}) == 1)
        % only mask matrix
        if length(varargin) > 3
          error(sprintf(['Assuming fourth argument is MASK:' ...
              ' %s'], nargchk(1, 3, 4)))
        end
        varargin = [{[]}, varargin(1), repmat({[]}, 1, 2), ...
            varargin(2:end)];
      else
        % without mask matrix
        if length(varargin) > 3
          error(sprintf(['Assuming fourth argument is WEIGHTS:' ...
              ' %s'], nargchk(1, 3, 4)))
        end
        varargin = [varargin(1), repmat({[]}, 1, 3), ...
            varargin(2:end)];
      end
    elseif length(varargin{1}) > 1
      % specify number of units in each hidden layer
      if length(varargin) > 3
        error(sprintf(['Assuming fourth argument is the number of' ...
            ' hidden units\nin each hidden layer:' ...
            ' %s'], nargchk(1, 3, 4)))
      end
      varargin = [repmat({[]}, 1, 2), varargin(1), {[]}, ...
          varargin(2:end)];
    elseif round(varargin{1}) == varargin{1}
      if length(varargin) >= 2 & isa(varargin{2}, 'double') & ...
            isreal(varargin{2}) & length(varargin{2}) == 1 & ...
            (varargin{2} == 1 | varargin{2} == 0)
        % single hidden layer with skip flag
        if length(varargin) > 4
          error(sprintf(['Assuming fifth argument is SKIPFLAG:\n' ...
              ' %s'], nargchk(2, 4, 5)))
        end
        varargin = [repmat({[]}, 1, 2), varargin];
      else
        % third argument is maximum number of iterations
        if length(varargin) > 1
          error(sprintf(['Assuming fourth argument is MAXITER:' ...
              ' %s'], nargchk(1, 1, 2)))
        end
        varargin = [repmat({[]}, 1, 5), varargin];
      end
    else
      % third argument is decay
      if length(varargin) > 2
        error(sprintf('Assuming fourth argument DECAY: %s', ...
            nargchk(1, 2, 3)))
      end
      varargin = [repmat({[]}, 1, 4), varargin];
    end
  else
    error('Can''t figure out what third argument should be.')
  end
  if length(varargin) == 5 & isa(varargin{5}, 'double') & ...
        length(varargin{5}) == 1 & ...
        round(varargin{5}) == varargin{5}
    % maxiter in decay position
    varargin(5:6) = [{[]}, varargin(5)];
  end
  if length(varargin) < 6
    varargin{6} = [];
  end
  [weights mask nhid skip decay maxit] = deal(varargin{:});
end

if isempty(decay)
  decay = 0;
elseif ~isa(decay, 'double') | ~isreal(decay) | length(decay) ~= 1 | ...
      decay < 0 | decay >= 1 | isnan(decay)
  error('DECAY must be a positive scalar less than 1.')
end

if ~isempty(weights) | ~isempty(mask)
  normw = 1;
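The calling conventions documented in the help are easy to mix up, so here is a hedged sketch of a few typical invocations. It assumes softmax.m, its CLASSIFIER parent class and the other referenced files are on the MATLAB path, with X an n-by-p feature matrix and k a vector of class indices; none of these calls appear in the listing itself.

% Hypothetical example calls (not part of softmax.m).
f = softmax(X, k, []);                    % default: no hidden units, skip weights only
f = softmax(X, k, [], 6, 1);              % one hidden layer of 6 units plus skip weights
f = softmax(X, k, [], [8 4]);             % two hidden layers of 8 and 4 units, no skip weights
f = softmax(X, k, [], 6, 1, 1e-4, 500);   % add weight decay and raise the iteration limit
[f, niter, dev] = softmax(X, k, []);      % also return iteration count and residual deviance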
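To make the error criterion concrete, the following stand-alone sketch computes softmax outputs and the residual deviance for the simplest case the help describes: no hidden units, a bias plus p input units feeding the g-1 output units directly, and class 1 as the all-zero reference. It is an illustration only, not code from softmax.m (whose listing above is cut off before the training loop); every name is made up, and the decay term shown is just one common form of the penalisation the help alludes to.

% Minimal stand-alone sketch (hypothetical, not from softmax.m).
n = 100; p = 3; g = 4;                    % observations, inputs, classes
Xn = rand(n, p);                          % inputs already scaled to (0, 1)
k  = ceil(g*rand(n, 1));                  % class indices 1..g
G  = full(sparse((1:n)', k, 1, n, g));    % incidence (indicator) matrix

W = 0.1*randn(p+1, g-1);                  % bias + p inputs feeding g-1 outputs
                                          % (class 1 is the all-zero reference)
eta = [ones(n, 1) Xn] * W;                % linear predictor for classes 2..g
P = exp([zeros(n, 1) eta]);               % unnormalised outputs
P = P ./ repmat(sum(P, 2), 1, g);         % normalise over the sum of exponents

% Residual deviance 2*sum(G.*log(G./P)), taking 0*log(0) as 0; for an
% indicator matrix G this reduces to -2*sum(log P(i, k(i))).
dev = -2*sum(log(P(sub2ind([n g], (1:n)', k))));

% A weight-decay penalty of the general kind DECAY describes might add
% something like this to the quantity being minimised:
decay = 1e-4;
penalised = dev + decay*sum(W(:).^2);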