as cost function. To inspect this parameter, you use
\tbcmd{get_wcfnet_a_snn}. For example,
\begin{example}
net = net_struct_snn([1 8 2], {'tansigtf_snn' 'lintf_snn'})
get_wcfnet_a_snn(net)
net = set_wcfnet_a_snn(net, [1; 2]);
get_wcfnet_a_snn(net)
\end{example}

\subsubsection{Output variable error functions $e_i$}
By default, the output variable error functions parameter is set to
\tbcmd{se_snn}, the squared error $(y - t)^2$.

This parameter can be set to either a string or a cell matrix of
strings. When it is set to a string, \tbcmd{wcf_snn} uses this string
as the error function for all output variables. The cell matrix must
be a column matrix with one element per output variable, each element
containing an error function. To access this parameter on an already
created network structure, you can use \tbcmd{get_wcfnet_e_snn} and
\tbcmd{set_wcfnet_e_snn}. For example,
\begin{example}
net = net_struct_snn([1 8 2], {'tansigtf_snn' 'lintf_snn'})
get_wcfnet_e_snn(net)
net = set_wcfnet_e_snn(net, {'se_snn'; 'loglikelihood_snn'});
get_wcfnet_e_snn(net)
\end{example}
Possible error functions are the squared error \tbcmd{se_snn}, the
relative error \tbcmd{relerr_snn}, the cross entropy error
\tbcmd{crossentropy_snn}, the cross logistic error
\tbcmd{crosslogistic_snn} and the log likelihood error
\tbcmd{loglikelihood_snn}. It is also relatively easy to write your
own error function.
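As an illustration only: the calling convention expected of an error
function is not documented in this section, so the two-argument
signature below is an assumption, and \tbcmd{abserr_snn} is a
hypothetical name; the toolbox's own \tbcmd{se_snn} is the
authoritative template to copy. A minimal sketch of a custom,
element-wise absolute error could look like
\begin{example}
function e = abserr_snn(y, t)
% Hypothetical custom error function: element-wise absolute error
% |y - t|, analogous to the squared error computed by se_snn.
% The (y, t) interface is an assumption; inspect se_snn for the
% actual interface expected by wcf_snn.
e = abs(y - t);
\end{example}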
\subsubsection{Pattern weights $g_{\mu}$}
To set and get the pattern weights, you should use
\tbcmd{set_wcfdata_gmu_snn} and \tbcmd{get_wcfdata_gmu_snn} on a
wcfdata structure. For example,
\begin{example}
P = rand(2,5); T = rand(1,5);
wcfdata = wcfdata_struct_snn(P, T)
get_wcfdata_gmu_snn(wcfdata)
gmu = [1 2 2 1 1];
wcfdata = set_wcfdata_gmu_snn(wcfdata, gmu)
\end{example}

\subsubsection{Mask $\Delta_{i\mu}$}
To set and get the mask $\Delta$, you must use
\tbcmd{set_wcfdata_delta_snn} and \tbcmd{get_wcfdata_delta_snn}.
Note that in a wcfdata structure the parameter $\Delta$ is saved in
the field \field{useT} instead of a more logical field like
\field{Delta}. This is a consequence of a change in the design of the
toolbox. You should not access the fields in a wcfdata structure
directly; only use the access functions mentioned above. For example,
\begin{example}
P = rand(2,5); T = rand(1,5);
wcfdata = wcfdata_struct_snn(P, T)
get_wcfdata_delta_snn(wcfdata)
Delta = [1 0 0 1 1];
wcfdata = set_wcfdata_delta_snn(wcfdata, Delta)
\end{example}

\section{Training algorithms}
As training algorithm you can choose from gradient descent
\tbcmd{traingd_snn}, conjugate gradient with Polak-Ribi\'ere update
\tbcmd{traincgp_snn} and Levenberg-Marquardt \tbcmd{trainlm_snn}.
When you create a network with \tbcmd{net_struct_snn}, all information
on the training algorithm is stored in the field \field{trainFcn} of
the network structure.

\subsection{Gradient descent}
In the gradient descent algorithm, the network parameters $\vw$ are
adjusted iteratively by an amount $\vdw$, where $\vdw$ is given by
\begin{equation}
\vdw = - \alpha \gradE
\end{equation}
In this equation $\alpha$ is the learning rate. This $\alpha$ has a
default value of $0.001$ for a network created with
\tbcmd{net_struct_snn}, but can be changed by setting the subfield
\field{lr}.

Training is stopped when one of the stopping criteria is met. For
normal training, these stopping criteria are:
\begin{itemize}
\item Maximum number of iterations. \\ To change the maximum number of
iterations in a training run, change the subfield \field{epochs}.
\item Total training time for a training run. \\ To set the maximum
time for a training run, set the subfield \field{time}.
\item Cost function goal. \\ When the cost function becomes smaller
than this value, training stops. The goal is set in the subfield
\field{goal}.
\end{itemize}
For example, training a network created with
\begin{example}
net = net_struct_snn([11 5 6], {'tansigtf_snn' 'lintf_snn'}, ...
                     'traingd_snn')
net.lr = 0.5e-3;
net.trainFcn.epochs = 100;
net.trainFcn.time = 1000;
net.trainFcn.goal = 0.1;
\end{example}
will stop when 100 iterations are done, when the total training time
is 1000 seconds, or when the cost function $E(\vecay, \vecat) < 0.1$.

For training with early stopping, training is also stopped when the
cost function computed on the validation set has increased in
\field{max\_fail} iterations, or in \field{max\_suc\_fail} successive
iterations. These values can be set as well.
\begin{example}
net.trainFcn.max_fail = 20;
net.trainFcn.max_suc_fail = 10;
\end{example}

\subsection{Levenberg-Marquardt}
In the Levenberg-Marquardt algorithm, the network parameters are
updated according to
\begin{equation}
\vdw = - [F + \mu I]^{-1} \gradE
\end{equation}
where $F$ is the Fisher matrix, $I$ is the identity matrix and $\mu$
is a parameter that is adjusted during training. To change the initial
value of $\mu$, set the contents of the subfield \field{mu}. When an
update with this $\mu$ would increase the cost function, $\mu$ is
increased by a factor specified in the subfield \field{mu\_inc}. If
the update decreases the cost function, $\mu$ is decreased by a
factor \field{mu\_dec}.

Again, training is stopped when the stopping criteria are met. The
stopping criteria for gradient descent also apply to this algorithm.
Furthermore, training is stopped when $\mu$ reaches a maximum value as
specified in \field{mu\_max}. Normally there is no reason to change
the standard settings.
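For completeness, the following sketch shows where these
Levenberg-Marquardt subfields live; the numerical values are
illustrative only, not the toolbox defaults.
\begin{example}
net = net_struct_snn([1 5 1], {'tansigtf_snn' 'lintf_snn'}, ...
                     'trainlm_snn')
net.trainFcn.mu = 1e-3;      % initial mu (illustrative value)
net.trainFcn.mu_inc = 10;    % factor by which mu is increased
net.trainFcn.mu_dec = 0.1;   % factor by which mu is decreased
net.trainFcn.mu_max = 1e10;  % training stops when mu exceeds this
\end{example}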
\subsection{Conjugate gradient}
In conjugate gradient algorithms, the network parameters are
adjusted in a direction that not only depends on the gradient, but
also on the direction of previous adjustments. There are various
versions of conjugate gradient algorithms. The function
\tbcmd{traincgp_snn} implements the algorithm with Polak-Ribi\'ere
update. There are no parameters to be set other than the stopping
criteria, as in gradient descent.
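As a sketch (the layer sizes and stopping values are illustrative
only), selecting conjugate gradient training and adjusting its
stopping criteria follows the same pattern as for gradient descent:
\begin{example}
net = net_struct_snn([1 5 1], {'tansigtf_snn' 'lintf_snn'}, ...
                     'traincgp_snn')
net.trainFcn.epochs = 200;   % maximum number of iterations
net.trainFcn.goal = 0.05;    % cost function goal
\end{example}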
\section{Train network}
When a network structure has been created and the training and cost
function parameters are set, training can be started with the
command \tbcmd{train_snn}. The first argument of this function
contains the network and the second argument the training data.
For example,
\begin{example}
MU = 50; x = 3*randn(1, MU);
wcfdata = wcfdata_struct_snn(x, sin(x) + 0.1*randn(1, MU));
net = net_struct_snn([1 5 1], {'tansigtf_snn' 'lintf_snn'});
[trained_net, tr_info] = train_snn(net, wcfdata)
\end{example}
Optionally, a third argument to \tbcmd{train_snn} can be specified
containing validation data, which is used to stop training early.
\begin{example}
x = randn(1,MU);
vl_data = wcfdata_struct_snn(x, sin(x) + 0.1*randn(1, MU));
[trained_net, tr_info] = train_snn(net, wcfdata, vl_data)
\end{example}
Alternatively, instead of specifying a separate validation set, you
can use \tbcmd{train_bootstrap_snn} or \tbcmd{train_halfout_snn},
which divide their input data randomly into a training and a
validation set.
\begin{example}
[trained_net, tr_info, dataset] = train_bootstrap_snn(net, wcfdata)
\end{example}

\section{The network as an estimator}
After training, the network can be used as an estimator. The command
\tbcmd{simff_snn} takes as arguments the trained network and a matrix
with an input pattern in each column, and returns a matrix with, in
each column, an estimate of the corresponding output. For example,
\begin{example}
P = [-6:0.05:6]
y = simff_snn(trained_net, P)
plot(P,y)
\end{example}

\section{Ensemble averaging}
An ensemble of networks can be created by training networks on
different subsets of the training data set and by training with
different initial $\vw$. New initial $\vw$ can be set with the
command \tbcmd{reinitweights_snn}. With \tbcmd{train_bootstrap_snn}
or \tbcmd{train_halfout_snn}, training is done on different subsets.
For example, to create an ensemble of 10 networks, use
\begin{example}
for m = 1:10
    net = reinitweights_snn(net);
    [nets(m), tr_info(m), datasets(m)] = ...
                    train_bootstrap_snn(net, wcfdata)
end
\end{example}
The estimates of the networks in the ensemble can be combined in
several ways to give an averaged estimate. The function
\tbcmd{simff_avr_snn} computes a weighted average over the outputs of
the networks in an ensemble. The weights in the weighted average are
an input for this function.

For example, in \emph{bagging} we weigh all networks in the ensemble
equally. To compute the bagged estimate of an ensemble we thus use
\begin{example}
network_weights = 1/10 * ones(1,10);
y = simff_avr_snn(nets, network_weights, P)
\end{example}
To use a technique called \emph{balancing}, we compute the weight of
each network with \tbcmd{balance_snn}. This function returns an
estimate of the optimal network weights. To compute this estimate,
another bootstrap procedure is applied. The number of bootstrap
samples used to compute the estimate must be specified as an input to
this function. Also, this function needs information about the
specific training and validation sets the networks in the ensemble
were trained on. This information is returned by the functions
\tbcmd{train_bootstrap_snn} and \tbcmd{train_halfout_snn}.
For example,
\begin{example}
bootn = 100;
network_weights = balance_snn(nets, bootn, datasets)
\end{example}
Then, to compute a balanced estimate for each input pattern, we again
use \tbcmd{simff_avr_snn}. For example,
\begin{example}
y_av = simff_avr_snn(nets, network_weights, P)
\end{example}

\section{Estimate confidence intervals}
Confidence intervals can be calculated with \tbcmd{confidence_snn},
which makes an estimate of the lower and upper bound of the interval
based on the weighted averaged error and
$c_{\mbox{\scriptsize confidence}}$. The value of
$c_{\mbox{\scriptsize confidence}}$ depends on the desired confidence
level $1 - \alpha$, where $\alpha$ is the error level, and can be
calculated with \tbcmd{c_confidence_snn}. For example,
\begin{example}
error_level = 0.33         % standard error
c_confidence = c_confidence_snn(...
    error_level, nets, wcfdata, network_weights)
[ylc, yuc, y_av] = ...
    confidence_snn(c_confidence, nets, network_weights, P)
\end{example}

\section{Estimate prediction intervals}
Prediction intervals are calculated with the command
\tbcmd{prediction_snn}, which makes an estimate of the lower and
upper bound of the interval based on
$c_{\mbox{\scriptsize prediction}}$ and a (new) network which gives a
prediction of the output noise. This network and
$c_{\mbox{\scriptsize prediction}}$ are calculated with
\tbcmd{c_prediction_snn}. For example,
\begin{example}
[c_prediction, noise_net, tr_info] = ...
   c_prediction_snn(error_level, nets, datasets, network_weights)
[ylp, yup, y_av] = ...
   prediction_snn(c_prediction, noise_net, nets, network_weights, P)
\end{example}

\section{Input relevance}
To determine the order of relevance of the inputs, you use
\tbcmd{input_relevance_snn}. This function takes a network and
training data and returns the explained variance remaining after the
removal of inputs, together with the inputs in their order of
removal. For example,
\begin{example}
[explained_variance, input_indices] = ...
    input_relevance_snn(net, wcfdata)
\end{example}

\bibliography{/home/snn/tom/tex/referenties}
\bibliographystyle{unsrt}
%---------------------------------------------------------------------
\end{document}
