function [v, i, j, wdata] = tiling_update(xdata, wdata, tdim, x, v, replace, i)
% TILING_UPDATE Updates tile weights corresponding to input x with value v.
%   [V, I, J, WDATA] = TILING_UPDATE(XDATA, WDATA, TDIM, X, V, REPLACE, I)
%
% This function supports several operating modes, selected by the configuration of the input and
% output arguments.
% 1. If wdata is required on the output, then it is updated and returned (1a). If wdata is not
% required on the output, v will contain a matrix of values with which wdata should be
% updated, i will contain the tile indices, and j the discrete indices which should be updated
% (1b).
% 2. If x contains a discrete vector (2a), then only the tile weights corresponding to that vector
% are updated (returned for updating in v if mode 1b above). If x does not contain a discrete
% vector (2b), then the tile weights of all the discrete combinations are updated (returned for
% updating in v if mode 1b above). In mode 2b, input v must contain values to update for all
% discrete combinations, either in matrix or flat vector format.
%
% In modes 1b,2*, the caller is then responsible for performing the actual update.
% This should be done as follows:
%   wdata(i, j) = v
%
% Modes 1b,2* are more efficient than 1a,2*, since copies of the typically (very) large array wdata 
% need not be made.
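%
% Example (a sketch of mode 1b; assumes xdata, wdata and tdim were already created elsewhere,
% and that x and v are valid for them):
%   [v, i, j] = tiling_update(xdata, wdata, tdim, x, v);   % wdata is not returned or modified
%   wdata(i, j) = v;                                       % the caller applies the update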
%
% Parameters:
%   xdata       - tile grid data, cell array of length d
%   wdata       - tiling weight data
%   tdim        - tiling dimensions structure
%   x           - the input vector. It can contain only the continuous components, in
%       which case the values for all the discrete inputs are returned in an array. It can
%       also contain the complete input, in which case the value for the given discrete input
%       is returned as a scalar.
%   v           - the value with which the tiling should be updated
%       (or array of values for all discrete combinations, depending on x)
%   replace     - if the new values should replace the stored values instead of being added to
%       them. Optional, default is 0, i.e., the new value should be added to the already
%       existing values.
%   i           - (optional) indices of tiles fired by the input.
%       If indices are already available, supply them for speed. If these indices are
%       supplied, the values of the continuous inputs are ignored.
% Returns:
%   v           - the updated value of the input in the tiling, or an array of updated values
%       for the fired tiles, depending on whether wdata is requested as an output or not. If
%       wdata is requested, a properly sized v array is returned, following the sizes of the
%       discrete inputs.
%   i           - indices of the tiles fired by the input
%   j           - indices of discrete inputs that were/are to be updated. 1 if no discrete
%       inputs exist.
%   wdata       - (optional) the updated weight data
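%
% Example (a sketch of mode 1a; assumes xdata, wdata and tdim were already created elsewhere,
% and v2 is a hypothetical second update value):
%   [v, i, j, wdata] = tiling_update(xdata, wdata, tdim, x, v);   % adds v, returns new wdata
%   % reuse the tile indices i at the same x; replace = 1 overwrites instead of adding
%   [v, i, j, wdata] = tiling_update(xdata, wdata, tdim, x, v2, 1, i);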

% ==== Process arguments, compute indices if not given =====

if nargin < 7,
    % we don't have tile indices
    % construct cdim+1 dimensional indices for the tiles in the c tilings
    ii = zeros(tdim.c, tdim.cdim + 1);
    ii(:, 1) = 1:tdim.c;       % 1:c on the first column
    
    % for each input component,
    for di = 1:tdim.cdim,        
        % compute the indices of fired tiles in its tilings
        ii(:, di+1) = sum(xdata{di} <= x(di), 2);
    end;
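    % Illustration (assumes each row of xdata{di} lists the sorted tile boundaries of one
    % tiling): for a row [0 1 2 3] and x(di) = 1.5, sum([0 1 2 3] <= 1.5, 2) gives 2,
    % i.e., the input fires the second tile of that tiling along dimension di.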
    % to flat
    i = ndi2lin(ii, tdim.tsize);
end;


% ==== Do the actual update =====

if isempty(tdim.dsize),              % dummy discrete index if no discrete components
    j = 1;
else
    xd = x(tdim.cdim + 1:end);       % discrete inputs (identified with indices), if any
    if isempty(xd),                  % no discrete inputs -- need to update matrix of discrete combinations
        v = reshape(v, 1, []);          % v must be a row vector
        j = 1:prod(tdim.dsize);
    else
        j = ndi2lin(xd, tdim.dsize);
    end;
end;

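% distribute update value over tiles: each of the c fired tiles receives the full v
v = repmat(v, tdim.c, 1);
% alternative that splits v evenly across the c tiles (disabled):
% v = repmat(v / tdim.c, tdim.c, 1);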
% add the old values if not replacing (replace = 0 or not specified)
if nargin < 6 || ~replace,
    v = wdata(i, j) + v;
end;

if nargout >= 4,        % we need to alter wdata itself
    wdata(i, j) = v;
    v = sum(v);
    % reshape in nice size if we return all discrete combinations, and there is more than one
    % discrete dimension
    if length(j) > 1 && tdim.ddim > 1,
        v = reshape(v, tdim.dsize);
    end;
end;

% otherwise, we just return the computed v

% === END tiling_update, RETURNING v, i, j and possibly wdata =================================
