
sample_problem.m

Collection: approximate reinforcement learning
Language: MATLAB
function out = sample_problem(mode)
% Template for an approximate RL problem setup function.
%   OUT = SAMPLE_PROBLEM(MODE)
%
% Parameters:
%   mode        - specifies what information about the problem setup should be returned
% Returns:
%   OUT         - depending on the value of mode, this should be as follows:
%       - MODE = 'model', the model structure. This mode is mandatory for every problem.
%         Mandatory fields:
%           fun     - the MDP function (dynamics and rewards) of the process
%           plotfun - is called after each replay with the history, if specified
%           Ts      - discretization step
%         Typically, also contains the fields:
%           phys    - physical parameters
%           disc    - discretization configuration
%           goal    - goal configuration
%       - MODE = 'fuzzy', a structure containing settings for fuzzy Q-iteration. Required fields:
%           xgrids  - cell array of grids for state quantization
%           ugrids  - cell array of grids for action quantization
%       - MODE = 'tiling', a structure containing the configuration for tile-coding
%                   approximation. Required fields:
%           xgrids  - cell array of grids for state quantization
%           ugrids  - cell array of grids for action quantization
%           tcfg    - with fields as required by tilingqi() -- check that function for details
%
% Note: the code below is not 'working' but is provided for illustrative purposes only

switch mode

    % The 'model' mode is required
    case 'model'

        % these are required
        model.Ts = 0.1;                     % sample time
        model.fun = 'sample_mdp';           % MDP function
        % these are optional
        model.plotfun = 'sample_plot';      % if you are defining a custom plot function

        % from here on, define parameters as required by your model
        % various physical parameters in the dynamics
        model.phys.a = 10;
        % discretization
        model.disc.fun = 'ode45';
        % reward config
        model.goal.region = 1;

        out = model;

    % The 'fuzzy' mode is optional, but fuzzy Q-iteration won't work without it
    case 'fuzzy'

        % these are required
        cfg.xgrids = {-10:10, -5:5};        % quantization grids for the states
        cfg.ugrids = {-1:1};                % and for the actions
        % also specify here any values in the config that should override the default
        % fuzzy learning config
        % e.g., Q-iteration parameters
        cfg.maxiter = 1000;
        cfg.gamma = 0.9;
        cfg.eps = 1e-3;
        % e.g., initial state and end time for replay
        cfg.x0 = [1 0]';
        cfg.tend = 10;

        out = cfg;

    % The 'tiling' mode is optional, but tile-coding Q-iteration won't work without it
    case 'tiling'

        % these are required
        cfg.xgrids = {-10:10, -5:5};        % quantization grids for the states
        cfg.ugrids = {-1:1};                % and for the actions
        cfg.tcfg.c = 2;                     % number of tilings
        cfg.tcfg.init = 0;                  % how to init tile values
        cfg.tcfg.exact = 0;                 % whether one tiling should exactly fall onto the grid
        cfg.tcfg.delta = [0.5 1];           % tiling max displacements along the dimensions
        % also specify here any values in the config that should override the default
        % tile-coding learning config
        % e.g., Q-iteration parameters
        cfg.maxiter = 500;
        cfg.gamma = 0.9;
        cfg.eps = 1e-1;
        % e.g., initial state and end time for replay
        cfg.x0 = [1 0]';
        cfg.tend = 10;

        out = cfg;

end        % mode SWITCH

% END sample_problem() RETURNING out ===================================================
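
For orientation, a setup function in this style is typically queried once per mode by the learning code, which then reads fields off the returned structure. The sketch below is a minimal usage example under that assumption; the variable names and the final disp calls are illustrative only and not part of the template above.

    % Minimal usage sketch (assumption: the learning code calls the setup
    % function once per mode and works with the returned structures)
    model = sample_problem('model');    % model structure: Ts, fun, plotfun, phys, disc, goal
    fcfg  = sample_problem('fuzzy');    % fuzzy Q-iteration settings: xgrids, ugrids, maxiter, gamma, eps, x0, tend
    tcfg  = sample_problem('tiling');   % tile-coding settings, including the tcfg sub-structure

    disp(model.fun);                    % name of the MDP function, here 'sample_mdp'
    disp(fcfg.gamma);                   % discount factor used by fuzzy Q-iteration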
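The 'model' mode names an MDP function (here 'sample_mdp') that bundles the one-step dynamics and the reward. The sketch below shows one hypothetical shape such a function could take; the signature, the linear dynamics, and the goal-region reward are assumptions for illustration, since the real interface is fixed by the learning code that calls it, not by this template.

    % Hypothetical companion MDP function for the setup above (assumed signature)
    function [xplus, rplus] = sample_mdp(model, x, u)
    % One-step dynamics and reward for the sample problem.
    % Simple linear dynamics driven by the physical parameter model.phys.a,
    % integrated with a forward Euler step of length model.Ts (illustrative only).
    xplus = x + model.Ts * [x(2); -model.phys.a * x(1) + u];
    % Reward of 1 inside the goal region, 0 elsewhere, treating
    % model.goal.region as a radius around the origin (an assumption)
    rplus = double(norm(xplus) <= model.goal.region);
    end

The only point of the sketch is the division of labor: the dynamics and reward live in a single function that receives the model structure, so every physical and goal parameter defined in the 'model' mode is available to it.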
