
sample_problem.m

Collection: approximate reinforcement learning
Language: MATLAB
function out = sample_problem(mode)
% Template for an approximate RL problem setup function.
%   OUT = SAMPLE_PROBLEM(MODE)
%
% Parameters:
%   mode        - specifies what information about the problem setup should be returned
% Returns:
%   OUT         - depending on the value of mode, this should be as follows:
%       - MODE = 'model', the model structure. This mode is mandatory for every problem.
%         Mandatory fields:
%           fun     - the MDP function (dynamics and rewards) of the process
%           plotfun - is called after each replay with the history, if specified
%           Ts      - discretization step
%         Typically, also contains the fields:
%           phys    - physical parameters
%           disc    - discretization configuration
%           goal    - goal configuration
%       - MODE = 'fuzzy', a structure containing settings for fuzzy Q-iteration. Required fields:
%           xgrids  - cell array of grids for state quantization
%           ugrids  - cell array of grids for action quantization
%       - MODE = 'tiling', a structure containing the configuration for tile-coding
%                   approximation. Required fields:
%           xgrids  - cell array of grids for state quantization
%           ugrids  - cell array of grids for action quantization
%           tcfg    - with fields as required by tilingqi() -- check that function for details
%
% Note: the code below is not 'working' but is provided for illustrative purposes only

switch mode

    % The 'model' mode is required
    case 'model'

        % these are required
        model.Ts = 0.1;                     % sample time
        model.fun = 'sample_mdp';           % MDP function
        % these are optional
        model.plotfun = 'sample_plot';      % if you are defining a custom plot function

        % from here on, define parameters as required by your model
        % various physical parameters in the dynamics
        model.phys.a = 10;
        % discretization
        model.disc.fun = 'ode45';
        % reward config
        model.goal.region = 1;

        out = model;

    % The 'fuzzy' mode is optional, but fuzzy Q-iteration won't work without it
    case 'fuzzy'

        % these are required
        cfg.xgrids = {-10:10, -5:5};        % quantization grids for the states
        cfg.ugrids = {-1:1};                % and for the actions
        % also specify here any values in the config that should override the default
        % fuzzy learning config
        % e.g., Q-iteration parameters
        cfg.maxiter = 1000;
        cfg.gamma = 0.9;
        cfg.eps = 1e-3;
        % e.g., initial state and end time for replay
        cfg.x0 = [1 0]';
        cfg.tend = 10;

        out = cfg;

    % The 'tiling' mode is optional, but tile-coding Q-iteration won't work without it
    case 'tiling'

        % these are required
        cfg.xgrids = {-10:10, -5:5};        % quantization grids for the states
        cfg.ugrids = {-1:1};                % and for the actions
        cfg.tcfg.c = 2;                     % number of tilings
        cfg.tcfg.init = 0;                  % how to init tile values
        cfg.tcfg.exact = 0;                 % whether one tiling should exactly fall onto the grid
        cfg.tcfg.delta = [0.5 1];           % tiling max displacements along the dimensions
        % also specify here any values in the config that should override the default
        % tile-coding learning config
        % e.g., Q-iteration parameters
        cfg.maxiter = 500;
        cfg.gamma = 0.9;
        cfg.eps = 1e-1;
        % e.g., initial state and end time for replay
        cfg.x0 = [1 0]';
        cfg.tend = 10;

        out = cfg;

end        % mode SWITCH

% END sample_problem() RETURNING out ===================================================
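
For orientation, a setup function in this style is typically queried once per mode by the learning code, which then reads fields off the returned structure. The sketch below is a minimal usage example under that assumption; the variable names and the final disp calls are illustrative only and not part of the template above.

    % Minimal usage sketch (assumption: the learning code calls the setup
    % function once per mode and works with the returned structures)
    model = sample_problem('model');    % model structure: Ts, fun, plotfun, phys, disc, goal
    fcfg  = sample_problem('fuzzy');    % fuzzy Q-iteration settings: xgrids, ugrids, maxiter, gamma, eps, x0, tend
    tcfg  = sample_problem('tiling');   % tile-coding settings, including the tcfg sub-structure

    disp(model.fun);                    % name of the MDP function, here 'sample_mdp'
    disp(fcfg.gamma);                   % discount factor used by fuzzy Q-iteration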
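The 'model' mode names an MDP function (here 'sample_mdp') that bundles the one-step dynamics and the reward. The sketch below shows one hypothetical shape such a function could take; the signature, the linear dynamics, and the goal-region reward are assumptions for illustration, since the real interface is fixed by the learning code that calls it, not by this template.

    % Hypothetical companion MDP function for the setup above (assumed signature)
    function [xplus, rplus] = sample_mdp(model, x, u)
    % One-step dynamics and reward for the sample problem.
    % Simple linear dynamics driven by the physical parameter model.phys.a,
    % integrated with a forward Euler step of length model.Ts (illustrative only).
    xplus = x + model.Ts * [x(2); -model.phys.a * x(1) + u];
    % Reward of 1 inside the goal region, 0 elsewhere, treating
    % model.goal.region as a radius around the origin (an assumption)
    rplus = double(norm(xplus) <= model.goal.region);
    end

The only point of the sketch is the division of labor: the dynamics and reward live in a single function that receives the model structure, so every physical and goal parameter defined in the 'model' mode is available to it.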
