sample_mdp.m

From "approximate reinforcement learning" · M code · 20 lines


function [xplus, rplus] = sample_mdp(m, x, u)
%  Implements the discrete-time dynamics of the Markov decision process.
%  [XPLUS, RPLUS] = SAMPLE_MDP(M, X, U)
%  Parameters:
%   M   - the model specification. Typically contains the fields (all structures)
%           phys - physical parameters
%           disc - discretization configuration
%           goal - goal configuration
%       but the actual structure may depend on the particular MDP.
%   X   - current state, x(k)
%   U   - command u(k)
%  Returns:
%   XPLUS       - state at next sample, x(k+1)
%   RPLUS       - ensuing reward, r(k+1)

% compute here the next state and reward
xplus = 0 * m.phys.a * x + 0 * u;
rplus = 0;

% END sample_mdp() RETURNING xplus, rplus ===============================================

sample_mdp.m - Source description

This page shows the sample_mdp.m source file from "approximate reinforcement learning". The file is written in the M (MATLAB) language and is 20 lines long.
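The file is a template: its body multiplies everything by zero, so the next state and reward are always 0 until real dynamics are filled in. As a rough illustration of how the placeholder lines might be completed, the sketch below assumes a scalar linear system with a quadratic penalty. Only m.phys.a appears in the original file; the function name example_linear_mdp, the fields m.phys.b, m.goal.q and m.goal.r, and the reward form are all hypothetical choices made for this example, not part of the source package.

```matlab
% Hypothetical model: only m.phys.a appears in the template. The fields
% m.phys.b, m.goal.q and m.goal.r are assumptions made for this sketch.
m.phys.a = 0.9;   % state coefficient
m.phys.b = 0.5;   % input coefficient (assumed field)
m.goal.q = 1;     % weight on the squared next state (assumed field)
m.goal.r = 0.1;   % weight on the squared command (assumed field)

x = 1;            % current state x(k), scalar in this sketch
u = 0.2;          % command u(k)
[xplus, rplus] = example_linear_mdp(m, x, u);
fprintf('xplus = %g, rplus = %g\n', xplus, rplus);

function [xplus, rplus] = example_linear_mdp(m, x, u)
% Same signature as sample_mdp, with the placeholder lines filled in.
% Next state from scalar linear dynamics x(k+1) = a*x(k) + b*u(k)
xplus = m.phys.a * x + m.phys.b * u;
% Reward is the negated quadratic cost on the next state and the command
rplus = -(m.goal.q * xplus^2 + m.goal.r * u^2);
end
```

Saved as a single script file, this runs in MATLAB releases that allow local functions at the end of a script (R2016b and later). For a vector state the squared terms would need to become quadratic forms, which this sketch does not attempt.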
