
📄 value_iteration.m

📁 Algorithms implementing the Markov decision process (MDP) model
function [V, Q, iter] = value_iteration(T, R, discount_factor, oldV)
% Solve Bellman's equation iteratively.
% [V, Q, iter] = value_iteration(T, R, discount_factor, oldV)
% oldV is an optional starting point.

if nargin < 4
  % Default initial value: the best immediate reward in each state.
  oldV = max(R, [], 2);
end

% We stop iterating if max |V(i) - oldV(i)| < thresh.
% This will yield a policy loss of no more than 2eg/(1-g),
% where e=thresh and g=discount_factor.
thresh = 1e-4;

done = 0;
iter = 0;  % counts the Bellman backups actually performed
while ~done
  iter = iter + 1;
  Q = Q_from_V(oldV, T, R, discount_factor);  % one-step lookahead from oldV
  V = max(Q, [], 2);                          % greedy value in each state
  if approxeq(V, oldV, thresh), done = 1; end
  oldV = V;
end
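The function above calls two helpers, Q_from_V and approxeq, that are not shown on this page. The block below is a minimal sketch of what they might look like, not the original author's code. It assumes T is an S-by-A-by-S array with T(s,a,s') = P(s' | s, a) and R is an S-by-A reward matrix; the layout of T in particular is a guess, since the listing only shows that R has one row per state.

```matlab
% --- save as Q_from_V.m (hypothetical sketch of the missing helper) ---
function Q = Q_from_V(V, T, R, discount_factor)
% Q(s,a) = R(s,a) + discount_factor * sum_s' T(s,a,s') * V(s')
% ASSUMPTION: T is S-by-A-by-S, R is S-by-A, V has S elements.
[S, A] = size(R);
Q = zeros(S, A);
for a = 1:A
  Ta = reshape(T(:, a, :), S, S);  % S-by-S transition matrix for action a
  Q(:, a) = R(:, a) + discount_factor * Ta * V(:);
end

% --- save as approxeq.m (hypothetical sketch of the missing helper) ---
function tf = approxeq(a, b, tol)
% True if the largest elementwise difference is below tol,
% matching the stopping rule max |V(i) - oldV(i)| < thresh.
tf = max(abs(a(:) - b(:))) < tol;
```

Each function needs its own file on the MATLAB path so that value_iteration can find it. With both in place, a call such as `[V, Q] = value_iteration(T, R, 0.9);` followed by `[~, policy] = max(Q, [], 2);` would read a greedy policy off the returned Q. For that illustrative discount factor of 0.9 and thresh = 1e-4, the bound quoted in the comments works out to 2 * 1e-4 * 0.9 / (1 - 0.9) = 1.8e-3.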