📄 value_determination.m
function V = value_determination(p, T, R, discount_factor)
% VALUE_DETERMINATION Solve Bellman's equation for a fixed policy
% V = value_determination(p, T, R, discount_factor)

S = size(T,1);
A = size(T,2);

% Extract the part of T and R which is specific to this policy
Tp = zeros(S,S); % Tp(s,s') = T(s, p(s), s')
Rp = zeros(S,1); % Rp(s) = R(s, p(s))
for a=1:A % avoid looping over S
  ind = find(p==a); % the rows that use action a
  if ~isempty(ind)
    Tp(ind,:) = reshape(T(ind,a,:), length(ind), S);
    Rp(ind) = R(ind,a);
  end
end

% V = R + gTV  =>  (I-gT)V = R  =>  V = inv(I-gT)*R
V = (eye(S) - discount_factor*Tp) \ Rp;
%V = pinv(eye(S) - discount_factor*Tp) * Rp;
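The following is a minimal, standalone sketch of the same computation on a small made-up problem. It is not part of the original file. The 2-state, 2-action MDP, its transition and reward values, and the names `gamma` and `residual` are assumptions chosen only for illustration. It builds `Tp` and `Rp` with a plain loop over states (instead of the loop over actions used above), solves the linear system, and checks that the result satisfies V = Rp + gamma*Tp*V.

```matlab
% Hypothetical 2-state, 2-action MDP (all numbers are illustrative assumptions)
S = 2; A = 2;
T = zeros(S, A, S);          % T(s,a,s') = probability of s' given state s, action a
T(1,1,:) = [0.9 0.1];
T(1,2,:) = [0.2 0.8];
T(2,1,:) = [0.5 0.5];
T(2,2,:) = [0.0 1.0];
R = [1 0; 0 2];              % R(s,a) = reward for taking action a in state s
p = [1; 2];                  % fixed policy: action 1 in state 1, action 2 in state 2
gamma = 0.9;                 % discount factor

% Policy-specific transition matrix and reward vector, built state by state
Tp = zeros(S, S);
Rp = zeros(S, 1);
for s = 1:S
  Tp(s,:) = reshape(T(s, p(s), :), 1, S);
  Rp(s) = R(s, p(s));
end

% Solve (I - gamma*Tp) V = Rp
V = (eye(S) - gamma*Tp) \ Rp;

% Bellman residual: should be zero up to rounding error
residual = max(abs(V - (Rp + gamma*Tp*V)));
```

If `value_determination.m` is on the MATLAB path, calling `value_determination(p, T, R, gamma)` with these inputs should give the same `V`. The loop over actions in the function is a vectorization choice: it does `A` iterations rather than `S`, which helps when there are many more states than actions.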