📄 runfunction.txt
void run(MainParameters& mainP, StateActionFA* safa, Agent* agent, bool saveFA=true)
Synopsis: #include "interface_classes.h"
#include "main_init.h"
void run(MainParameters& mainP, StateActionFA* safa, Agent* agent, bool saveFA=true);
Link: learning_run.cpp
sarepr.cpp
environment.cpp
random_numbers.cpp
agent.cpp
safa.cpp
Functionality:
This function implements a learning run in an RL system. It produces a file named r_n.hst containing the learning history, where n is the run number specified in the mainP.run data member. The file is stored in the directory specified in the mainP.dir data member. The columns of the r_n.hst file have the following meaning:
Trial#
Return
Average return per step (Return/Length of the trial)
Maximum parameter change (since previous time of record) in approximator for action 0
Number of parameters affected by learning (since the beginning of learning) for action 0
... the last two columns repeat for each action.
At the end of the run, if saveFA==true, the parameters of the function approximator for each action are saved to a text file. The files are named r_n.am, where m is the action's ordinal number in the action set.
The run function expects a text file of test start states to be available (the name of this file is specified in the mainP.TestStatesFile data member). In that file, each test state occupies a separate line, with blank spaces separating the values of the state variables. The function loads these test states and uses them as start states to evaluate policies during learning; the value of a policy is then the average of the values of these states.
Arguments:
mainP
contains several parameters, such as the number of trials, the frequency of policy evaluations, etc. See the documentation for the MainParameters structure for detailed information. Data members of this structure can be assigned values in a caller function by calling the MainParameters::process() function, which processes the command-line arguments passed to main().
safa
Pointer to the initialized function approximators (learning parameters, such as the learning step, already set) representing the action-value functions, or a randomized policy, for all actions.
agent
Pointer to the initialized (learning parameters set) Agent object.
saveFA
Optional argument; indicates whether the function approximators' parameters should be saved to text files at the end of the run. If omitted, it defaults to true.