	virtual void getGradient(CStateCollection *state, CFeatureList *gradientFeatures) = 0;

	virtual void resetData() = 0;
	virtual void loadData(FILE *stream) {CGradientUpdateFunction::loadData(stream);};
	virtual void saveData(FILE *stream) {CGradientUpdateFunction::saveData(stream);};

	virtual CAbstractVETraces *getStandardETraces();

};


class CGradientDelayedUpdateVFunction : public CGradientVFunction, public CGradientDelayedUpdateFunction
{
protected:
	virtual void updateWeights(CFeatureList *dParams) {CGradientDelayedUpdateFunction::updateWeights(dParams);};

	CGradientVFunction *vFunction;
public:
	/// constructor, the properties are needed to fetch the state from the state collection.
	CGradientDelayedUpdateVFunction(CGradientVFunction *vFunction);
	virtual ~CGradientDelayedUpdateVFunction() {};

	virtual rlt_real getValue(CState *state);
	virtual void getGradient(CStateCollection *state, CFeatureList *gradientFeatures);

	virtual void resetData() {CGradientDelayedUpdateFunction::resetData();};

	///  Returns the number of weights.
	virtual int getNumWeights(){return CGradientDelayedUpdateFunction::getNumWeights();};

	virtual void getWeights(rlt_real *parameters) {CGradientDelayedUpdateFunction::getWeights(parameters);};
	virtual void setWeights(rlt_real *parameters) {CGradientDelayedUpdateFunction::setWeights(parameters);};

	virtual void loadData(FILE *stream) {CGradientVFunction::loadData(stream);};
	virtual void saveData(FILE *stream) {CGradientVFunction::saveData(stream);};

};

/// Interface class for calculating the gradient dV(x)/dx
/** 
Interface for calculating the input derivative of a V-function. The derivative is calculated in the function getInputDerivation and written to the given targetVector, which always has the dimension of the model state (continuous state variables only).
\par
So far there is only the numeric input derivative calculator. Calculating the derivative analytically is supported by feature V-functions and Torch V-functions, but it is not tested, so using the numeric derivative is recommended.
*/
class CVFunctionInputDerivationCalculator : virtual public CParameterObject
{
protected:
	CStateProperties *modelState;
public:
	CVFunctionInputDerivationCalculator(CStateProperties *modelState);

	virtual void getInputDerivation( CStateCollection *state, CMyVector *targetVector) = 0;
	int getNumInputs();
};


/// Calculates the input derivative of a V-function numerically
/** 
The derivative is calculated with the three-point (central difference) rule for each input state variable, so the formula
$f'(x) = (f(x + stepSize) - f(x - stepSize)) / (2 * stepSize)$ is used. stepSize is set in the constructor and can also be set through the parameter "NumericInputDerivationStepSize". For each input state variable the step size is scaled by the size of the interval of that state variable, so the "NumericInputDerivationStepSize" parameter is relative to the interval size (given in percent), not an absolute value.
\par
The class "CVFunctionNumericInputDerivationCalculator" has the following parameters:
- "NumericInputDerivationStepSize": step size of the numeric differentiation.
*/
class CVFunctionNumericInputDerivationCalculator : public CVFunctionInputDerivationCalculator
{
protected:
	CAbstractVFunction *vFunction;
	CStateCollectionImpl *stateBuffer;
public:
	CVFunctionNumericInputDerivationCalculator(CStateProperties *modelState, CAbstractVFunction *vFunction, rlt_real stepSize, std::list<CStateModifier *> *modifiers);
	virtual ~CVFunctionNumericInputDerivationCalculator();

	virtual void getInputDerivation( CStateCollection *state, CMyVector *targetVector);
};



/**
Feature functions can be used as linear approximators such as tile coding and RBF networks; the exact usage of a feature function depends on the feature state it uses.
Feature states are a very common way to discretize continuous states. A feature consists of its index and a feature activation factor. All these factors should sum to 1.
<p> 
Feature value functions are modeled by the class CFeatureVFunction. It stores one value per feature, so it is just a table of features; 
the only difference from the tabular case is how the values are calculated (the sum of feature value * feature factor). A CFeatureVFunction is supposed to
get a feature calculator or discretizer as its state properties object. With this state properties object the value function is able to retrieve its feature state from the state collections. 
CFeatureVFunction inherits all access functions for the features from CFeatureFunction. Additionally it implements the 
functions for setting and getting the values with states. These functions decompose the feature state into its discrete states, call
the corresponding feature-level functions (which take integer feature indices instead of states) and multiply the values by the feature factors.
<p>
To create a feature value function you have to pass a state properties object to the constructor. The number of features for the function is calculated from the discrete state size of the 
properties object. The properties object is also passed to the super class CAbstractVFunction and is used to retrieve the 
wanted state from the state collection. 
An additional constructor is available, which can be used to calculate the value function
from a stochastic policy given a Q-function. The values of the features are calculated as the expectation of the Q-values under the action probabilities,
i.e. the sum of each action probability multiplied by the corresponding Q-value. 
<p>
The standard VETraces objects of feature value functions are CFeatureVETraces, which store the features in a feature list. 

*/
class CFeatureVFunction : public CGradientVFunction, public CFeatureFunction
{
protected:


public:
	/// Creates a feature v-function with the specific feature state.
/**The number of features for the function is calculated from the discrete state size of the 
properties object. The properties object is also passed to the super class CAbstractVFunction and is used to retrieve the 
wanted state from the state collection. The properties are supposed to come from a feature state or a discrete state.*/

	CFeatureVFunction(CStateProperties *featureFact);

/**Can be used to calculate the value function
from a stochastic policy given a Q-function. The values of the features are calculated as the expectation of the Q-values under the action probabilities,
i.e. the sum of each action probability multiplied by the corresponding Q-value. The constructor calls setVFunctionFromQFunction to do this. The state properties, and therefore the number of features, are taken from the Q-function.
*/
	CFeatureVFunction(CFeatureQFunction *qfunction, CStochasticPolicy *policy);
	
	~CFeatureVFunction();
/**Can be used to calculate the value function
from a stochastic policy given a Q-function. The values of the features are calculated as the expectation of the Q-values under the action probabilities,
i.e. the sum of each action probability multiplied by the corresponding Q-value.*/
	virtual void setVFunctionFromQFunction(CFeatureQFunction *qfunction, CStochasticPolicy *policy);

	virtual void updateWeights(CFeatureList *gradientFeatures);


/// Updates the value function given a feature or discrete state
/** Decomposes the feature state into its discrete state variables and adds the "td" value to the values of the features.
Each update is multiplied by the corresponding feature factor.
*/
	virtual void updateValue(CState *state, rlt_real td);
/// Sets the value given a feature or discrete state
/** Decomposes the feature state into its discrete state variables and sets the values of the features to the "qValue" value.
Each value is multiplied by the corresponding feature factor.
*/
	virtual void setValue(CState *state, rlt_real qValue);
/// Returns the value given a feature or discrete state
/** Decomposes the feature state into its discrete state variables and calculates the value by summing up the feature values
of the active features, each multiplied by its factor.
*/
	virtual rlt_real getValue(CState *state);

/// Calls the saveFeatures function of CFeatureFunction
	virtual void saveData(FILE *file);
/// Calls the loadFeatures function of CFeatureFunction
	virtual void loadData(FILE *file);
	virtual void printValues();

/// Returns a new CFeatureVETraces object
	virtual CAbstractVETraces *getStandardETraces();

	virtual void getGradient(CStateCollection *state, CFeatureList *gradientFeatures);


	virtual int getNumWeights();

	virtual void resetData();


	virtual void getWeights(rlt_real *parameters);
	virtual void setWeights(rlt_real *parameters);

};

class CFeatureVFunctionInputDerivationCalculator : public CVFunctionInputDerivationCalculator
{
protected:
	CFeatureVFunction *vFunction;
	CMyVector *featureInputDerivation;
	CMyVector *fiDerivationSum;

public:
	bool normalizedFeatures; 

	CFeatureVFunctionInputDerivationCalculator(CStateProperties *inputState, CFeatureVFunction *vFunction);
	~CFeatureVFunctionInputDerivationCalculator();

	virtual void getInputDerivation( CStateCollection *state, CMyVector *targetVector);
};


/// Value function as a table
/**
Tables work just like feature functions. The only difference is the kind of state used: a value table uses discrete states. The class CVTable represents tabular value functions 
and is a subclass of CFeatureVFunction. Value tables can only be used with a CAbstractStateDiscretizer.
*/
class CVTable : public CFeatureVFunction
{
public:
	CVTable(CAbstractStateDiscretizer *state);
	
	~CVTable();
		
	void setDiscretizer(CAbstractStateDiscretizer *discretizer);
	CAbstractStateDiscretizer *getDiscretizer();

	int getNumStates();
};


#endif
