palloc() their result space explicitly.  I recommend naming the GETARG and
RETURN macros for such types to end in "_P", as a reminder that they
produce or take a pointer.  For example, PG_GETARG_TEXT_P yields "text *".

When a function needs to access fcinfo->flinfo or one of the other auxiliary
fields of FunctionCallInfo, it should just do it.  I doubt that providing
syntactic-sugar macros for these cases is useful.


Call-site coding conventions
----------------------------

There are many places in the system that call either a specific function
(for example, the parser invokes "textin" by name in places) or a
particular group of functions that have a common argument list (for
example, the optimizer invokes selectivity estimation functions with
a fixed argument list).  These places will need to change, but we should
try to avoid making them significantly uglier than before.

Places that invoke an arbitrary function with an arbitrary argument list
can simply be changed to fill a FunctionCallInfoData structure directly;
that'll be no worse and possibly cleaner than what they do now.

When invoking a specific built-in function by name, we have generally
just written something like
    result = textin ( ... args ... )
which will not work after textin() is converted to the new call style.
I suggest that code like this be converted to use "helper" functions
that will create and fill in a FunctionCallInfoData struct.  For
example, if textin is being called with one argument, it'd look
something like
    result = DirectFunctionCall1(textin, PointerGetDatum(argument));
These helper routines will have declarations like
    Datum DirectFunctionCall2(PGFunction func, Datum arg1, Datum arg2);
Note it will be the caller's responsibility to convert to and from
Datum; appropriate conversion macros should be used.

The DirectFunctionCallN routines will not bother to fill in
fcinfo->flinfo (indeed cannot, since they have no idea about an OID for
the target function); they will just set it NULL.
This is unlikely to bother any built-in function that could be called
this way.  Note also that this style of coding cannot pass a NULL input
value nor cope with a NULL result (it couldn't before, either!).  We can
make the helper routines ereport an error if they see that the function
returns a NULL.

When invoking a function that has a known argument signature, we have
usually written either
    result = fmgr(targetfuncOid, ... args ... );
or
    result = fmgr_ptr(FmgrInfo *finfo, ... args ... );
depending on whether an FmgrInfo lookup has been done yet or not.
This kind of code can be recast using helper routines, in the same
style as above:
    result = OidFunctionCall1(funcOid, PointerGetDatum(argument));
    result = FunctionCall2(funcCallInfo,
                           PointerGetDatum(argument),
                           Int32GetDatum(argument));
Again, this style of coding does not allow for expressing NULL inputs
or receiving a NULL result.

As with the callee-side situation, I propose adding argument conversion
macros that hide the pass-by-reference nature of int8, float4, and
float8, with an eye to making those types relatively painless to convert
to pass-by-value.

The existing helper functions fmgr(), fmgr_c(), etc will be left in
place until all uses of them are gone.  Of course their internals will
have to change in the first step of implementation, but they can
continue to support the same external appearance.


Support for TOAST-able data types
---------------------------------

For TOAST-able data types, the PG_GETARG macro will deliver a de-TOASTed
data value.  There might be a few cases where the still-toasted value is
wanted, but the vast majority of cases want the de-toasted result, so
that will be the default.  To get the argument value without causing
de-toasting, use PG_GETARG_RAW_VARLENA_P(n).

Some functions require a modifiable copy of their input values.  In these
cases, it's silly to do an extra copy step if we copied the data anyway
to de-TOAST it.
Therefore, each toastable datatype has an additional fetch macro, for
example PG_GETARG_TEXT_P_COPY(n), which delivers a guaranteed-fresh copy,
combining this with the detoasting step if possible.

There is also a PG_FREE_IF_COPY(ptr,n) macro, which pfree's the given
pointer if and only if it is different from the original value of the n'th
argument.  This can be used to free the de-toasted value of the n'th
argument, if it was actually de-toasted.  Currently, doing this is not
necessary for the majority of functions because the core backend code
releases temporary space periodically, so that memory leaked in function
execution isn't a big problem.  However, as of 7.1 memory leaks in
functions that are called by index searches will not be cleaned up until
end of transaction.  Therefore, functions that are listed in pg_amop or
pg_amproc should be careful not to leak detoasted copies, and so these
functions do need to use PG_FREE_IF_COPY() for toastable inputs.

A function should never try to re-TOAST its result value; it should just
deliver an untoasted result that's been palloc'd in the current memory
context.  When and if the value is actually stored into a tuple, the
tuple toaster will decide whether toasting is needed.


Functions accepting or returning sets
-------------------------------------

[ this section revised 29-Aug-2002 for 7.3 ]

If a function is marked in pg_proc as returning a set, then it is called
with fcinfo->resultinfo pointing to a node of type ReturnSetInfo.  A
function that desires to return a set should raise an error "called in
context that does not accept a set result" if resultinfo is NULL or does
not point to a ReturnSetInfo node.

There are currently two modes in which a function can return a set result:
value-per-call, or materialize.  In value-per-call mode, the function returns
one value each time it is called, and finally reports "done" when it has no
more values to return.
In materialize mode, the function's output set is instantiated in a
Tuplestore object; all the values are returned in one call.
Additional modes might be added in future.

ReturnSetInfo contains a field "allowedModes" which is set (by the caller)
to a bitmask that's the OR of the modes the caller can support.  The actual
mode used by the function is returned in another field "returnMode".  For
backwards-compatibility reasons, returnMode is initialized to value-per-call
and need only be changed if the function wants to use a different mode.
The function should ereport() if it cannot use any of the modes the caller is
willing to support.

Value-per-call mode works like this: ReturnSetInfo contains a field
"isDone", which should be set to one of these values:

    ExprSingleResult             /* expression does not return a set */
    ExprMultipleResult           /* this result is an element of a set */
    ExprEndResult                /* there are no more elements in the set */

(the caller will initialize it to ExprSingleResult).  If the function simply
returns a Datum without touching ReturnSetInfo, then the call is over and a
single-item set has been returned.  To return a set, the function must set
isDone to ExprMultipleResult for each set element.  After all elements have
been returned, the next call should set isDone to ExprEndResult and return a
null result.  (Note it is possible to return an empty set by doing this on
the first call.)

The ReturnSetInfo node also contains a link to the ExprContext within which
the function is being evaluated.  This is useful for value-per-call functions
that need to close down internal state when they are not run to completion:
they can register a shutdown callback function in the ExprContext.

Materialize mode works like this: the function creates a Tuplestore holding
the (possibly empty) result set, and returns it.
There are no multiple calls.
The function must also return a TupleDesc that indicates the tuple structure.
The Tuplestore and TupleDesc should be created in the context
econtext->ecxt_per_query_memory (note this will *not* be the context the
function is called in).  The function stores pointers to the Tuplestore and
TupleDesc into ReturnSetInfo, sets returnMode to indicate materialize mode,
and returns null.  isDone is not used and should be left at ExprSingleResult.

If the function is being called as a table function (ie, it appears in a
FROM item), then the expected tuple descriptor is passed in ReturnSetInfo;
in other contexts the expectedDesc field will be NULL.  The function need
not pay attention to expectedDesc, but it may be useful in special cases.

There is no support for functions accepting sets; instead, the function will
be called multiple times, once for each element of the input set.


Notes about function handlers
-----------------------------

Handlers for classes of functions should find life much easier and
cleaner in this design.  The OID of the called function is directly
reachable from the passed parameters; we don't need the global variable
fmgr_pl_finfo anymore.  Also, by modifying fcinfo->flinfo->fn_extra,
the handler can cache lookup info to avoid repeat lookups when the same
function is invoked many times.  (fn_extra can only be used as a hint,
since callers are not required to re-use an FmgrInfo struct.
But in performance-critical paths they normally will do so.)

If the handler wants to allocate memory to hold fn_extra data, it should
NOT do so in CurrentMemoryContext, since the current context may well be
much shorter-lived than the context where the FmgrInfo is.  Instead,
allocate the memory in context flinfo->fn_mcxt, or in a long-lived cache
context.
fn_mcxt normally points at the context that was CurrentMemoryContext at
the time the FmgrInfo structure was created; in any case it is required
to be a context at least as long-lived as the FmgrInfo itself.


Telling the difference between old- and new-style functions
-----------------------------------------------------------

During the conversion process, we carried two different pg_language
entries, "internal" and "newinternal", for internal functions.  The
function manager used the language code to distinguish which calling
convention to use.  (Old-style internal functions were supported via
a function handler.)  As of Nov. 2000, no old-style internal functions
remain, so we can drop support for them.  We will remove the old "internal"
pg_language entry and rename "newinternal" to "internal".

The interim solution for dynamically-loaded compiled functions has been
similar: two pg_language entries "C" and "newC".  This naming convention
is not desirable for the long run, and yet we cannot stop supporting
old-style user functions.  Instead, it seems better to use just one
pg_language entry "C", and require the dynamically-loaded library to
provide additional information that identifies new-style functions.
This avoids compatibility problems --- for example, existing dump
scripts will identify PL language handlers as being in language "C",
which would be wrong under the "newC" convention.  Also, this approach
should generalize more conveniently for future extensions to the function
interface specification.

Given a dynamically loaded function named "foo" (note that the name being
considered here is the link-symbol name, not the SQL-level function name),
the function manager will look for another function in the same dynamically
loaded library named "pg_finfo_foo".  If this second function does not
exist, then foo is assumed to be called old-style, thus ensuring backwards
compatibility with existing libraries.
If the info function does exist, it is expected to have the signature

    Pg_finfo_record * pg_finfo_foo (void);

The info function will be called by the fmgr, and must return a pointer
to a Pg_finfo_record struct.  (The returned struct will typically be a
statically allocated constant in the dynamic-link library.)  The current
definition of the struct is just

    typedef struct {
        int api_version;
    } Pg_finfo_record;

where api_version is 0 to indicate old-style or 1 to indicate new-style
calling convention.  In future releases, additional fields may be defined
after api_version, but these additional fields will only be used if
api_version is greater than 1.

These details will be hidden from the author of a dynamically loaded
function by using a macro.  To define a new-style dynamically loaded
function named foo, write

    PG_FUNCTION_INFO_V1(foo);

    Datum
    foo(PG_FUNCTION_ARGS)
    {
        ...
    }

The function itself is written using the same conventions as for new-style
internal functions; you just need to add the PG_FUNCTION_INFO_V1() macro.
Note that old-style and new-style functions can be intermixed in the same
library, depending on whether or not you write a PG_FUNCTION_INFO_V1() for
each one.

The SQL declaration for a dynamically-loaded function is CREATE FUNCTION
foo ... LANGUAGE 'C' regardless of whether it is old- or new-style.

New-style dynamic functions will be invoked directly by fmgr, and will
therefore have the same performance as internal functions after the initial
pg_proc lookup overhead.  Old-style dynamic functions will be invoked via
a handler, and will therefore have a small performance penalty.

To allow old-style dynamic functions to work safely on toastable datatypes,
the handler for old-style functions will automatically detoast toastable
arguments before passing them to the old-style function.  A new-style
function is expected to take care of toasted arguments by using the
standard argument access macros defined above.
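
The "pg_finfo_foo" lookup rule from this section can be illustrated with a
small standalone model.  This is only a sketch, not the fmgr implementation:
it does not include any PostgreSQL headers, and ToySymbol, toy_library,
toy_lookup and toy_api_version are hypothetical names invented here.  A
small in-memory table stands in for the symbol lookup that the real
function manager would perform on the loaded library.  Only
Pg_finfo_record, api_version and the pg_finfo_ prefix come from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Same layout as the struct shown in the text. */
typedef struct
{
    int         api_version;    /* 0 = old-style, 1 = new-style */
} Pg_finfo_record;

/* Signature of an info function, as given in the text. */
typedef Pg_finfo_record *(*InfoFunc) (void);

/* Hypothetical stand-in for one entry of a loaded library's symbols. */
typedef struct
{
    const char *name;
    InfoFunc    func;
} ToySymbol;

/* Roughly what the text says the macro hides: a static record plus accessor. */
static Pg_finfo_record foo_info = {1};

static Pg_finfo_record *
pg_finfo_foo(void)
{
    return &foo_info;
}

/* The toy "library" exports an info function for foo only. */
static const ToySymbol toy_library[] = {
    {"pg_finfo_foo", pg_finfo_foo},
};

/* Hypothetical replacement for a dynamic-linker symbol lookup. */
static InfoFunc
toy_lookup(const char *symbol)
{
    size_t      i;

    for (i = 0; i < sizeof(toy_library) / sizeof(toy_library[0]); i++)
    {
        if (strcmp(toy_library[i].name, symbol) == 0)
            return toy_library[i].func;
    }
    return NULL;
}

/*
 * Decide the calling convention for link symbol "funcname".
 * Returns 0 (old-style) when no info function exists, otherwise the
 * api_version reported by the info function.  Returns -1 if the
 * derived symbol name does not fit the local buffer.
 */
int
toy_api_version(const char *funcname)
{
    char        symbol[128];
    InfoFunc    info;
    int         n;

    assert(funcname != NULL);

    n = snprintf(symbol, sizeof(symbol), "pg_finfo_%s", funcname);
    if (n < 0 || (size_t) n >= sizeof(symbol))
        return -1;

    info = toy_lookup(symbol);
    if (info == NULL)
        return 0;               /* no info function: assume old-style */

    return info()->api_version;
}
```

In this model, toy_api_version("foo") finds pg_finfo_foo and reports the
new-style convention, while a symbol with no info function, such as "bar",
falls back to old-style.  That fallback is the backwards-compatibility
behaviour described above.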