📄 xfunc.sgml
    return result;
}

Point *
makepoint(Point *pointx, Point *pointy)
{
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    return new_point;
}

/* by reference, variable length */

text *
copytext(text *t)
{
    /*
     * VARSIZE is the total size of the struct in bytes.
     */
    text *new_t = (text *) palloc(VARSIZE(t));
    VARATT_SIZEP(new_t) = VARSIZE(t);
    /*
     * VARDATA is a pointer to the data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t)-VARHDRSZ);    /* how many bytes */
    return new_t;
}

text *
concat_text(text *arg1, text *arg2)
{
    int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    VARATT_SIZEP(new_text) = new_text_size;
    memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
    return new_text;
}
</programlisting>
    </para>

    <para>
     Supposing that the above code has been prepared in file
     <filename>funcs.c</filename> and compiled into a shared object,
     we could define the functions to <productname>PostgreSQL</productname>
     with commands like this:

<programlisting>
CREATE FUNCTION add_one(integer) RETURNS integer
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one'
     LANGUAGE C STRICT;

-- note overloading of SQL function name "add_one"
CREATE FUNCTION add_one(double precision) RETURNS double precision
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'add_one_float8'
     LANGUAGE C STRICT;

CREATE FUNCTION makepoint(point, point) RETURNS point
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'makepoint'
     LANGUAGE C STRICT;

CREATE FUNCTION copytext(text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'copytext'
     LANGUAGE C STRICT;

CREATE FUNCTION concat_text(text, text) RETURNS text
     AS '<replaceable>DIRECTORY</replaceable>/funcs', 'concat_text'
     LANGUAGE C STRICT;
</programlisting>
    </para>

    <para>
     Here,
     <replaceable>DIRECTORY</replaceable> stands for the
     directory of the shared library file (for instance the
     <productname>PostgreSQL</productname> tutorial directory, which
     contains the code for the examples used in this section).
     (Better style would be to use just <literal>'funcs'</> in the
     <literal>AS</> clause, after having added
     <replaceable>DIRECTORY</replaceable> to the search path.  In any
     case, we may omit the system-specific extension for a shared
     library, commonly <literal>.so</literal> or
     <literal>.sl</literal>.)
    </para>

    <para>
     Notice that we have specified the functions as <quote>strict</quote>,
     meaning that the system should automatically assume a null result
     if any input value is null.  By doing this, we avoid having to check
     for null inputs in the function code.  Without this, we'd have to
     check for null values explicitly, by checking for a null pointer
     for each pass-by-reference argument.  (For pass-by-value arguments,
     we don't even have a way to check!)
    </para>

    <para>
     Although this calling convention is simple to use,
     it is not very portable; on some architectures there are problems
     with passing data types that are smaller than <type>int</type>
     this way.  Also, there is no simple way to return a null result,
     nor to cope with null arguments in any way other than making the
     function strict.  The version-1 convention, presented next,
     overcomes these objections.
    </para>
   </sect2>

   <sect2>
    <title>Calling Conventions Version 1 for C-Language Functions</title>

    <para>
     The version-1 calling convention relies on macros to suppress most
     of the complexity of passing arguments and results.  The C declaration
     of a version-1 function is always
<programlisting>
Datum funcname(PG_FUNCTION_ARGS)
</programlisting>
     In addition, the macro call
<programlisting>
PG_FUNCTION_INFO_V1(funcname);
</programlisting>
     must appear in the same source file.  (Conventionally, it's
     written just before the function itself.)
     This macro call is not needed
     for <literal>internal</>-language functions, since
     <productname>PostgreSQL</> assumes that all internal functions
     use the version-1 convention.  It is, however, required for
     dynamically-loaded functions.
    </para>

    <para>
     In a version-1 function, each actual argument is fetched using a
     <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
     macro that corresponds to the argument's data type, and the
     result is returned using a
     <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
     macro for the return type.
     <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
     takes as its argument the number of the function argument to
     fetch, where the count starts at 0.
     <function>PG_RETURN_<replaceable>xxx</replaceable>()</function>
     takes as its argument the actual value to return.
    </para>

    <para>
     Here we show the same functions as above, coded in version-1 style:

<programlisting>
#include "postgres.h"
#include &lt;string.h&gt;
#include "fmgr.h"
#include "utils/geo_decls.h"    /* declares Point and the POINT_P macros */

/* by value */

PG_FUNCTION_INFO_V1(add_one);

Datum
add_one(PG_FUNCTION_ARGS)
{
    int32   arg = PG_GETARG_INT32(0);

    PG_RETURN_INT32(arg + 1);
}

/* by reference, fixed length */

PG_FUNCTION_INFO_V1(add_one_float8);

Datum
add_one_float8(PG_FUNCTION_ARGS)
{
    /* The macros for FLOAT8 hide its pass-by-reference nature. */
    float8   arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg + 1.0);
}

PG_FUNCTION_INFO_V1(makepoint);

Datum
makepoint(PG_FUNCTION_ARGS)
{
    /* Here, the pass-by-reference nature of Point is not hidden. */
    Point     *pointx = PG_GETARG_POINT_P(0);
    Point     *pointy = PG_GETARG_POINT_P(1);
    Point     *new_point = (Point *) palloc(sizeof(Point));

    new_point->x = pointx->x;
    new_point->y = pointy->y;

    PG_RETURN_POINT_P(new_point);
}

/* by reference, variable length */

PG_FUNCTION_INFO_V1(copytext);

Datum
copytext(PG_FUNCTION_ARGS)
{
    text     *t = PG_GETARG_TEXT_P(0);
    /*
     * VARSIZE is the total size of the struct in bytes.
     */
    text     *new_t = (text *) palloc(VARSIZE(t));
    VARATT_SIZEP(new_t) = VARSIZE(t);
    /*
     * VARDATA is a pointer to the data region of the struct.
     */
    memcpy((void *) VARDATA(new_t), /* destination */
           (void *) VARDATA(t),     /* source */
           VARSIZE(t)-VARHDRSZ);    /* how many bytes */
    PG_RETURN_TEXT_P(new_t);
}

PG_FUNCTION_INFO_V1(concat_text);

Datum
concat_text(PG_FUNCTION_ARGS)
{
    text  *arg1 = PG_GETARG_TEXT_P(0);
    text  *arg2 = PG_GETARG_TEXT_P(1);
    int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
    text *new_text = (text *) palloc(new_text_size);

    VARATT_SIZEP(new_text) = new_text_size;
    memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1)-VARHDRSZ);
    memcpy(VARDATA(new_text) + (VARSIZE(arg1)-VARHDRSZ),
           VARDATA(arg2), VARSIZE(arg2)-VARHDRSZ);
    PG_RETURN_TEXT_P(new_text);
}
</programlisting>
    </para>

    <para>
     The <command>CREATE FUNCTION</command> commands are the same as
     for the version-0 equivalents.
    </para>

    <para>
     At first glance, the version-1 coding conventions may appear to
     be just pointless obscurantism.  They do, however, offer a number
     of improvements, because the macros can hide unnecessary detail.
     An example is that in coding <function>add_one_float8</>, we no longer need to
     be aware that <type>float8</type> is a pass-by-reference type.  Another
     example is that the <literal>GETARG</> macros for variable-length types allow
     for more efficient fetching of <quote>toasted</quote> (compressed or
     out-of-line) values.
    </para>

    <para>
     One big improvement in version-1 functions is better handling of null
     inputs and results.  The macro <function>PG_ARGISNULL(<replaceable>n</>)</function>
     allows a function to test whether each input is null.  (Of course, doing
     this is only necessary in functions not declared <quote>strict</>.)
     As with the
     <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> macros,
     the input arguments are counted beginning at zero.
     Note that one
     should refrain from executing
     <function>PG_GETARG_<replaceable>xxx</replaceable>()</function> until
     one has verified that the argument isn't null.
     To return a null result, execute <function>PG_RETURN_NULL()</function>;
     this works in both strict and nonstrict functions.
    </para>

    <para>
     Other options provided in the new-style interface are two
     variants of the
     <function>PG_GETARG_<replaceable>xxx</replaceable>()</function>
     macros.  The first of these,
     <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>,
     is guaranteed to return a copy of the specified argument that is
     safe for writing into.  (The normal macros will sometimes return a
     pointer to a value that is physically stored in a table, which
     must not be written to.  Using the
     <function>PG_GETARG_<replaceable>xxx</replaceable>_COPY()</function>
     macros guarantees a writable result.)
     The second variant consists of the
     <function>PG_GETARG_<replaceable>xxx</replaceable>_SLICE()</function>
     macros, which take three arguments.  The first is the number of the
     function argument (as above).  The second and third are the offset and
     length of the segment to be returned.  Offsets are counted from
     zero, and a negative length requests that the remainder of the
     value be returned.  These macros provide more efficient access to
     parts of large values in the case where they have storage type
     <quote>external</quote>.  (The storage type of a column can be specified using
     <literal>ALTER TABLE <replaceable>tablename</replaceable> ALTER
     COLUMN <replaceable>colname</replaceable> SET STORAGE
     <replaceable>storagetype</replaceable></literal>.  <replaceable>storagetype</replaceable> is one of
     <literal>plain</>, <literal>external</>, <literal>extended</literal>,
     or <literal>main</>.)
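     To make the offset and length rules concrete, here is a minimal,
     standalone C model of the slicing behavior.  It is not
     <productname>PostgreSQL</productname> code:
     <function>copy_slice</function> is a hypothetical helper that works on
     a plain byte buffer, it uses <function>malloc</function> only because
     it runs outside the server, and the clamping of out-of-range offsets
     and lengths is an assumption of this sketch rather than something
     stated above.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Standalone model, NOT PostgreSQL code.  copy_slice() is a hypothetical
 * helper that applies the offset/length rules described for the _SLICE
 * macros to a plain byte buffer.
 *
 * Returns a malloc'ed, NUL-terminated copy of the requested segment and
 * stores the segment length in *out_len.  Returns NULL if the offset is
 * negative or the allocation fails.
 */
char *
copy_slice(const char *data, size_t size, long offset, long length,
           size_t *out_len)
{
    size_t  start;
    size_t  count;
    char   *result;

    if (offset < 0)
        return NULL;

    /* Assumption: an offset past the end yields an empty slice. */
    start = ((size_t) offset > size) ? size : (size_t) offset;

    /*
     * A negative length requests the remainder of the value.
     * Assumption: an overlong length is clamped to what is available.
     */
    if (length < 0 || (size_t) length > size - start)
        count = size - start;
    else
        count = (size_t) length;

    result = malloc(count + 1);
    if (result == NULL)
        return NULL;

    memcpy(result, data + start, count);
    result[count] = '\0';
    *out_len = count;
    return result;
}
```

     With the eight bytes <literal>abcdefgh</literal> as input, an offset
     of 2 and a length of 3 select <literal>cde</literal>, while an offset
     of 5 and a length of -1 select the remainder, <literal>fgh</literal>.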
    </para>

    <para>
     Finally, the version-1 function call conventions make it possible
     to return set results (<xref linkend="xfunc-c-return-set">) and
     implement trigger functions (<xref linkend="triggers">) and
     procedural-language call handlers (<xref
     linkend="plhandler">).  Version-1 code is also more
     portable than version-0, because it does not break restrictions
     on function call protocol in the C standard.  For more details
     see <filename>src/backend/utils/fmgr/README</filename> in the
     source distribution.
    </para>
   </sect2>

   <sect2>
    <title>Writing Code</title>

    <para>
     Before we turn to the more advanced topics, we should discuss
     some coding rules for <productname>PostgreSQL</productname>
     C-language functions.  While it may be possible to load functions
     written in languages other than C into
     <productname>PostgreSQL</productname>, this is usually difficult
     (when it is possible at all) because other languages, such as
     C++, FORTRAN, or Pascal often do not follow the same calling
     convention as C.  That is, other languages do not pass argument
     and return values between functions in the same way.  For this
     reason, we will assume that your C-language functions are
     actually written in C.
    </para>

    <para>
     The basic rules for writing and building C functions are as follows:

     <itemizedlist>
      <listitem>
       <para>
        Use <literal>pg_config
        --includedir-server</literal><indexterm><primary>pg_config</><secondary>with user-defined C functions</></>
        to find out where the <productname>PostgreSQL</> server header
        files are installed on your system (or the system that your
        users will be running on).  This option is new with
        <productname>PostgreSQL</> 7.2.  For
        <productname>PostgreSQL</> 7.1 you should use the option
        <option>--includedir</option>.  (<command>pg_config</command>
        will exit with a non-zero status if it encounters an unknown
        option.)
        For releases prior to 7.1 you will have to guess,
        but since that was before the current calling conventions were
        introduced, it is unlikely that you want to support those
        releases.
       </para>
      </listitem>

      <listitem>
       <para>
        When allocating memory, use the
        <productname>PostgreSQL</productname> functions
        <function>palloc</function><indexterm><primary>palloc</></> and <function>pfree</function><indexterm><primary>pfree</></>
        instead of the corresponding C library functions
        <function>malloc</function> and <function>free</function>.
        The memory allocated by <function>palloc</function> will be
        freed automatically at the end of each transaction, preventing
        memory leaks.
       </para>
      </listitem>

      <listitem>
       <para>
        Always zero the bytes of your structures using
        <function>memset</function>.  Without this, it's difficult to
        support hash indexes or hash joins, as you must pick out only
        the significant bits of your data structure to compute a hash.
        Even if you initialize all fields of your structure, there may be
        alignment padding (holes in the structure) that may contain
        garbage values.
       </para>
      </listitem>

      <listitem>
       <para>
        Most of the internal <productname>PostgreSQL</productname>
        types are declared in <filename>postgres.h</filename>, while
        the function manager interfaces
        (<symbol>PG_FUNCTION_ARGS</symbol>, etc.)  are in
        <filename>fmgr.h</filename>, so you will need to include at
        least these two files.  For portability reasons it's best to
        include <filename>postgres.h</filename> <emphasis>first</>,
        before any other system or user header files.  Including
        <filename>postgres.h</filename> will also include
        <filename>elog.h</filename> and <filename>palloc.h</filename>
        for you.
       </para>
      </listitem>

      <listitem>
       <para>
        Symbol names defined within object files must not conflict