<chapter Id="typeconv">
<title>Type Conversion</title>
<para>
<acronym>SQL</acronym> queries can, intentionally or not, require
mixing of different data types in the same expression.
<productname>Postgres</productname> has extensive facilities for
evaluating mixed-type expressions.
</para>
<para>
In many cases a user will not need
to understand the details of the type conversion mechanism.
However, the implicit conversions done by <productname>Postgres</productname>
can affect the apparent results of a query, and these results
can be tailored by a user or programmer
using <emphasis>explicit</emphasis> type coercion.
</para>
<para>
This chapter introduces the <productname>Postgres</productname>
type conversion mechanisms and conventions.
Refer to the relevant sections in the User's Guide and Programmer's Guide
for more information on specific data types and allowed functions and operators.
</para>
<para>
The Programmer's Guide has more details on the exact algorithms used for
implicit type conversion and coercion.
</para>
<sect1>
<title>Overview</title>
<para>
<acronym>SQL</acronym> is a strongly typed language. That is, every data item
has an associated data type which determines its behavior and allowed usage.
<productname>Postgres</productname> has an extensible type system which is
much more general and flexible than those of other <acronym>RDBMS</acronym> implementations.
Hence, most type conversion behavior in <productname>Postgres</productname>
should be governed by general rules rather than by ad-hoc heuristics, so that
mixed-type expressions are meaningful even with user-defined types.
</para>
<para>
The <productname>Postgres</productname> scanner/parser decodes lexical elements
into only five fundamental categories: integers, floats, strings, names, and keywords.
Most extended types are first tokenized into strings.
The <acronym>SQL</acronym> language definition allows specifying type names with strings, and this mechanism
is used by <productname>Postgres</productname>
to start the parser down the correct path. For example, the query
<programlisting>
tgl=> SELECT text 'Origin' AS "Label", point '(0,0)' AS "Value";
Label |Value
------+-----
Origin|(0,0)
(1 row)
</programlisting>
has two strings, of type <type>text</type> and <type>point</type>.
If a type is not specified, then the placeholder type <type>unknown</type>
is assigned initially, to be resolved in later stages as described below.
</para>
<para>
There are four fundamental <acronym>SQL</acronym> constructs requiring
distinct type conversion rules in the <productname>Postgres</productname>
parser:
</para>
<variablelist>
<varlistentry>
<term>Operators</term>
<listitem>
<para>
<productname>Postgres</productname> allows expressions with
left- and right-unary (one argument) operators,
as well as binary (two argument) operators.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Function calls</term>
<listitem>
<para>
Much of the <productname>Postgres</productname> type system is built around a rich set of
functions. Function calls have one or more arguments which, for any specific query,
must be matched to the functions available in the system catalog.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Query targets</term>
<listitem>
<para>
<acronym>SQL</acronym> INSERT statements place the results of a query into a table.
The expressions
in the query must be matched up with, and perhaps converted to, the target columns of the insert.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>UNION queries</term>
<listitem>
<para>
Since all select results from a UNION SELECT statement must appear in a single set of columns, the types
of each SELECT clause must be matched up and converted to a uniform set.
</para>
</listitem>
</varlistentry>
</variablelist>
<para>
Many of the general type conversion rules use simple conventions built on
the <productname>Postgres</productname> function and operator system tables.
There are some heuristics included in the conversion rules to better support
conventions for the <acronym>SQL92</acronym> standard native types such as
<type>smallint</type>, <type>integer</type>, and <type>float</type>.
</para>
<para>
The <productname>Postgres</productname> parser uses the convention that all
type conversion functions take a single argument of the source type and are
named with the same name as the target type. Any function meeting these
criteria is considered to be a valid conversion function, and may be used
by the parser as such. This simple assumption gives the parser the power
to explore type conversion possibilities without hardcoding, allowing
extended user-defined types to use these same features transparently.
</para>
<para>
An additional heuristic is provided in the parser to allow better guesses
at proper behavior for <acronym>SQL</acronym> standard types. There are
five categories of types defined: boolean, string, numeric, geometric,
and user-defined.
Each category, with the exception of user-defined, has
a "preferred type" which is used to resolve ambiguities in candidates.
Each "user-defined" type is its own "preferred type", so ambiguous
expressions (those with multiple candidate parsing solutions)
with only one user-defined type can resolve to a single best choice, while those with
multiple user-defined types will remain ambiguous and throw an error.
</para>
<para>
Ambiguous expressions which have candidate solutions within only one type category are
likely to resolve, while ambiguous expressions with candidates spanning multiple
categories are likely to throw an error and ask for clarification from the user.
</para>
<sect2>
<title>Guidelines</title>
<para>
All type conversion rules are designed with several principles in mind:
<itemizedlist mark="bullet" spacing="compact">
<listitem>
<para>
Implicit conversions should never have surprising or unpredictable outcomes.
</para>
</listitem>
<listitem>
<para>
User-defined types, of which the parser has no a priori knowledge, should be
"higher" in the type hierarchy. In mixed-type expressions, native types shall always
be converted to a user-defined type (of course, only if conversion is necessary).
</para>
</listitem>
<listitem>
<para>
User-defined types are not related.
Currently, <productname>Postgres</productname>
does not have information available to it on relationships between types, other than
hardcoded heuristics for built-in types and implicit relationships based on available functions
in the catalog.
</para>
</listitem>
<listitem>
<para>
There should be no extra overhead from the parser or executor
if a query does not need implicit type conversion.
That is, if a query is well formulated and the types already match up, then the query should proceed
without spending extra time in the parser and without introducing unnecessary implicit conversion
functions into the query.
</para>
<para>
Additionally, if a query usually requires an implicit conversion for a function, and
the user then defines an explicit function with the correct argument types, the parser
should use this new function and no longer do the implicit conversion using the old function.
</para>
</listitem>
</itemizedlist>
</para>
</sect2>
</sect1>
<sect1>
<title>Operators</title>
<sect2>
<title>Conversion Procedure</title>
<procedure>
<title>Operator Evaluation</title>
<step performance="required">
<para>
Check for an exact match in the pg_operator system catalog.
</para>
<substeps>
<step performance="optional">
<para>
If one argument of a binary operator is <type>unknown</type>,
then assume it is the same type as the other argument.
</para>
</step>
<step performance="required">
<para>
Reverse the arguments, and look for an exact match with an operator which
points to itself as being commutative.
If found, then reverse the arguments in the parse tree and use this operator.
</para>
</step>
</substeps>
</step>
<step performance="required">
<para>
Look for the best match.
</para>
<substeps>
<step performance="optional">
<para>
Make a list of all operators of the same name.
</para>
</step>
<step performance="required">
<para>
If only one operator is in the list, use it if the input type can be coerced,
and throw an error if the type cannot be coerced.
</para>
</step>
<step performance="required">
<para>
Keep all operators with
the most explicit matches for types. Keep all if there
are no explicit matches and move to the next step.
If only one candidate remains, use it if the type can be coerced.
</para>
</step>
<step performance="required">
<para>
If any input arguments are "unknown", categorize the input candidates as
boolean, numeric, string, geometric, or user-defined. If there is a mix of
categories, or more than one user-defined type, throw an error because
the correct choice cannot be deduced without more clues.
If only one category is present, then assign the "preferred type"
to the input column which had been previously "unknown".
</para>
</step>
<step performance="required">
<para>
Choose the candidate with the most exact type matches, and which matches
the "preferred type" for each column category from the previous step.
If there is still more than one candidate, or if there are none,
then throw an error.
</para>
</step>
</substeps>
</step>
</procedure>
</sect2>
<sect2>
<title>Examples</title>
<sect3>
<title>Exponentiation Operator</title>
<para>
There is only one exponentiation
operator defined in the catalog, and it takes <type>float8</type> arguments.
The scanner assigns an initial type of <type>int4</type> to both arguments
of this query expression:
<programlisting>
tgl=> select 2 ^ 3 AS "Exp";
Exp
---
  8
(1 row)
</programlisting>
So the parser does a type conversion on both operands and the query
is equivalent to
<programlisting>
tgl=> select float8(2) ^ float8(3) AS "Exp";
Exp
---
  8
(1 row)
</programlisting>
or
<programlisting>
tgl=> select 2.0 ^ 3.0 AS "Exp";
Exp
---
  8
(1 row)
</programlisting>
<note>
<para>
This last form has the least overhead, since no functions are called to do
implicit type conversion.
This is not an issue for small queries, but may
have an impact on the performance of queries involving large tables.
</para>
</note>
</para>
</sect3>
<sect3>
<title>String Concatenation</title>
<para>
A string-like syntax is used for working with string types as well as for
working with complex extended types.
Strings with unspecified type are matched with likely operator candidates.
</para>
<para>
One unspecified argument:
<programlisting>
tgl=> SELECT text 'abc' || 'def' AS "Text and Unknown";
Text and Unknown
----------------
abcdef
(1 row)
</programlisting>
</para>
<para>
In this case the parser looks to see if there is an operator taking <type>text</type>
for both arguments. Since there is, it assumes that the second argument should
be interpreted as of type <type>text</type>.
</para>
<para>
Concatenation on unspecified types: