📄 typeconv.sgml
字号:
<!--$PostgreSQL: pgsql/doc/src/sgml/typeconv.sgml,v 1.45 2005/11/04 23:14:02 petere Exp $--><chapter Id="typeconv"><title>Type Conversion</title><indexterm zone="typeconv"> <primary>data type</primary> <secondary>conversion</secondary></indexterm><para><acronym>SQL</acronym> statements can, intentionally or not, requiremixing of different data types in the same expression. <productname>PostgreSQL</productname> has extensive facilities forevaluating mixed-type expressions.</para><para>In many cases a user will not needto understand the details of the type conversion mechanism.However, the implicit conversions done by <productname>PostgreSQL</productname>can affect the results of a query. When necessary, these resultscan be tailored by using <emphasis>explicit</emphasis> type conversion.</para><para>This chapter introduces the <productname>PostgreSQL</productname>type conversion mechanisms and conventions.Refer to the relevant sections in <xref linkend="datatype"> and <xref linkend="functions">for more information on specific data types and allowed functions andoperators.</para><sect1 id="typeconv-overview"><title>Overview</title><para><acronym>SQL</acronym> is a strongly typed language. That is, every data itemhas an associated data type which determines its behavior and allowed usage.<productname>PostgreSQL</productname> has an extensible type system that ismuch more general and flexible than other <acronym>SQL</acronym> implementations.Hence, most type conversion behavior in <productname>PostgreSQL</productname>is governed by general rules rather than by <foreignphrase>ad hoc</>heuristics. This allowsmixed-type expressions to be meaningful even with user-defined types.</para><para>The <productname>PostgreSQL</productname> scanner/parser divides lexicalelements into only five fundamental categories: integers, non-integer numbers,strings, identifiers, and key words. Constants of most non-numeric types arefirst classified as strings. The <acronym>SQL</acronym> language definitionallows specifying type names with strings, and this mechanism can be used in<productname>PostgreSQL</productname> to start the parser down the correctpath. For example, the query<screen>SELECT text 'Origin' AS "label", point '(0,0)' AS "value"; label | value--------+------- Origin | (0,0)(1 row)</screen>has two literal constants, of type <type>text</type> and <type>point</type>.If a type is not specified for a string literal, then the placeholder type<type>unknown</type> is assigned initially, to be resolved in laterstages as described below.</para><para>There are four fundamental <acronym>SQL</acronym> constructs requiringdistinct type conversion rules in the <productname>PostgreSQL</productname>parser:<variablelist><varlistentry><term>Function calls</term><listitem><para>Much of the <productname>PostgreSQL</productname> type system is built around arich set of functions. Functions can have one or more arguments.Since <productname>PostgreSQL</productname> permits functionoverloading, the function name alone does not uniquely identify the functionto be called; the parser must select the right function based on the datatypes of the supplied arguments.</para></listitem></varlistentry><varlistentry><term>Operators</term><listitem><para><productname>PostgreSQL</productname> allows expressions withprefix and postfix unary (one-argument) operators,as well as binary (two-argument) operators. Like functions, operators canbe overloaded, and so the same problem of selecting the right operatorexists.</para></listitem></varlistentry><varlistentry><term>Value Storage</term><listitem><para><acronym>SQL</acronym> <command>INSERT</command> and <command>UPDATE</command> statements place the results ofexpressions into a table. The expressions in the statement must be matched upwith, and perhaps converted to, the types of the target columns.</para></listitem></varlistentry><varlistentry><term><literal>UNION</literal>, <literal>CASE</literal>, and related constructs</term><listitem><para>Since all query results from a unionized <command>SELECT</command> statementmust appear in a single set of columns, the types of the results of each<command>SELECT</> clause must be matched up and converted to a uniform set.Similarly, the result expressions of a <literal>CASE</> construct must beconverted to a common type so that the <literal>CASE</> expression as a wholehas a known output type. The same holds for <literal>ARRAY</> constructs,and for the <function>GREATEST</> and <function>LEAST</> functions.</para></listitem></varlistentry></variablelist></para><para>The system catalogs store information about which conversions, called<firstterm>casts</firstterm>, between data types are valid, and how toperform those conversions. Additional casts can be added by the userwith the <command>CREATE CAST</command> command. (This is usuallydone in conjunction with defining new data types. The set of castsbetween the built-in types has been carefully crafted and is best notaltered.)</para><indexterm> <primary>data type</primary> <secondary>category</secondary></indexterm><para>An additional heuristic is provided in the parser to allow better guessesat proper behavior for <acronym>SQL</acronym> standard types. There areseveral basic <firstterm>type categories</firstterm> defined: <type>boolean</type>,<type>numeric</type>, <type>string</type>, <type>bitstring</type>, <type>datetime</type>, <type>timespan</type>, <type>geometric</type>, <type>network</type>,and user-defined. Each category, with the exception of user-defined, hasone or more <firstterm>preferred types</firstterm> which are preferentiallyselected when there is ambiguity.In the user-defined category, each type is its own preferred type.Ambiguous expressions (those with multiple candidate parsing solutions)can therefore often be resolved when there are multiple possible built-in types, butthey will raise an error when there are multiple choices for user-definedtypes.</para><para>All type conversion rules are designed with several principles in mind:<itemizedlist><listitem><para>Implicit conversions should never have surprising or unpredictable outcomes.</para></listitem><listitem><para>User-defined types, of which the parser has no <foreignphrase>a priori</> knowledge, should be<quote>higher</quote> in the type hierarchy. In mixed-type expressions, native types shall alwaysbe converted to a user-defined type (of course, only if conversion is necessary).</para></listitem><listitem><para>User-defined types are not related. Currently, <productname>PostgreSQL</productname>does not have information available to it on relationships between types, other thanhardcoded heuristics for built-in types and implicit relationships based onavailable functions and casts.</para></listitem><listitem><para>There should be no extra overhead from the parser or executorif a query does not need implicit type conversion.That is, if a query is well formulated and the types already match up, then the query should proceedwithout spending extra time in the parser and without introducing unnecessary implicit conversioncalls into the query.</para><para>Additionally, if a query usually requires an implicit conversion for a function, andif then the user defines a new function with the correct argument types, the parsershould use this new function and will no longer do the implicit conversion using the old function.</para></listitem></itemizedlist></para></sect1><sect1 id="typeconv-oper"><title>Operators</title><indexterm zone="typeconv-oper"> <primary>operator</primary> <secondary>type resolution in an invocation</secondary></indexterm> <para> The specific operator to be used in an operator invocation is determined by following the procedure below. Note that this procedure is indirectly affected by the precedence of the involved operators. See <xref linkend="sql-precedence"> for more information. </para><procedure><title>Operator Type Resolution</title><step performance="required"><para>Select the operators to be considered from the<classname>pg_operator</classname> system catalog. If an unqualifiedoperator name was used (the usual case), the operatorsconsidered are those of the right name and argument count that arevisible in the current search path (see <xref linkend="ddl-schemas-path">).If a qualified operator name was given, only operators in the specifiedschema are considered.</para><substeps><step performance="optional"><para>If the search path finds multiple operators of identical argument types,only the one appearing earliest in the path is considered. But operators ofdifferent argument types are considered on an equal footing regardless ofsearch path position.</para></step></substeps></step><step performance="required"><para>Check for an operator accepting exactly the input argument types.If one exists (there can be only one exact match in the set ofoperators considered), use it.</para><substeps><step performance="optional"><para>If one argument of a binary operator invocation is of the <type>unknown</type> type,then assume it is the same type as the other argument for this check.Other cases involving <type>unknown</type> will never find a match atthis step.</para></step></substeps></step><step performance="required"><para>Look for the best match.</para><substeps><step performance="required"><para>Discard candidate operators for which the input types do not matchand cannot be converted (using an implicit conversion) to match.<type>unknown</type> literals areassumed to be convertible to anything for this purpose. If only onecandidate remains, use it; else continue to the next step.</para></step><step performance="required"><para>Run through all candidates and keep those with the most exact matcheson input types. (Domains are considered the same as their base typefor this purpose.) Keep all candidates if none have any exact matches.If only one candidate remains, use it; else continue to the next step.</para></step><step performance="required"><para>Run through all candidates and keep those that accept preferred types (of theinput data type's type category) at the most positions where type conversionwill be required.Keep all candidates if none accept preferred types.If only one candidate remains, use it; else continue to the next step.</para></step><step performance="required"><para>If any input arguments are <type>unknown</type>, check the typecategories accepted at those argument positions by the remainingcandidates. At each position, select the <type>string</type> categoryif anycandidate accepts that category. (This bias towards string is appropriatesince an unknown-type literal does look like a string.) Otherwise, ifall the remaining candidates accept the same type category, select thatcategory; otherwise fail because the correct choice cannot be deducedwithout more clues. Now discardcandidates that do not accept the selected type category. Furthermore,if any candidate accepts a preferred type at a given argument position,discard candidates that accept non-preferred types for that argument.</para></step><step performance="required"><para>If only one candidate remains, use it. If no candidate or more than onecandidate remains,then fail.</para></step></substeps></step></procedure><para>Some examples follow.</para><example><title>Exponentiation Operator Type Resolution</title><para>There is only one exponentiationoperator defined in the catalog, and it takes arguments of type <type>double precision</type>.The scanner assigns an initial type of <type>integer</type> to both argumentsof this query expression:<screen>SELECT 2 ^ 3 AS "exp"; exp----- 8(1 row)</screen>So the parser does a type conversion on both operands and the queryis equivalent to<screen>SELECT CAST(2 AS double precision) ^ CAST(3 AS double precision) AS "exp";</screen></para></example><example><title>String Concatenation Operator Type Resolution</title><para>A string-like syntax is used for working with string types as well as forworking with complex extension types.Strings with unspecified type are matched with likely operator candidates.</para><para>An example with one unspecified argument:<screen>SELECT text 'abc' || 'def' AS "text and unknown"; text and unknown------------------ abcdef(1 row)</screen></para><para>In this case the parser looks to see if there is an operator taking <type>text</type>for both arguments. Since there is, it assumes that the second argument shouldbe interpreted as of type <type>text</type>.</para><para>Here is a concatenation on unspecified types:<screen>SELECT 'abc' || 'def' AS "unspecified"; unspecified------------- abcdef(1 row)</screen></para><para>In this case there is no initial hint for which type to use, since no typesare specified in the query. So, the parser looks for all candidate operatorsand finds that there are candidates accepting both string-category andbit-string-category inputs. Since string category is preferred when available,that category is selected, and then the preferred type for strings, <type>text</type>, is used as the specifictype to resolve the unknown literals to.</para></example><example><title>Absolute-Value and Negation Operator Type Resolution</title><para>The <productname>PostgreSQL</productname> operator catalog has severalentries for the prefix operator <literal>@</>, all of which implementabsolute-value operations for various numeric data types. One of theseentries is for type <type>float8</type>, which is the preferred type inthe numeric category. Therefore, <productname>PostgreSQL</productname>will use that entry when faced with a non-numeric input:<screen>SELECT @ '-4.5' AS "abs"; abs----- 4.5(1 row)</screen>Here the system has performed an implicit conversion from <type>text</type> to <type>float8</type>before applying the chosen operator. We can verify that <type>float8</type> andnot some other type was used:<screen>SELECT @ '-4.5e500' AS "abs";ERROR: "-4.5e500" is out of range for type double precision</screen></para><para>On the other hand, the prefix operator <literal>~</> (bitwise negation)is defined only for integer data types, not for <type>float8</type>. So, if wetry a similar case with <literal>~</>, we get:<screen>SELECT ~ '20' AS "negation";ERROR: operator is not unique: ~ "unknown"HINT: Could not choose a best candidate operator. You may need to add explicittype casts.</screen>This happens because the system can't decide which of the severalpossible <literal>~</> operators should be preferred. We can helpit out with an explicit cast:<screen>SELECT ~ CAST('20' AS int8) AS "negation"; negation---------- -21(1 row)</screen></para></example>
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -