📄 ucnv.h
 * If the alias is ambiguous, then the preferred converter is used
 * and the status is set to U_AMBIGUOUS_ALIAS_WARNING.
 * @param name name of the uconv table in a zero-terminated
 *        Unicode string
 * @param err outgoing error status <TT>U_MEMORY_ALLOCATION_ERROR,
 *        U_FILE_ACCESS_ERROR</TT>
 * @return the created Unicode converter object, or <TT>NULL</TT> if an
 *        error occurred
 * @see ucnv_open
 * @see ucnv_openCCSID
 * @see ucnv_close
 * @see ucnv_getDefaultName
 * @stable ICU 2.0
 */
U_STABLE UConverter* U_EXPORT2
ucnv_openU(const UChar *name,
           UErrorCode *err);

/**
 * Creates a UConverter object from a CCSID number and platform pair.
 * Note that the usefulness of this function is limited to platforms with numeric
 * encoding IDs. Only IBM and Microsoft platforms use numeric (16-bit) identifiers for
 * encodings.
 *
 * In addition, IBM CCSIDs and Unicode conversion tables are not 1:1 related.
 * For many IBM CCSIDs there are multiple (up to six) Unicode conversion tables, and
 * for some Unicode conversion tables there are multiple CCSIDs.
 * Some "alternate" Unicode conversion tables are provided by the
 * IBM CDRA conversion table registry.
 * The most prominent example of a systematic modification of conversion tables that is
 * not provided in the form of conversion table files in the repository is
 * that S/390 Unix System Services swaps the codes for Line Feed and New Line in all
 * EBCDIC codepages, which requires such a swap in the Unicode conversion tables as well.
 *
 * Only IBM default conversion tables are accessible with ucnv_openCCSID().
 * ucnv_getCCSID() will return the same CCSID for all conversion tables that are associated
 * with that CCSID.
 *
 * Currently, the only "platform" supported in the ICU converter API is UCNV_IBM.
 *
 * In summary, the use of CCSIDs and the associated API functions is not recommended.
 *
 * In order to open a converter with the default IBM CDRA Unicode conversion table,
 * you can use this function or use the prefix "ibm-":
 * \code
 *     char name[20];
 *     sprintf(name, "ibm-%hu", ccsid);
 *     cnv=ucnv_open(name, &errorCode);
 * \endcode
 *
 * In order to open a converter with the IBM S/390 Unix System Services variant
 * of a Unicode/EBCDIC conversion table,
 * you can use the prefix "ibm-" together with the option string UCNV_SWAP_LFNL_OPTION_STRING:
 * \code
 *     char name[20];
 *     sprintf(name, "ibm-%hu" UCNV_SWAP_LFNL_OPTION_STRING, ccsid);
 *     cnv=ucnv_open(name, &errorCode);
 * \endcode
 *
 * In order to open a converter from a Microsoft codepage number, use the prefix "cp":
 * \code
 *     char name[20];
 *     sprintf(name, "cp%hu", codepageID);
 *     cnv=ucnv_open(name, &errorCode);
 * \endcode
 *
 * If the alias is ambiguous, then the preferred converter is used
 * and the status is set to U_AMBIGUOUS_ALIAS_WARNING.
 *
 * @param codepage codepage number to create
 * @param platform the platform in which the codepage number exists
 * @param err error status <TT>U_MEMORY_ALLOCATION_ERROR, U_FILE_ACCESS_ERROR</TT>
 * @return the created Unicode converter object, or <TT>NULL</TT> if an error
 *         occurred.
 * @see ucnv_open
 * @see ucnv_openU
 * @see ucnv_close
 * @see ucnv_getCCSID
 * @see ucnv_getPlatform
 * @see UConverterPlatform
 * @stable ICU 2.0
 */
U_STABLE UConverter* U_EXPORT2
ucnv_openCCSID(int32_t codepage,
               UConverterPlatform platform,
               UErrorCode * err);

/**
 * <p>Creates a UConverter object specified from a packageName and a converterName.</p>
 *
 * <p>The packageName and converterName must point to an ICU udata object, as defined by
 *   <code> udata_open( packageName, "cnv", converterName, err) </code> or equivalent.
 * Typically, packageName will refer to a (.dat) file, or to a package registered with
 * udata_setAppData().</p>
 *
 * <p>The name will NOT be looked up in the alias mechanism, nor will the converter be
 * stored in the converter cache or the alias table.
 * The only way to open further converters
 * is to call this function multiple times, or use the ucnv_safeClone() function to clone a
 * 'master' converter.</p>
 *
 * <p>A future version of ICU may add alias table lookups and/or caching
 * to this function.</p>
 *
 * <p>Example Use:
 *      <code>cnv = ucnv_openPackage("myapp", "myconverter", &err);</code>
 * </p>
 *
 * @param packageName name of the package (equivalent to 'path' in udata_open() call)
 * @param converterName name of the data item to be used, without suffix.
 * @param err outgoing error status <TT>U_MEMORY_ALLOCATION_ERROR, U_FILE_ACCESS_ERROR</TT>
 * @return the created Unicode converter object, or <TT>NULL</TT> if an error occurred
 * @see udata_open
 * @see ucnv_open
 * @see ucnv_safeClone
 * @see ucnv_close
 * @stable ICU 2.2
 */
U_STABLE UConverter* U_EXPORT2
ucnv_openPackage(const char *packageName, const char *converterName, UErrorCode *err);

/**
 * Thread-safe cloning operation.
 * @param cnv converter to be cloned
 * @param stackBuffer user-allocated space for the new clone. If NULL, new memory will be allocated.
 *  If the buffer is not large enough, new memory will be allocated.
 *  Clients can use U_CNV_SAFECLONE_BUFFERSIZE. This will probably be enough to avoid memory allocations.
 * @param pBufferSize pointer to size of allocated space.
 *  If *pBufferSize == 0, a sufficient size for use in cloning will
 *  be returned ('pre-flighting').
 *  If *pBufferSize is not enough for a stack-based safe clone,
 *  new memory will be allocated.
 * @param status indicates whether the operation succeeded or there were errors.
 *  An informational status value, U_SAFECLONE_ALLOCATED_ERROR, is used if any allocations were necessary.
 * @return pointer to the new clone
 * @stable ICU 2.0
 */
U_STABLE UConverter * U_EXPORT2
ucnv_safeClone(const UConverter *cnv,
               void             *stackBuffer,
               int32_t          *pBufferSize,
               UErrorCode       *status);

/**
 * \def U_CNV_SAFECLONE_BUFFERSIZE
 * Definition of a buffer size that is designed to be large enough for
 * converters to be cloned with ucnv_safeClone().
 * @stable ICU 2.0
 */
#define U_CNV_SAFECLONE_BUFFERSIZE  1024

/**
 * Deletes the Unicode converter and releases resources associated
 * with just this instance.
 * Does not free up shared converter tables.
 *
 * @param converter the converter object to be deleted
 * @see ucnv_open
 * @see ucnv_openU
 * @see ucnv_openCCSID
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_close(UConverter * converter);

/**
 * Fills in the output parameter, subChars, with the substitution characters
 * as multiple bytes.
 *
 * @param converter the Unicode converter
 * @param subChars the substitution characters
 * @param len on input the capacity of subChars, on output the number
 *        of bytes copied to it
 * @param err the outgoing error status code.
 *        If the substitution character array is too small, a
 *        <TT>U_INDEX_OUTOFBOUNDS_ERROR</TT> will be returned.
 * @see ucnv_setSubstChars
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_getSubstChars(const UConverter *converter,
                   char *subChars,
                   int8_t *len,
                   UErrorCode *err);

/**
 * Sets the substitution chars when converting from Unicode to a codepage. The
 * substitution is specified as a string of 1-4 bytes, and may contain
 * a <TT>NULL</TT> byte.
 * @param converter the Unicode converter
 * @param subChars the substitution character byte sequence we want set
 * @param len the number of bytes in subChars
 * @param err the error status code. <TT>U_INDEX_OUTOFBOUNDS_ERROR</TT> if
 *        len is bigger than the maximum number of bytes allowed in subChars
 * @see ucnv_getSubstChars
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_setSubstChars(UConverter *converter,
                   const char *subChars,
                   int8_t len,
                   UErrorCode *err);

/**
 * Fills in the output parameter, errBytes, with the error characters from the
 * last failing conversion.
 *
 * @param converter the Unicode converter
 * @param errBytes the codepage bytes which were in error
 * @param len on input the capacity of errBytes, on output the number of
 *        bytes which were copied to it
 * @param err the error status code.
 *        If the errBytes array is too small, a
 *        <TT>U_INDEX_OUTOFBOUNDS_ERROR</TT> will be returned.
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_getInvalidChars(const UConverter *converter,
                     char *errBytes,
                     int8_t *len,
                     UErrorCode *err);

/**
 * Fills in the output parameter, errUChars, with the error characters from the
 * last failing conversion.
 *
 * @param converter the Unicode converter
 * @param errUChars the UChars which were in error
 * @param len on input the capacity of errUChars, on output the number of
 *        UChars which were copied to it
 * @param err the error status code.
 *        If the errUChars array is too small, a
 *        <TT>U_INDEX_OUTOFBOUNDS_ERROR</TT> will be returned.
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_getInvalidUChars(const UConverter *converter,
                      UChar *errUChars,
                      int8_t *len,
                      UErrorCode *err);

/**
 * Resets the state of a converter to the default state. This is used
 * in the case of an error, to restart a conversion from a known default state.
 * It will also empty the internal output buffers.
 * @param converter the Unicode converter
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_reset(UConverter *converter);

/**
 * Resets the to-Unicode part of a converter state to the default state.
 * This is used in the case of an error to restart a conversion to
 * Unicode to a known default state.
 * It will also empty the internal
 * output buffers used for the conversion to Unicode codepoints.
 * @param converter the Unicode converter
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_resetToUnicode(UConverter *converter);

/**
 * Resets the from-Unicode part of a converter state to the default state.
 * This is used in the case of an error to restart a conversion from
 * Unicode to a known default state. It will also empty the internal output
 * buffers used for the conversion from Unicode codepoints.
 * @param converter the Unicode converter
 * @stable ICU 2.0
 */
U_STABLE void U_EXPORT2
ucnv_resetFromUnicode(UConverter *converter);

/**
 * Returns the maximum number of bytes that are output per UChar in conversion
 * from Unicode using this converter.
 * The returned number can be used with UCNV_GET_MAX_BYTES_FOR_STRING
 * to calculate the size of a target buffer for conversion from Unicode.
 *
 * Note: Before ICU 2.8, this function did not return reliable numbers for
 * some stateful converters (EBCDIC_STATEFUL, ISO-2022) and LMBCS.
 *
 * This number may not be the same as the maximum number of bytes per
 * "conversion unit". In other words, it may not be the intuitively expected
 * number of bytes per character that would be published for a charset,
 * and may not fulfill any other purpose than the allocation of an output
 * buffer of guaranteed sufficient size for a given input length and converter.
 *
 * Examples for special cases that are taken into account:
 * - Supplementary code points may convert to more bytes than BMP code points.
 *   This function returns bytes per UChar (UTF-16 code unit), not per
 *   Unicode code point, for efficient buffer allocation.
 * - State-shifting output (SI/SO, escapes, etc.) from stateful converters.
 * - When m input UChars are converted to n output bytes, then the maximum m/n
 *   is taken into account.
 *
 * The number returned here does not take into account
 * (see UCNV_GET_MAX_BYTES_FOR_STRING):
 * - callbacks which output more than one charset character sequence per call,
 *   like escape callbacks
 * - initial and final non-character bytes that are output by some converters
 *   (automatic BOMs, initial escape sequence, final SI, etc.)
 *
 * Examples for returned values:
 * - SBCS charsets: 1
 * - Shift-JIS: 2
 * - UTF-16: 2 (2 per BMP, 4 per surrogate _pair_, BOM not counted)
 * - UTF-8: 3 (3 per BMP, 4 per surrogate _pair_)
 * - EBCDIC_STATEFUL (EBCDIC mixed SBCS/DBCS): 3 (SO + DBCS)
 * - ISO-2022: 3 (always outputs UTF-8)
 * - ISO-2022-JP: 6 (4-byte escape sequences + DBCS)
 * - ISO-2022-CN: 8 (4-byte designator sequences + 2-byte SS2/SS3 + DBCS)
 *
 * @param converter The Unicode converter.
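
The buffer-sizing rule described above can be illustrated with a small sketch. It is not part of the header: `max_bytes_for_string` is a hypothetical helper written out so the example compiles without ICU, and it assumes UCNV_GET_MAX_BYTES_FOR_STRING expands to `(length + 10) * maxCharSize`, with the ten extra units covering the initial and final non-character bytes that the per-UChar maximum leaves out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for UCNV_GET_MAX_BYTES_FOR_STRING (assumption:
 * the macro computes (length + 10) * maxCharSize).
 *
 * length        - number of input UChars (UTF-16 code units)
 * max_char_size - maximum bytes per UChar reported for the converter,
 *                 e.g. 3 for UTF-8 or 6 for ISO-2022-JP per the list above
 *
 * Returns a target buffer capacity in bytes. Like the documented value,
 * it does not account for callbacks that emit more than one charset
 * character sequence per call.
 */
int32_t max_bytes_for_string(int32_t length, int32_t max_char_size)
{
    assert(length >= 0 && max_char_size > 0);
    return (length + 10) * max_char_size;
}
```

Under that assumption, 100 UChars converted to UTF-8 (3 bytes per UChar) would call for a 330-byte target buffer, and the same input converted to ISO-2022-JP (6 bytes per UChar) would call for 660 bytes.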