<?xml version="1.0" ?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<!-- saved from url=(0017)http://localhost/ -->
<script language="JavaScript" src="../../displayToc.js"></script>
<script language="JavaScript" src="../../tocParas.js"></script>
<script language="JavaScript" src="../../tocTab.js"></script>
<link rel="stylesheet" type="text/css" href="../../scineplex.css">
<title>Unicode::UCD - Unicode character database</title>
<link rel="stylesheet" href="../../Active.css" type="text/css" />
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<link rev="made" href="mailto:" />
</head>
<body>
<script>writelinks('__top__',2);</script>
<h1><a>Unicode::UCD - Unicode character database</a></h1>
<p><a name="__index__"></a></p>
<!-- INDEX BEGIN -->
<ul>
<li><a href="#name">NAME</a></li>
<li><a href="#synopsis">SYNOPSIS</a></li>
<li><a href="#description">DESCRIPTION</a></li>
<ul>
<li><a href="#charinfo">charinfo</a></li>
<li><a href="#charblock">charblock</a></li>
<li><a href="#charscript">charscript</a></li>
<li><a href="#charblocks">charblocks</a></li>
<li><a href="#charscripts">charscripts</a></li>
<li><a href="#blocks_versus_scripts">Blocks versus Scripts</a></li>
<li><a href="#matching_scripts_and_blocks">Matching Scripts and Blocks</a></li>
<li><a href="#code_point_arguments">Code Point Arguments</a></li>
<li><a href="#charinrange">charinrange</a></li>
<li><a href="#compexcl">compexcl</a></li>
<li><a href="#casefold">casefold</a></li>
<li><a href="#casespec">casespec</a></li>
<li><a href="#namedseq__"><code>namedseq()</code></a></li>
<li><a href="#unicode__ucd__unicodeversion">Unicode::UCD::UnicodeVersion</a></li>
<li><a href="#implementation_note">Implementation Note</a></li>
</ul>
<li><a href="#bugs">BUGS</a></li>
<li><a href="#author">AUTHOR</a></li>
</ul>
<!-- INDEX END -->
<hr />
<p>
</p>
<h1><a name="name">NAME</a></h1>
<p>Unicode::UCD - Unicode character database</p>
<p>
</p>
<hr />
<h1><a name="synopsis">SYNOPSIS</a></h1>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charinfo'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$charinfo</span> <span class="operator">=</span> <span class="variable">charinfo</span><span class="operator">(</span><span class="variable">$codepoint</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charblock'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$charblock</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="variable">$codepoint</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charscript'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$charscript</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="variable">$codepoint</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charblocks'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$charblocks</span> <span class="operator">=</span> <span class="variable">charblocks</span><span class="operator">();</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charscripts'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$charscripts</span> <span class="operator">=</span> <span class="variable">charscripts</span><span class="operator">();</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">qw(charscript charinrange)</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$range</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="variable">$script</span><span class="operator">);</span>
<span class="keyword">print</span> <span class="string">"looks like $script\n"</span> <span class="keyword">if</span> <span class="variable">charinrange</span><span class="operator">(</span><span class="variable">$range</span><span class="operator">,</span> <span class="variable">$codepoint</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'compexcl'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$compexcl</span> <span class="operator">=</span> <span class="variable">compexcl</span><span class="operator">(</span><span class="variable">$codepoint</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'namedseq'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$namedseq</span> <span class="operator">=</span> <span class="variable">namedseq</span><span class="operator">(</span><span class="variable">$named_sequence_name</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$unicode_version</span> <span class="operator">=</span> <span class="variable">Unicode::UCD::UnicodeVersion</span><span class="operator">();</span>
</pre>
<p>
</p>
<hr />
<h1><a name="description">DESCRIPTION</a></h1>
<p>The Unicode::UCD module offers a simple interface to the Unicode
Character Database.</p>
<p>
</p>
<h2><a name="charinfo">charinfo</a></h2>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charinfo'</span><span class="operator">;</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$charinfo</span> <span class="operator">=</span> <span class="variable">charinfo</span><span class="operator">(</span><span class="number">0x41</span><span class="operator">);</span>
</pre>
<p><code>charinfo()</code> returns a reference to a hash that has the following fields
as defined by the Unicode standard:</p>
<pre>
key              meaning</pre>
<pre>
code             code point with at least four hex digits
name             name of the character IN UPPER CASE
category         general category of the character
combining        classes used in the Canonical Ordering Algorithm
bidi             bidirectional category
decomposition    character decomposition mapping
decimal          if decimal digit, this is the integer numeric value
digit            if digit, this is the numeric value
numeric          if numeric, this is the integer or rational numeric value
mirrored         if mirrored in bidirectional text
unicode10        Unicode 1.0 name if existed and different
comment          ISO 10646 comment field
upper            uppercase equivalent mapping
lower            lowercase equivalent mapping
title            titlecase equivalent mapping</pre>
<pre>
block            block the character belongs to (used in \p{In...})
script           script the character belongs to
</pre>
<p>If no match is found, a reference to an empty hash is returned.</p>
<p>The <code>block</code> property is the same as that returned by <code>charblock()</code>. It is
not defined in the Unicode Character Database proper (Chapter 4 of the
Unicode 3.0 Standard, aka TUS3) but instead in an auxiliary database
(Chapter 14 of TUS3). The same holds for the <code>script</code> property, which
matches what <code>charscript()</code> returns.</p>
<p>Note that you cannot do (de)composition and casing based solely on the
above <code>decomposition</code>, <code>lower</code>, <code>upper</code>, and <code>title</code> properties;
you will also need the <code>compexcl()</code>, <code>casefold()</code>, and <code>casespec()</code> functions.</p>
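<p>As a minimal sketch of reading the hash described above, the following
looks up a few fields and guards against a code point with no match.
<code>describe_char()</code> is a hypothetical helper introduced for this
illustration; it is not part of Unicode::UCD.</p>
<pre>
use strict;
use warnings;
use Unicode::UCD 'charinfo';

# Hypothetical helper: build a one-line summary of a code point.
# Returns nothing if charinfo() has no data for it.
sub describe_char {
    my ($codepoint) = @_;
    my $info = charinfo($codepoint);

    # Guard against both an undefined result and an empty hash.
    return unless $info and %$info;

    return sprintf "U+%s %s (category %s)",
        $info->{code}, $info->{name}, $info->{category};
}

print describe_char(0x41), "\n";
</pre>
<p>The guard checks for an empty hash as documented here and also tolerates
an undefined return value, so the caller only has to test whether a summary
came back.</p>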
<p>
</p>
<h2><a name="charblock">charblock</a></h2>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charblock'</span><span class="operator">;</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$charblock</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="number">0x41</span><span class="operator">);</span>
<span class="keyword">my</span> <span class="variable">$charblock</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="number">1234</span><span class="operator">);</span>
<span class="keyword">my</span> <span class="variable">$charblock</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="string">"0x263a"</span><span class="operator">);</span>
<span class="keyword">my</span> <span class="variable">$charblock</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="string">"U+263a"</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$range</span> <span class="operator">=</span> <span class="variable">charblock</span><span class="operator">(</span><span class="string">'Armenian'</span><span class="operator">);</span>
</pre>
<p>With a <strong>code point argument</strong> <code>charblock()</code> returns the <em>block</em> the character
belongs to, e.g. <code>Basic Latin</code>. Note that not all the character
positions within all blocks are defined.</p>
<p>See also <a href="#blocks_versus_scripts">Blocks versus Scripts</a>.</p>
<p>If supplied with an argument that can't be a code point, <code>charblock()</code> tries
to do the opposite and interpret the argument as a character block name. The
return value is a <em>range</em>: a reference to an array of arrays, each containing a
<em>start-of-range</em>, <em>end-of-range</em> code point pair. You can test whether
a code point is in a range using the <a href="#charinrange">charinrange</a> function. If the
argument is not a known character block, <a href="../../lib/Pod/perlfunc.html#item_undef"><code>undef</code></a> is returned.</p>
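<p>The two directions of <code>charblock()</code> can be sketched as follows.
The block name <code>'No Such Block'</code> is a made-up value used only to
show the <code>undef</code> case.</p>
<pre>
use strict;
use warnings;
use Unicode::UCD 'charblock';

# Code point to block name.
my $block = charblock(0x41);
print "U+0041 is in block: $block\n";

# Block name to range: each entry starts with a start/end code point pair.
my $range = charblock('Armenian');
if (defined $range) {
    for my $pair (@$range) {
        my ($start, $end) = @$pair;
        printf "Armenian covers U+%04X..U+%04X\n", $start, $end;
    }
}

# A name that is not a known block (made up for this example) gives undef.
my $unknown = charblock('No Such Block');
print "unknown block\n" unless defined $unknown;
</pre>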
<p>
</p>
<h2><a name="charscript">charscript</a></h2>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charscript'</span><span class="operator">;</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$charscript</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="number">0x41</span><span class="operator">);</span>
<span class="keyword">my</span> <span class="variable">$charscript</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="number">1234</span><span class="operator">);</span>
<span class="keyword">my</span> <span class="variable">$charscript</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="string">"U+263a"</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$range</span> <span class="operator">=</span> <span class="variable">charscript</span><span class="operator">(</span><span class="string">'Thai'</span><span class="operator">);</span>
</pre>
<p>With a <strong>code point argument</strong> <code>charscript()</code> returns the <em>script</em> the
character belongs to, e.g. <code>Latin</code>, <code>Greek</code>, <code>Han</code>.</p>
<p>See also <a href="#blocks_versus_scripts">Blocks versus Scripts</a>.</p>
<p>If supplied with an argument that can't be a code point, <code>charscript()</code> tries
to do the opposite and interpret the argument as a character script name. The
return value is a <em>range</em>: a reference to an array of arrays, each containing a
<em>start-of-range</em>, <em>end-of-range</em> code point pair. You can test whether a
code point is in a range using the <a href="#charinrange">charinrange</a> function. If the
argument is not a known character script, <a href="../../lib/Pod/perlfunc.html#item_undef"><code>undef</code></a> is returned.</p>
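<p>A short sketch, along the lines of the SYNOPSIS, of combining
<code>charscript()</code> with <code>charinrange()</code>. The code point
U+0E01 is chosen here purely as an illustrative Thai letter.</p>
<pre>
use strict;
use warnings;
use Unicode::UCD qw(charscript charinrange);

# Script name to range.
my $range = charscript('Thai');

# Example code point (assumption for this sketch): THAI CHARACTER KO KAI.
my $codepoint = 0x0E01;

# Test membership only if the script name was recognised.
if (defined $range and charinrange($range, $codepoint)) {
    print "looks like Thai\n";
}

# Code point to script name.
print charscript(0x41), "\n";
</pre>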
<p>
</p>
<h2><a name="charblocks">charblocks</a></h2>
<pre>
<span class="keyword">use</span> <span class="variable">Unicode::UCD</span> <span class="string">'charblocks'</span><span class="operator">;</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$charblocks</span> <span class="operator">=</span> <span class="variable">charblocks</span><span class="operator">();</span>
</pre>
<p><code>charblocks()</code> returns a reference to a hash with the known block names
as the keys, and the code point ranges (see <a href="#charblock">charblock</a>) as the values.</p>
<p>See also <a href="#blocks_versus_scripts">Blocks versus Scripts</a>.</p>
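<p>Because <code>charblocks()</code> returns a hash reference, it has to be
dereferenced before iterating. A minimal sketch that lists every known block
with its code point ranges, sorted by block name:</p>
<pre>
use strict;
use warnings;
use Unicode::UCD 'charblocks';

my $charblocks = charblocks();

for my $name (sort keys %$charblocks) {
    # Each value is a range in the same form that charblock() returns.
    for my $pair (@{ $charblocks->{$name} }) {
        my ($start, $end) = @$pair;
        printf "%-40s U+%04X..U+%04X\n", $name, $start, $end;
    }
}
</pre>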
<p>