.TH 2UTF 1 "27 September 2000" 2UTF "User Manuals"
.SH NAME
2UTF, fromUTF \- translates legacy char-sets to/from
.BR unicode (7),
decodes MIME messages
.SH SYNOPSIS
.RB [ 2UTF | fromUTF ]
.RI [ options ] " " [ charmap_file_or_alias ]
.BI < input " >" output
.SH DESCRIPTION
.LP
.B 2UTF
is a filter that converts legacy char-sets to
.BR unicode (7)
(UCS, Universal Character Set) and back where possible.
It can also display char-maps, the Linux console font and a range of
.BR unicode (7)
glyphs in
.BR utf\-8 (7)
encoding.
.B 2UTF
uses the
.BR iconv (3)
library, but it can still obtain char-maps for single-byte legacy char-sets
from tables found at
.B ftp://ftp.unicode.org/
or in the
.B wg15-locale
package database, or from other similar files with a user-defined format.
It can invoke external filters specified in the configuration file.
.PD 0
.PP
.I charmap_file_or_alias
is a pathname, alias or filename for the file with the char-map definition.
An alias or filename is converted to uppercase.
Aliases specified in
.B locales
char-maps are cached in a special file if directory and file permissions allow this.
If an exact match for the alias or filename isn't found, the
.I *alias*
or
.I *filename*
glob pattern is used (except for aliases found in the configuration file).
.PP
.B \-
(hyphen-minus) and
.B _
(low line, spacing underscore) characters in aliases are ignored.
.PP
When invoked without a char-map specified, a mail message is assumed
on standard input.
I do
.I not
provide any warranty and recommend that you back up your mail if you use
.B 2UTF
as an automatic mail filter.
See the
.I procmailrc
file in the
.I examples
subdirectory.
It should handle common text and multi-part types, MIME style encoded and
plain non-standard 8-bit headers.
Everything is converted to
.BR utf\-8 (7)
encoding.
Messages with MIME style
.BR pgp (1)
signatures should be passed untouched.
.PD 1
.SH OPTIONS
.TP
.B \-\-
Stops option checking for the rest of the command line.
.TP
.B \-2 "" \-\-UCS\-2 "" \-\-ucs\-2
Outdated option.
Outputs (or reads, if
.B \-\-reverse
is specified) 2-byte wide characters.
.TP
.B \-4 "" \-\-UCS\-4 "" \-\-ucs\-4
Outdated option.
Outputs (or reads, if
.B \-\-reverse
is specified) 4-byte wide characters.
.TP
.B \-8 "" \-\-UTF\-8 "" \-\-utf\-8
.B (default)
Outputs (or reads, if
.B \-\-reverse
is specified) multi-byte
.BR utf\-8 (7)
characters.
.TP
.BI \-c FILENAME " " " " \-\-charmap\-file= FILENAME
An alternative way to specify the filename or pathname of the char-map file.
.TP
.B \-C "" \-\-create\-aliases
Rescans all available char-map files and (re)creates the aliases file.
You must have write permission for this file.
.TP
\fB\-d\fR[\fIN\fR] \fB\-\-debug\fR[\fB=\fIN\fR]\fP
Outputs debugging info to stderr. Implies
.BR \-\-verbose .
.I N
is the debug level from 1 to 9. Default is 1.
.TP
.B \-e "" \-\-encode\-headers
Re-encodes the decoded RFC 2047 MIME words in headers. Can only be used with
.BR \-\-iconv=only .
.TP
\fB\-f\fR[\fIFORMAT\fR] \fB\-\-format\fR[\fB=\fIFORMAT\fR]\fP
.BR sscanf (3)
format string for reading the char-map file.
If not specified, the default format as output by
.B 2UTF \-h
(used by char-maps from
.B WG15
locales) is assumed.
Aliases specified in locale char-map files are recognized.
Lines beginning with
.B %
or
.B #
and lines not matching the format are ignored.
In case of duplicated lines the last line takes precedence.
For char-map files ending in
.IR .TXT ,
.I .X
or
.IR .x ,
the
.BR sscanf (3)
format string
" 0x%x 0x%X "
is always assumed.
This corresponds to the char-maps you can get from
.BR ftp://ftp.unicode.org/ .
.TP
.B \-o "" \-\-forward
.B (default)
Converts *to*
.BR unicode (7)
if invoked as
.BR 2UTF ,
and tries to convert *from*
.BR unicode (7)
if invoked as
.BR fromUTF .
.TP
.B \-H "" \-\-HTML "" \-\-html
This applies to approximations when converting from
.BR unicode (7).
Special HTML characters appearing after approximations are changed
to &lt; &gt; &quot; and &amp;.
.TP
.B \-h "" \-? "" \-\-? "" \-\-help "" \-help
Prints the program's version number, default parameters,
and a short usage message to the program's standard error output and exits.
.TP
.B \-i only "" \-\-iconv=only
Don't read the configuration file, don't use built-in charmap paths and use only
.BR iconv (3)
for conversion.
.TP
.B \-i first "" \-\-iconv=first
Attempt to use
.BR iconv (3)
before charmap files for conversion.
Internal approximations are always used when the output char-set is 'US-ASCII'.
.TP
.B \-i last "" \-\-iconv=last
Attempt to use charmap files before
.BR iconv (3)
for conversion.
.TP
.B \-l "" \-\-list\-charmaps
Lists char-maps and aliases currently in the aliases database, then exits.
This includes only char-maps usable by
.BR 2UTF .
.TP
.B \-p "" \-\-pathnames
Prints pathnames for the configuration file, the default compiled-in directories for
char-map files, the directories actually used for char-map files,
and the pathname of the aliases cache.
.TP
.B \-r "" \-\-reverse
Tries to convert back *from*
.BR unicode (7)
if invoked as
.BR 2UTF ,
and converts *to*
.BR unicode (7)
if invoked as
.BR fromUTF .
.TP
.B \-W "" \-\-show\-charmap
Outputs a table of char-map characters in
.BR utf\-8 (7)
encoding.
.B .
(period) is substituted for 0x0000-0x001F and 0x007F.
.B ?
(question mark) is substituted for undefined characters.
.TP
.B \-S "" \-\-spit\-glyphs
Outputs a table of characters in
.BR utf\-8 (7)
encoding at the F000-F1FF
.BR unicode (7)
private use area.
This corresponds to the current console font in Linux.
.TP
\fB\-S\fR[\fImin\fR]\fR[\fB\-\fR]\fR[\fImax\fR] \fB\-\-spit\-glyphs\fB=\fR[\fImin\fR]\fR[\fB\-\fR]\fR[\fImax\fR]
Outputs a table of characters in
.BR utf\-8 (7)
encoding at the given range.
.I min
and
.I max
are
.BR unicode (7)
hex numbers from 0 to 7FFFFFFF.
.I min
defaults to 0.
.I max
defaults to
.I min
+ 511 if
.B \-
is present.
.TP
.B \-s "" \-\-switch\-to\-UTF\-8
Tries to switch to
.BR utf\-8 (7)
mode by writing <ESC>%G to the program's standard error output. Use
.B echo -ne '\\\033%@'
to switch back if required. This doesn't work on all terminals.
.TP
\fB\-u\fR[\fIX\fR] \fB\-\-unknown\-char\fR[\fB=\fIX\fR]\fP
Outdated option.
Substitutes
.I X
for unknown single-byte characters and errors.
If
.I X
isn't specified, the default character as output by
.B 2UTF \-h
is assumed.
.I X
can be a single character or a hex (0x80), octal (0200) or
decimal (128) number. This can be useful when translating
to a single-byte encoding.
.TP
.B \-v "" \-\-verbose
Verbose mode.
.TP
.B \-V "" \-\-version
Shows the program's version and some copyright information.
.PP
The rightmost option or alias takes precedence. Long options may be abbreviated.
Short options may be grouped.
.LP
The default
.BR unicode (7)
character for errors and unknown characters is 0xFFFD.
Approximations can be performed if the conversion is from Unicode to single-byte legacy
char-sets. US-ASCII strings of up to 4 bytes are substituted for characters
undefined in the output char-set.
These strings are defined at compile time.
.SH EXAMPLES
.PP
To view an ISO_8859-3:1988 document use:
.PP
.BI "2UTF --verbose --switch-to-UTF-8 8859-3 <" document " | less -r"
.PP
To translate from CP1257 to BALTIC (ISO-IR-179) use:
.PP
.BI "2UTF -2 1257 <" cp1257_file " | fromUTF -2 baltic >" baltic_charset_file
.PP
To call a BBS using the 869 "code page" use:
.PP
.B minicom -l -t linux |2UTF --switch-to-UTF-8 IBM869
.PP
To convert everything from UTF-8 to US-ASCII:
.PP
.BI "fromUTF us-ascii <" UTF-8_file
.PP
See also the
.I examples
subdirectory.
.SH FILES
There can be a self-explanatory configuration file
.IR 2UTF.config .
It is searched for in /usr/local/etc/, /usr/etc/, /etc/ or other directories
defined at compile time.
The configuration file can specify directory names for char-map files
and external filters for conversion to and from other legacy char-sets and
encodings not supported by
.BR iconv (3).
.SH "SEE ALSO"
.BR 2UTF (1),
.BR iconv (1),
.BR iconv (3),
.BR tcs (1),
.B recode
info page
.PP
.B Yudit
editor and converter at
.B http://www.yudit.org/
.PP
.B ftp://ftp.cnd.org/pub/ifcss.org/software/unix/convert/
.PP
.RB ' trans '
program at
.B ftp://ftp.funet.fi/pub/doc/charsets/
.PP
On BSD systems:
.BR utf2 (4),
.BR multibyte (3)
.PP
On Linux:
.BR unicode (7),
.BR utf\-8 (7),
.BR console_codes (4),
.BR charsets (4)
.PP
Look at
.B ftp://ftp.unicode.org/
and
.B ftp://dkuug.dk/i18n/WG15-collection/charmaps/
for char-map files.
.SH URL
.PD 0
.PP
http://x-lt.richard.eu.org/me/rch/ll.html#2UTF
.PD 1
.SH BUG REPORTS
.PD 0
.PP
Please send bug reports, comments and suggestions to:
.\" .RS
.PP
.B Ricardas Cepas <rch@richard.eu.org>
or
.B <rch@WriteMe.Com>
.PD 1
.SH BUGS
.PD 0
.PP
Due to the
.BR popen (3)
bug in older Linux glibc versions, non-existent commands in the configuration file are not detected.
So please check the configuration file by hand.
.PP
Transformation from UTF-8 can be slow.
.PP
Characters can be lost if the char-map files used are incomplete.
.PP
Reverse transformation is not perfect.
.PP
See also the
.I TO-DO
file.
.PP
Please use at
.B your own risk only.
.SH COPYING
This program (including this man page) is distributed under
.B BSD style license
(see
.I BSD_style_license
file in the documentation directory)
or
.B GNU General Public License V2
except
.IR hdr.h ", " plan9.h " and " utf.c
files (if used) from
.BR tcs (1),
public domain code from
.I mimedecode.c
file. Copyright statements should be
kept unchanged.
.PP
For the copyright information see file
.IR copyright .
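.SH NOTES
.PP
The char-map rules given under
.B \-\-format
(two hex numbers per line, lines beginning with
.B %
or
.B #
and non-matching lines ignored, last duplicate wins) can be illustrated
with a short Python sketch.
It is not taken from the 2UTF sources.
The function names, the sample table and the column order (legacy byte first,
Unicode code point second, as in the ftp.unicode.org tables) are assumptions
made for illustration only.
.PP
.nf
```python
# Illustrative sketch only, not 2UTF code. All names here are hypothetical.

SAMPLE = [
    "# comment lines start with # or %",
    "0x41 0x0041  # LATIN CAPITAL LETTER A",
    "0x80 0x20AC  # EURO SIGN",
    "0x81         # UNDEFINED, so it does not match the format",
    "0x41 0x0042  # duplicate: the last line wins",
]


def parse_pair(line):
    """Mimic a ' 0x%x 0x%X ' scan: two 0x-prefixed hex numbers, or None."""
    fields = line.split()
    if len(fields) < 2:
        return None
    if not all(f.lower().startswith("0x") for f in fields[:2]):
        return None
    try:
        return int(fields[0], 16), int(fields[1], 16)
    except ValueError:
        return None


def parse_charmap(lines):
    """Build {legacy_byte: code_point}; column order is an assumption."""
    table = {}
    for line in lines:
        if line.startswith(("%", "#")):
            continue  # comment lines are ignored
        pair = parse_pair(line)
        if pair is None:
            continue  # lines not matching the format are ignored
        table[pair[0]] = pair[1]  # a later duplicate replaces an earlier one
    return table


def to_utf8(data, table, unknown=0xFFFD):
    """Map each legacy byte to a code point, 0xFFFD if unmapped, emit UTF-8."""
    return "".join(chr(table.get(b, unknown)) for b in data).encode("utf-8")


if __name__ == "__main__":
    print(to_utf8(bytes([0x41, 0x80, 0x81]), parse_charmap(SAMPLE)))
```
.fi
.PP
The fallback value 0xFFFD matches the default character for errors and
unknown characters documented above.
How 2UTF itself reads and stores its tables may differ.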