Sending Chinese Text over SMS with Unicode Encoding
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
SMS is specified by ETSI (GSM 03.40 and GSM 03.38). SMS messages can be sent
and received in two ways: text mode or PDU (protocol data unit) mode. Text mode
is generally limited to plain ASCII text; to send pictures, ringtones, or
characters in other encodings (such as Chinese), PDU mode is
required.</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
In PDU mode, the message content can be encoded in one of three ways: 7-bit,
8-bit, or 16-bit. 7-bit encoding is used for ordinary ASCII text (strictly, the
GSM 7-bit default alphabet defined in GSM 03.38); 8-bit encoding is usually used
for data messages such as pictures and ringtones; 16-bit encoding is used for
Unicode (UCS-2) characters. A single message carries 140 bytes of user data, so
the maximum lengths under the three encodings are 160 characters, 140 bytes, and
70 characters respectively.</SPAN></FONT></P>
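<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
The limits of 160, 140 and 70 all follow from the 140-byte user-data field of a
single message. The arithmetic can be sketched in C as below; the name
sms_max_chars is made up for this illustration, and the sketch ignores
concatenated messages and user-data headers, which reduce the
capacity.</SPAN></FONT></P>

```c
#include <assert.h>

/* User-data capacity of one non-concatenated SMS, in octets. */
#define SMS_PAYLOAD_OCTETS 140

/*
 * Number of characters that fit in one message when every character
 * occupies bits_per_char bits (7, 8 or 16 for the three PDU encodings).
 * sms_max_chars is a hypothetical helper, not part of any SMS library.
 */
static int sms_max_chars(int bits_per_char)
{
    assert(bits_per_char > 0);
    return SMS_PAYLOAD_OCTETS * 8 / bits_per_char;
}
```

<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
Calling sms_max_chars(7), sms_max_chars(8) and sms_max_chars(16) gives 160, 140
and 70.</SPAN></FONT></P>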
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
To send Chinese (or Japanese, etc.), the Unicode encoding of PDU mode must be
used.</SPAN></FONT></P>
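<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
In a PDU string the 16-bit user data is written as hexadecimal digits, high byte
first. The following is a minimal sketch of that last step, assuming the text
has already been converted to 16-bit Unicode code units; the function name
ucs2_to_pdu_hex is invented for this example, and the rest of the PDU (service
centre address, destination number, data coding scheme, length fields) is not
shown.</SPAN></FONT></P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write each 16-bit code unit as four uppercase hex digits, high byte first.
 * Returns the number of hex characters written (excluding the terminating
 * NUL), or -1 if the output buffer is too small.
 * ucs2_to_pdu_hex is a hypothetical helper for illustration only.
 */
static int ucs2_to_pdu_hex(const unsigned short *units, size_t count,
                           char *out, size_t outsize)
{
    size_t i;

    assert(units != NULL && out != NULL);
    if (outsize < count * 4 + 1)
        return -1;

    for (i = 0; i < count; i++)
        sprintf(out + i * 4, "%04X", (unsigned)(units[i] & 0xFFFFu));
    out[count * 4] = '\0';

    return (int)(count * 4);
}
```

<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><FONT
size=3><SPAN lang=EN-US>
For the two code units 0x4E2D and 0x6587 this produces the string
4E2D6587.</SPAN></FONT></P>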
<P class=MsoNormal
style="MARGIN: 0cm 0cm 0pt; TEXT-INDENT: 18pt"><FONT
size=3><SPAN lang=EN-US>I recently worked on a project that sends and receives
SMS messages on Linux, and it had to handle Chinese text in both directions. I
had no prior experience with Chinese encodings or Unicode, so I read up on the
subject and asked questions on a few forums. I have written up what I learned
here in the hope that it helps anyone doing a similar project. These notes are
brief; for the PDU specification, see <A
href="http://www.ascend-tech.com.cn/sustain/SMS_PDU-mode.pdf">http://www.ascend-tech.com.cn/sustain/SMS_PDU-mode.pdf</A>,
or look on the Wavecom website.</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal
style="MARGIN: 0cm 0cm 0pt 18pt; TEXT-INDENT: -18pt"><B><SPAN
lang=EN-US style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体">1.<SPAN
style="FONT: 7pt 'Times New Roman'"> </SPAN>Converting GB2312 to
Unicode</SPAN></B></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt"><B><SPAN
lang=EN-US
style="FONT-SIZE: 12pt; FONT-FAMILY: 宋体"></SPAN></B> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>On Red Hat 7.3, Chinese text is stored in GB2312 by
default (this also applies to mixed Chinese and English text), so the first step
is to convert the GB2312 string to a Unicode string. GB2312 is a multi-byte
encoding: a Chinese character takes 2 bytes and an English character takes 1
byte, which is simply its ASCII code. (Note: I have not read the GB2312
specification closely; this understanding comes from practical development and I
cannot guarantee that it is correct.) The form of Unicode used in SMS, UCS-2, is
a double-byte encoding: every character takes 2 bytes. On Linux there are at
least three ways to convert GB2312 to Unicode:</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>1) Use the mbstowcs() function, which converts a
multi-byte string to wide characters. I tried it and it converted correctly, but
it may not be very reliable, one reason being that its result depends on the
current locale setting.</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>2) Use a GB2312-to-Unicode conversion table and look
each character up by hand. Such tables can be found online. Each character in
the GB2312 string has to be converted separately, depending on whether it is a
Chinese character or an English character.</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>3) Use the iconv() function. This is probably the
standard approach on Linux: it can convert not only GB2312 to Unicode but also
between any two encodings (provided the Linux system supports
them).</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>First, call iconv_open() to open a conversion
descriptor, specifying the source encoding and the target
encoding.</SPAN></FONT></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>Then call iconv() to do the conversion. Finally, call
iconv_close() to close the descriptor and release its
resources.</SPAN></FONT></P>
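<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>The three calls can be put together in one self-contained
function, sketched below under a few assumptions: the wrapper name
gb2312_to_ucs2be is made up for this example, the encoding names "GB2312" and
"UCS-2BE" are the ones glibc's iconv accepts (other systems may spell them
differently), and the whole input is converted in a single call rather than in
chunks.</SPAN></FONT></P>

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>

/*
 * Convert len bytes of GB2312 text in src to big-endian UCS-2 in dst.
 * Returns the number of bytes written to dst, or -1 on any failure
 * (unsupported encoding, invalid input, or dst too small).
 * gb2312_to_ucs2be is a hypothetical wrapper for illustration only.
 */
static long gb2312_to_ucs2be(const char *src, size_t len,
                             char *dst, size_t dstsize)
{
    iconv_t cd;
    char *pin = (char *)src;  /* iconv() wants char **, it does not alter the input bytes */
    char *pout = dst;
    size_t inleft = len;      /* both counters must be size_t, not int */
    size_t outleft = dstsize;
    size_t rc;

    assert(src != NULL && dst != NULL);

    /* Note the argument order: target encoding first, source encoding second. */
    cd = iconv_open("UCS-2BE", "GB2312");
    if (cd == (iconv_t)-1)
        return -1;

    /* iconv() advances pin/pout and decrements inleft/outleft as it goes. */
    rc = iconv(cd, &pin, &inleft, &pout, &outleft);
    iconv_close(cd);

    if (rc == (size_t)-1 || inleft != 0)
        return -1;
    return (long)(dstsize - outleft);
}
```

<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>Requesting "UCS-2BE" rather than plain "UCS-2" is a
deliberate choice in this sketch: it fixes the byte order to high byte first and
avoids any byte-order mark in the output.</SPAN></FONT></P>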
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
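<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><FONT
size=3><SPAN lang=EN-US>For comparison, the table-lookup approach in item 2) can
be sketched as follows. This is only an illustration: sample_table holds three
entries instead of a full GB2312 table, and the names gb_entry, lookup_gb and
gb2312_to_unicode are invented here. A byte below 0x80 is treated as ASCII and
copied as is; anything else is taken as the first byte of a two-byte Chinese
character.</SPAN></FONT></P>

```c
#include <assert.h>
#include <stddef.h>

/* One row of a GB2312 -> Unicode table (hypothetical layout). */
struct gb_entry {
    unsigned short gb;       /* two GB2312 bytes, first byte in the high half */
    unsigned short unicode;  /* matching Unicode code point */
};

/* Three sample rows only, sorted by gb; a real table has thousands of rows. */
static const struct gb_entry sample_table[] = {
    { 0xB0A1, 0x554A },
    { 0xCEC4, 0x6587 },
    { 0xD6D0, 0x4E2D },
};

/* Binary search over the sorted table; returns 0 when the code is absent. */
static unsigned short lookup_gb(unsigned short gb)
{
    size_t lo = 0;
    size_t hi = sizeof sample_table / sizeof sample_table[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (sample_table[mid].gb == gb)
            return sample_table[mid].unicode;
        if (sample_table[mid].gb < gb)
            lo = mid + 1;
        else
            hi = mid;
    }
    return 0;
}

/*
 * Convert a mixed ASCII/GB2312 byte string to 16-bit code units.
 * Returns the number of units written, or -1 on truncated input,
 * an unknown character, or an output buffer that is too small.
 */
static int gb2312_to_unicode(const unsigned char *in, size_t len,
                             unsigned short *out, size_t outcap)
{
    size_t i = 0, n = 0;

    assert(in != NULL && out != NULL);
    while (i < len) {
        unsigned short u;

        if (in[i] < 0x80) {              /* English: one byte, same value */
            u = in[i];
            i += 1;
        } else {                         /* Chinese: two bytes, look them up */
            if (i + 1 >= len)
                return -1;
            u = lookup_gb((unsigned short)((in[i] << 8) | in[i + 1]));
            if (u == 0)
                return -1;
            i += 2;
        }
        if (n >= outcap)
            return -1;
        out[n++] = u;
    }
    return (int)n;
}
```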
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><FONT size=3></FONT></SPAN> </P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><!--[if gte vml 1]><V:SHAPETYPE
id=_x0000_t202 path="m,l,21600r21600,l21600,xe" o:spt="202"
coordsize="21600,21600"><V:STROKE joinstyle="miter" /><V:PATH
o:connecttype="rect" gradientshapeok="t"
/></V:SHAPETYPE><V:SHAPE id=_x0000_s1025
style="MARGIN-TOP: 0px; Z-INDEX: 1; MARGIN-LEFT: 18pt; WIDTH: 387pt; POSITION: absolute; HEIGHT: 319.5pt; mso-position-vertical: absolute"
fillcolor="silver" type="#_x0000_t202"><V:TEXTBOX
style="mso-next-textbox: #_x0000_s1025">
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>#include <iconv.h></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><O:P></O:P></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>#define BUFLEN 200</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>char inbuf[BUFLEN];</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>char outbuf[BUFLEN];</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>char* pin = inbuf;</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>char* pout = outbuf;</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><O:P></O:P></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>/* … open the file and read the GB2312 data into inbuf; the data
length is len */</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US><O:P></O:P></SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN
lang=EN-US>size_t inleft = len; /* iconv() expects a size_t * here */</SPAN></P>
<P class=MsoNormal style="MARGIN: 0cm 0cm 0pt 18pt"><SPAN