<?label 9.2. Email Addresses?><html><head><title>Email Addresses (CGI Programming with Perl)</title><link href="../style/style1.css" type="text/css" rel="stylesheet" /><meta name="DC.Creator" content="Scott Guelich, Gunther Birznieks and Shishir Gundavaram" /><meta scheme="MIME" content="text/xml" name="DC.Format" /><meta content="en-US" name="DC.Language" /><meta content="O'Reilly & Associates, Inc." name="DC.Publisher" /><meta scheme="ISBN" name="DC.Source" content="1565924193L" /><meta name="DC.Subject.Keyword" content="stuff" /><meta name="DC.Title" content="CGI Programming with Perl" /><meta content="Text.Monograph" name="DC.Type" /></head>
<body bgcolor="#ffffff"><img src="gifs/smbanner.gif" alt="Book Home" usemap="#banner-map" border="0" /><map name="banner-map"><area alt="CGI Programming with Perl" href="index.htm" coords="0,0,466,65" shape="rect" /><area alt="Search this book" href="jobjects/fsearch.htm" coords="467,0,514,18" shape="rect" /></map>
<div class="navbar"><table border="0" width="515"><tr><td width="172" valign="top" align="left"><a href="ch09_01.htm"><img src="../gifs/txtpreva.gif" alt="Previous" border="0" /></a></td><td width="171" valign="top" align="center"><a href="index.htm">CGI Programming with Perl</a></td><td width="172" valign="top" align="right"><a href="ch09_03.htm"><img src="../gifs/txtnexta.gif" alt="Next" border="0" /></a></td></tr></table></div>
<hr align="left" width="515" />
<h2 class="sect1">9.2. Email Addresses</h2>
<p>Part <a name="INDEX-1830" /><a name="INDEX-1831" /><a name="INDEX-1832" />of handling mail includes handling email addresses. Collecting email addresses from users seems to be part of almost any <a name="INDEX-1833" />registration form on the Web.<a href="#FOOTNOTE-17">[17]</a> You may wonder how you can know whether an email address entered into a form is <a name="INDEX-1834" />valid. The simple answer, of course, is that you can't.
You can validate that the email address is syntactically valid (although this is considerably more difficult than you might expect), but you cannot know whether the email address actually corresponds to a valid account or not.</p>
<blockquote><a name="FOOTNOTE-17" /><p>[17] This isn't necessarily a good thing. Many sites have adopted the common practice of requiring an email address for accessing otherwise free services. These sites often allow the user to check a checkbox to be exempted from mass mailings, but if this is optional, then why is entering an email address not optional? If you are asked to create forms like this, please ask yourself and your sponsors why you are collecting private information. If you have a good reason, then explain it on your registration form. If not, then there is no reason to collect more than you need; user privacy should not be an afterthought.</p></blockquote>
<p>You may think you should be able to make a query to an <a name="INDEX-1835" />SMTP server to check whether an email address is valid or not. In fact, the <a name="INDEX-1836" />SMTP protocol supports a command (VRFY) to validate an email address. Unfortunately, it cannot be relied on in practice. There are two problems.</p>
<p>The first problem is that the SMTP server responsible for handling the mail for that email address may not always be accessible. There may be intermediate network outages, and even when the network is fine, mail servers are frequently overloaded and may refuse requests. These failures are not typically a problem for Internet mail because other mail servers trying to deliver to them maintain queues of messages and retry several times, often for days, before giving up. However, if you need immediate verification, the mail server may not be available to give it to you.</p>
<p>The second problem is that even when the final SMTP server is available, it may not provide reliable information.
Many SMTP servers simply gateway messages to an internal mail system, which may speak another protocol and be located on another network. Because of this, one of these SMTP gateways may not know which email addresses are valid on the other network; it may simply be set up to forward all Internet mail. Therefore, when this SMTP server is asked to verify an email address, it may state that any email address addressed to its domain is deliverable, whether it is or not.</p>
<p>The best that you can do if you need to validate an email address is to send an actual email to that address and ask the user to respond. We will look at ways to write scripts to respond to email later in this chapter. For now, let's look at how to recognize syntactically valid email addresses.</p>
<a name="ch09-4-fm2xml" /><div class="sect2"><h3 class="sect2">9.2.1. Validating Syntax</h3>
<p>A <a name="INDEX-1837" /><a name="INDEX-1,838" /><a name="INDEX-1839" />common question that new CGI developers ask is what the <a name="INDEX-1840" />regular expression for matching email addresses looks like. If you ask around, some people will refer you to a book called <em class="citetitle">Mastering Regular Expressions</em> by Jeffrey Friedl (O'Reilly &amp; Associates, Inc.). Others might give you a simple expression that checks for "@" and that checks that the domain name ends in a dot and two or three letters. In fact, neither of these answers is fully accurate.</p>
<p>To understand why, let's review a little history. The standard document for defining email address names is RFC 822. It was published in 1982. Does that seem like a long time ago to you? It should. The Internet was radically different then. In fact, it wasn't called the Internet then -- it was a collection of many different networks, including ARPAnet, Bitnet, and CSNET, each with its own naming conventions. TCP/IP was being introduced as a new networking protocol, and hosts numbered only in the hundreds.
It wasn't until 1983 that serious work began on implementing domain name servers. The hierarchical names that we recognize today like <em class="emphasis">www.oreilly.com</em> did not exist back then.</p>
<p>So that is half of the story. The other half of the story is that Jeffrey Friedl, in his book <em class="citetitle">Mastering Regular Expressions</em>, tackled creating a regular expression to handle the parsing of RFC 822 email addresses. The book is the best reference for understanding regular expressions in Perl or any other context. Many people cite the regular expression he constructs as the only definitive test of whether an Internet email address is valid. But unfortunately these people have misunderstood what it does; it tests for compliance with RFC 822. According to RFC 822, these are all syntactically valid email addresses:</p>
<blockquote><pre class="code">Alfred Neuman &lt;Neuman@BBN-TENEXA&gt;
":sysmail"@  Some-Group. Some-Org
Muhammed.(I am  the greatest) Ali @(the)Vegas.WBA</pre></blockquote>
<p>Do any of them look like the type of email address you'd want to capture in an HTML form? It is true that RFC 822 has not been superseded by another RFC and is still a standard, but it is equally true that the problem we are trying to solve is radically different in time and context from the problem that it solved in 1982.</p>
<p>We want an expression to recognize a syntactically valid email address as required on the Internet today. We are interested only in today's standard Internet domain-naming convention. That would actually rule out all of the above addresses, since none of them end in one of our current top-level domains (<em class="emphasis">.com</em>, <em class="emphasis">.net</em>, <em class="emphasis">.edu</em>, <em class="emphasis">.uk</em>, etc.).
There are other important distinctions.</p>
<p>The first example is a full email address including a name and what RFC 822 refers to as the <em class="firstterm">address specification</em><a name="INDEX-1841" /> in angle brackets. You may have seen this expanded syntax in your email software. We do not need, and probably don't want, this additional information in an email address captured in a form. In all likelihood, the user's name is being captured separately in other fields. When we need to validate an email address that a user has entered, we are generally only interested in the address specification itself. So henceforth when we refer to an email address, we are simply referring to this address specification, the <em class="emphasis">user@hostname</em> part.</p>
<p>The second example contains a quoted element (any group of characters separated by a "." or a "@" we will refer to as an <em class="firstterm">element</em> <a href="#FOOTNOTE-18">[18]</a>). Quoted <a name="INDEX-1842" /><a name="INDEX-1843" />elements are completely acceptable and still work fine on today's Internet. If you want to accept
