<a href="http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html">http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html</a></pre>
<p>If you access that with a browser, you'll get a prompt
like
"Enter username and password for 'Unicode-MailList-Archives' at server
'www.unicode.org'".</p>
<p>In LWP, if you just request that URL, like this:</p>
<pre>
<span class="keyword">use</span> <span class="variable">LWP</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$browser</span> <span class="operator">=</span> <span class="variable">LWP::UserAgent</span><span class="operator">-></span><span class="variable">new</span><span class="operator">;</span>
</pre>
<pre>
<span class="keyword">my</span> <span class="variable">$url</span> <span class="operator">=</span>
<span class="string">'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html'</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$response</span> <span class="operator">=</span> <span class="variable">$browser</span><span class="operator">-></span><span class="variable">get</span><span class="operator">(</span><span class="variable">$url</span><span class="operator">);</span>
</pre>
<pre>
<span class="keyword">die</span> <span class="string">"Error: "</span><span class="operator">,</span> <span class="variable">$response</span><span class="operator">-></span><span class="variable">header</span><span class="operator">(</span><span class="string">'WWW-Authenticate'</span><span class="operator">)</span> <span class="operator">||</span> <span class="string">'Error accessing'</span><span class="operator">,</span>
<span class="comment"># (the 'WWW-Authenticate' header names the auth scheme and the realm)</span>
<span class="string">"\n "</span><span class="operator">,</span> <span class="variable">$response</span><span class="operator">-></span><span class="variable">status_line</span><span class="operator">,</span> <span class="string">"\n at $url\n Aborting"</span>
<span class="keyword">unless</span> <span class="variable">$response</span><span class="operator">-></span><span class="variable">is_success</span><span class="operator">;</span>
</pre>
<p>Then you'll get this error:</p>
<pre>
Error: Basic realm="Unicode-MailList-Archives"
401 Authorization Required
at <a href="http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html">http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html</a>
Aborting at auth1.pl line 9. [or wherever]</pre>
<p>...because the <code>$browser</code> doesn't know the username and password
for that realm ("Unicode-MailList-Archives") at that host
("www.unicode.org"). The simplest way to supply them
is to call the <code>credentials</code> method with a username and
password that the browser can try for that realm at that host. The syntax is:</p>
<pre>
<span class="variable">$browser</span><span class="operator">-></span><span class="variable">credentials</span><span class="operator">(</span>
<span class="string">'servername:portnumber'</span><span class="operator">,</span>
<span class="string">'realm-name'</span><span class="operator">,</span>
<span class="string">'username'</span> <span class="operator">=></span> <span class="string">'password'</span>
<span class="operator">);</span>
</pre>
<p>In most cases, the port number is 80, the default TCP/IP port for HTTP; and
you usually call the <code>credentials</code> method before you make any requests.
For example:</p>
<pre>
<span class="variable">$browser</span><span class="operator">-></span><span class="variable">credentials</span><span class="operator">(</span>
<span class="string">'reports.mybazouki.com:80'</span><span class="operator">,</span>
<span class="string">'web_server_usage_reports'</span><span class="operator">,</span>
<span class="string">'plinky'</span> <span class="operator">=></span> <span class="string">'banjo123'</span>
<span class="operator">);</span>
</pre>
<p>So if we add the following to the program above, right after the
<code>$browser = LWP::UserAgent->new;</code> line...</p>
<pre>
<span class="variable">$browser</span><span class="operator">-></span><span class="variable">credentials</span><span class="operator">(</span> <span class="comment"># add this to our $browser's "key ring"</span>
<span class="string">'www.unicode.org:80'</span><span class="operator">,</span>
<span class="string">'Unicode-MailList-Archives'</span><span class="operator">,</span>
<span class="string">'unicode-ml'</span> <span class="operator">=></span> <span class="string">'unicode'</span>
<span class="operator">);</span>
</pre>
<p>...then when we run it, the request succeeds, instead of causing the
<a href="../lib/Pod/perlfunc.html#item_die"><code>die</code></a> to be called.</p>
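<p>The first two arguments to <code>credentials</code> can usually be worked out from
the URL and from the 401 response itself. The following is a minimal sketch,
not part of the original program: it builds a stand-in 401 response by hand
(an assumption, so that it runs without network access), derives the
<code>'servername:portnumber'</code> string with the URI module's <code>host_port</code> method,
pulls the realm name out of the <code>WWW-Authenticate</code> header with a simple
regular expression, and then reads the stored pair back from the browser.</p>

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;
use URI;

my $url = 'http://www.unicode.org/mail-arch/unicode-ml/y2002-m08/0067.html';

# Assumption: a hand-built stand-in for the 401 response shown earlier,
# so that this sketch needs no network access.
my $response = HTTP::Response->new(401, 'Authorization Required');
$response->header('WWW-Authenticate' => 'Basic realm="Unicode-MailList-Archives"');

# 'servername:portnumber'; host_port fills in the scheme's default port.
my $netloc = URI->new($url)->host_port;

# The realm is the quoted value after realm= in the header.
my ($realm) = ($response->header('WWW-Authenticate') || '') =~ /realm="([^"]*)"/;

my $browser = LWP::UserAgent->new;
$browser->credentials($netloc, $realm, 'unicode-ml' => 'unicode');

# Called with only the first two arguments, credentials() returns what is stored.
my ($user, $password) = $browser->credentials($netloc, $realm);
print "Will try user '$user' for realm '$realm' at $netloc\n";
```

<p>The regular expression only handles the simple quoted-realm form shown in
the error output above; a server that sends several challenges in one header
would need more careful parsing.</p>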
<p>
</p>
<h2><a name="accessing_https_urls">Accessing HTTPS URLs</a></h2>
<p>When you access an HTTPS URL, it'll work for you just like an HTTP URL
would -- if your LWP installation has HTTPS support (via an appropriate
Secure Sockets Layer library). For example:</p>
<pre>
<span class="keyword">use</span> <span class="variable">LWP</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$url</span> <span class="operator">=</span> <span class="string">'https://www.paypal.com/'</span><span class="operator">;</span> <span class="comment"># Yes, HTTPS!</span>
<span class="keyword">my</span> <span class="variable">$browser</span> <span class="operator">=</span> <span class="variable">LWP::UserAgent</span><span class="operator">-></span><span class="variable">new</span><span class="operator">;</span>
<span class="keyword">my</span> <span class="variable">$response</span> <span class="operator">=</span> <span class="variable">$browser</span><span class="operator">-></span><span class="variable">get</span><span class="operator">(</span><span class="variable">$url</span><span class="operator">);</span>
<span class="keyword">die</span> <span class="string">"Error at $url\n "</span><span class="operator">,</span> <span class="variable">$response</span><span class="operator">-></span><span class="variable">status_line</span><span class="operator">,</span> <span class="string">"\n Aborting"</span>
<span class="keyword">unless</span> <span class="variable">$response</span><span class="operator">-></span><span class="variable">is_success</span><span class="operator">;</span>
<span class="keyword">print</span> <span class="string">"Whee, it worked! I got that "</span><span class="operator">,</span>
<span class="variable">$response</span><span class="operator">-></span><span class="variable">content_type</span><span class="operator">,</span> <span class="string">" document!\n"</span><span class="operator">;</span>
</pre>
<p>If your LWP installation doesn't have HTTPS support set up, then the
response will be unsuccessful, and you'll get this error message:</p>
<pre>
Error at https://www.paypal.com/
501 Protocol scheme 'https' is not supported
Aborting at paypal.pl line 7. [or whatever program and line]</pre>
<p>If your LWP installation <em>does</em> have HTTPS support installed, then the
response should be successful, and you should be able to consult
<code>$response</code> just like with any normal HTTP response.</p>
<p>For information about installing HTTPS support for your LWP
installation, see the helpful <em>README.SSL</em> file that comes in the
libwww-perl distribution.</p>
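<p>If you would rather find out about missing HTTPS support before making a
request, one option is to ask the browser object directly. This is a small
sketch using the <code>is_protocol_supported</code> method of LWP::UserAgent; the
variable name <code>$https_ok</code> and the wording of the messages are only
illustrative. On recent libwww-perl releases, HTTPS support typically comes
from having the separate LWP::Protocol::https module installed.</p>

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $browser = LWP::UserAgent->new;

# True only if a protocol handler for the scheme can be loaded.
my $https_ok = $browser->is_protocol_supported('https') ? 1 : 0;

print $https_ok
    ? "HTTPS support is available\n"
    : "HTTPS support is missing; https:// requests would fail with a 501\n";
```

<p>The check runs locally and makes no network request, so it is cheap to do
once at program start-up.</p>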
<p>
</p>
<h2><a name="getting_large_documents">Getting Large Documents</a></h2>
<p>When you're requesting a large (or at least potentially large) document,
a problem with the normal way of using the request methods (like
<code>$response = $browser->get($url)</code>) is that the response object in
memory will have to hold the whole document -- <em>in memory</em>. If the
response is a thirty megabyte file, this is likely to be quite an
imposition on this process's memory usage.</p>
<p>A notable alternative is to have LWP save the content to a file on disk,
instead of saving it up in memory. This is the syntax to use:</p>
<pre>
<span class="variable">$response</span> <span class="operator">=</span> <span class="variable">$ua</span><span class="operator">-></span><span class="variable">get</span><span class="operator">(</span><span class="variable">$url</span><span class="operator">,</span>
<span class="string">':content_file'</span> <span class="operator">=></span> <span class="variable">$filespec</span><span class="operator">,</span>
<span class="operator">);</span>
</pre>
<p>For example,</p>
<pre>
<span class="variable">$response</span> <span class="operator">=</span> <span class="variable">$ua</span><span class="operator">-></span><span class="variable">get</span><span class="operator">(</span><span class="string">'http://search.cpan.org/'</span><span class="operator">,</span>
<span class="string">':content_file'</span> <span class="operator">=></span> <span class="string">'/tmp/sco.html'</span>
<span class="operator">);</span>
</pre>
<p>When you use this <code>:content_file</code> option, the <code>$response</code> will have
all the normal header lines, but <code>$response->content</code> will be
empty.</p>
<p>Note that this ":content_file" option isn't supported under older
versions of LWP, so you should consider adding <code>use LWP 5.66;</code> to check
the LWP version, if you think your program might run on systems with
older versions.</p>
<p>If you need to be compatible with older LWP versions, then use
this syntax, which does the same thing:</p>
<pre>
<span class="keyword">use</span> <span class="variable">HTTP::Request::Common</span><span class="operator">;</span>
<span class="variable">$response</span> <span class="operator">=</span> <span class="variable">$ua</span><span class="operator">-></span><span class="variable">request</span><span class="operator">(</span> <span class="variable">GET</span><span class="operator">(</span><span class="variable">$url</span><span class="operator">),</span> <span class="variable">$filespec</span> <span class="operator">);</span>
</pre>
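<p>To see the effect of <code>:content_file</code> without downloading anything, here is a
self-contained sketch. It assumes a local <code>file:</code> URL as a stand-in for a
large remote document (LWP handles <code>file:</code> URLs through the same
<code>get</code> method); the temporary directory and the file names are made up for
the illustration.</p>

```perl
use strict;
use warnings;
use LWP::UserAgent;
use URI::file;
use File::Temp qw(tempdir);

# Assumption: a scratch directory holding a stand-in "large" document.
my $dir      = tempdir(CLEANUP => 1);
my $source   = "$dir/source.txt";
my $filespec = "$dir/saved.txt";

open my $out, '>', $source or die "Can't write $source: $!";
print {$out} 'x' x 100_000;
close $out or die "Can't close $source: $!";

my $url = URI::file->new($source)->as_string;   # a file:// URL for $source

my $ua = LWP::UserAgent->new;
my $response = $ua->get($url, ':content_file' => $filespec);
die "Error at $url\n ", $response->status_line, "\n Aborting"
    unless $response->is_success;

# The body went to disk; the response object keeps only status and headers.
my $saved_bytes = -s $filespec;
my $in_memory   = length $response->content;
print "Saved $saved_bytes bytes to $filespec; $in_memory bytes held in memory\n";
```

<p>With a real HTTP URL the call looks the same; only the value of <code>$url</code>
changes.</p>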
<p>
</p>
<hr />
<h1><a name="see_also">SEE ALSO</a></h1>
<p>Remember, this article is just the most rudimentary introduction to
LWP -- to learn more about LWP and LWP-related tasks, you really
must read from the following:</p>
<ul>
<li>
<p><a href="../lib/LWP/Simple.html">the LWP::Simple manpage</a> -- simple functions for getting/heading/mirroring URLs</p>
</li>
<li>
<p><a href="../lib/LWP.html">the LWP manpage</a> -- overview of the libwww-perl modules</p>
</li>
<li>
<p><a href="../lib/LWP/UserAgent.html">the LWP::UserAgent manpage</a> -- the class for objects that represent "virtual browsers"</p>
</li>
<li>
<p><a href="../lib/HTTP/Response.html">the HTTP::Response manpage</a> -- the class for objects that represent the response to
an LWP request, as in <code>$response = $browser->get(...)</code></p>
</li>
<li>
<p><a href="../lib/HTTP/Message.html">the HTTP::Message manpage</a> and <a href="../lib/HTTP/Headers.html">the HTTP::Headers manpage</a> -- classes that provide more methods
to HTTP::Response.</p>
</li>
<li>
<p><a href="../lib/URI.html">the URI manpage</a> -- class for objects that represent absolute or relative URLs</p>
</li>
<li>
<p><a href="../lib/URI/Escape.html">the URI::Escape manpage</a> -- functions for URL-escaping and URL-unescaping strings
(like turning "this & that" to and from "this%20%26%20that").</p>
</li>
<li>
<p><a href="../lib/HTML/Entities.html">the HTML::Entities manpage</a> -- functions for HTML-escaping and HTML-unescaping strings
(like turning "C. & E. Brontë" to and from "C. &amp; E. Bront&euml;")</p>
</li>
<li>
<p><a href="../lib/HTML/TokeParser.html">the HTML::TokeParser manpage</a> and <a href="../lib/HTML/TreeBuilder.html">the HTML::TreeBuilder manpage</a> -- classes for parsing HTML</p>
</li>
<li>
<p><a href="../lib/HTML/LinkExtor.html">the HTML::LinkExtor manpage</a> -- class for finding links in HTML documents</p>
</li>
<li>
<p>The book <em>Perl & LWP</em> by Sean M. Burke. O'Reilly & Associates, 2002.
ISBN: 0-596-00178-9. <code>http://www.oreilly.com/catalog/perllwp/</code></p>
</li>
</ul>
<p>
</p>
<hr />
<h1><a name="copyright">COPYRIGHT</a></h1>
<p>Copyright 2002, Sean M. Burke. You can redistribute this document and/or
modify it, but only under the same terms as Perl itself.</p>
<p>
</p>
<hr />
<h1><a name="author">AUTHOR</a></h1>
<p>Sean M. Burke <code>sburke@cpan.org</code></p>
</body>
</html>