=head1 NAME
perlfaq9 - Networking ($Revision: 1.20 $, $Date: 1998/06/22 18:31:09 $)
=head1 DESCRIPTION
This section covers questions about networking, the Internet, and a few
about the web.
=head2 My CGI script runs from the command line but not the browser. (500 Server Error)
If you can demonstrate that you've read the following FAQs and that
your problem isn't something simple that can be easily answered, you'll
probably receive a courteous and useful reply to your question if you
post it on comp.infosystems.www.authoring.cgi (if it's something to do
with HTTP, HTML, or the CGI protocols). Questions that appear to be Perl
questions but are really CGI ones that are posted to comp.lang.perl.misc
may not be so well received.
The useful FAQs and related documents are:
    CGI FAQ
        http://www.webthing.com/page.cgi/cgifaq

    Web FAQ
        http://www.boutell.com/faq/

    WWW Security FAQ
        http://www.w3.org/Security/Faq/

    HTTP Spec
        http://www.w3.org/pub/WWW/Protocols/HTTP/

    HTML Spec
        http://www.w3.org/TR/REC-html40/
        http://www.w3.org/pub/WWW/MarkUp/

    CGI Spec
        http://www.w3.org/CGI/

    CGI Security FAQ
        http://www.go2net.com/people/paulp/cgi-security/safe-cgi.txt
=head2 How can I get better error messages from a CGI program?
Use the CGI::Carp module. It replaces C<warn> and C<die>, plus the
normal Carp module's C<carp>, C<croak>, and C<confess> functions, with
more verbose and safer versions. It still sends them to the normal
server error log.

    use CGI::Carp;
    warn "This is a complaint";
    die "But this one is serious";
The following use of CGI::Carp also redirects errors to a file of your
choice. It is placed in a BEGIN block so that compile-time warnings are
caught as well:

    BEGIN {
        use CGI::Carp qw(carpout);
        open(LOG, ">>/var/local/cgi-logs/mycgi-log")
            or die "Unable to append to mycgi-log: $!\n";
        carpout(*LOG);
    }
You can even arrange for fatal errors to go back to the client browser,
which is nice for your own debugging, but might confuse the end user.
    use CGI::Carp qw(fatalsToBrowser);
    die "Bad error here";
Even if the error happens before you get the HTTP header out, the module
will try to take care of this to avoid the dreaded server 500 errors.
Normal warnings still go out to the server error log (or wherever
you've sent them with C<carpout>) with the application name and date
stamp prepended.
=head2 How do I remove HTML from a string?
The most correct way (albeit not the fastest) is to use HTML::Parse
from CPAN (part of the libwww-perl distribution, which is a must-have
module for all web hackers).
Many folks attempt a simple-minded regular expression approach, like
C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags
may continue over line breaks, they may contain quoted angle-brackets,
or HTML comments may be present. Plus folks forget to convert
entities, like C<&lt;> for example.
Here's one "simple-minded" approach that works for most files:

    #!/usr/bin/perl -p0777
    s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
If you want a more complete solution, see the 3-stage striphtml
program:

    http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/striphtml.gz
Here are some tricky cases that you should think about when picking
a solution:

    <IMG SRC = "foo.gif" ALT = "A > B">

    <IMG SRC = "foo.gif"
         ALT = "A > B">

    <!-- <A comment> -->

    <script>if (a<b && a>c)</script>

    <# Just data #>

    <![INCLUDE CDATA [ >>>>>>>>>>>> ]]>
If HTML comments include other tags, those solutions would also break
on text like this:

    <!-- This section commented out.
        <B>You can't see me!</B>
    -->
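To see why quoted angle-brackets matter, the following sketch runs the
naive substitution and the quote-aware one from this section over one of
the tricky cases above. The sample string and variable names are made up
for illustration, and neither pattern copes with comments or scripts.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input: a tag whose attribute value contains ">".
my $html = qq{<IMG SRC = "foo.gif" ALT = "A > B">caption};

# Naive pattern: stops at the first ">" it finds, even inside quotes.
(my $naive = $html) =~ s/<.*?>//gs;

# Quote-aware pattern from this section: skips over quoted strings.
(my $aware = $html) =~ s/<(?:[^>'"]*|(['"]).*?\1)*>//gs;

print "naive: [$naive]\n";   # part of the tag is left behind
print "aware: [$aware]\n";   # only the text after the tag remains
```

The naive version leaves the tail of the ALT attribute in the output,
which is exactly the failure described above.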
=head2 How do I extract URLs?
A quick but imperfect approach is

    #!/usr/bin/perl -n00
    # qxurl - tchrist@perl.com
    print "$2\n" while m{
        < \s*
          A \s+ HREF \s* = \s* (["']) (.*?) \1
        \s* >
    }gsix;
This version does not adjust relative URLs, understand alternate
bases, deal with HTML comments, deal with HREF and NAME attributes in
the same tag, or accept URLs themselves as arguments. It also runs
about 100x faster than a more "complete" solution using the LWP suite
of modules, such as the
http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/xurl.gz
program.
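As a rough illustration of those limitations, here is the same pattern
wrapped in a function and run over a small in-memory page. The function
name C<extract_hrefs> and the sample markup are assumptions made for this
sketch; they are not part of qxurl or xurl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# extract_hrefs is a hypothetical helper wrapping the quick pattern.
sub extract_hrefs {
    my ($html) = @_;
    my @urls;
    while ($html =~ m{
        < \s*
          A \s+ HREF \s* = \s* (["']) (.*?) \1
        \s* >
    }gsix) {
        push @urls, $2;    # $2 is the text between the matching quotes
    }
    return @urls;
}

# Made-up sample page: two plain links, plus one tag that carries a
# NAME attribute before HREF, which the quick pattern does not match.
my $page = <<'END_HTML';
<a href="http://www.perl.com/">Perl</a>
<A HREF='docs/faq.html'>FAQ</A>
<a name="top" href="skipped.html">not matched</a>
END_HTML

print "$_\n" for extract_hrefs($page);
```

The relative URL comes back unchanged, and the third link is skipped
because HREF does not immediately follow the tag name.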
=head2 How do I download a file from the user's machine? How do I open a file on another machine?
In the context of an HTML form, you can use what's known as
B<multipart/form-data> encoding. The CGI.pm module (available from
CPAN) supports this in the start_multipart_form() method, which isn't
the same as the startform() method.
=head2 How do I make a pop-up menu in HTML?
Use the B<E<lt>SELECTE<gt>> and B<E<lt>OPTIONE<gt>> tags. The CGI.pm
module (available from CPAN) supports this widget, as well as many
others, including some that it cleverly synthesizes on its own.
=head2 How do I fetch an HTML file?
One approach, if you have the lynx text-based HTML browser installed
on your system, is this:
    $html_code = `lynx -source $url`;
    $text_data = `lynx -dump $url`;
The libwww-perl (LWP) modules from CPAN provide a more powerful way to
do this. They work through proxies, and don't require lynx:
    # simplest version
    use LWP::Simple;
    $content = get($URL);

    # or print HTML from a URL
    use LWP::Simple;
    getprint "http://www.sn.no/libwww-perl/";

    # or print ASCII from HTML from a URL
    use LWP::Simple;
    use HTML::Parse;
    use HTML::FormatText;
    my ($html, $ascii);
    $html = get("http://www.perl.com/");
    defined $html
        or die "Can't fetch HTML from http://www.perl.com/";
    $ascii = HTML::FormatText->new->format(parse_html($html));
    print $ascii;
=head2 How do I automate an HTML form submission?
If you're submitting values using the GET method, create a URL and encode
the form using the C<query_form> method:
    use LWP::Simple;
    use URI::URL;

    my $url = url('http://www.perl.com/cgi-bin/cpan_mod');
    $url->query_form(module => 'DB_File', readme => 1);
    $content = get($url);
If you're using the POST method, create your own user agent and encode
the content appropriately.
    use HTTP::Request::Common qw(POST);
    use LWP::UserAgent;

    $ua = LWP::UserAgent->new();
    my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod',
                   [ module => 'DB_File', readme => 1 ];
    $content = $ua->request($req)->as_string;
=head2 How do I decode or create those %-encodings on the web?
Here's an example of decoding:

    $string = "http://altavista.digital.com/cgi-bin/query?pg=q&what=news&fmt=.&q=%2Bcgi-bin+%2Bperl.exe";
    $string =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge;
Encoding is a bit harder, because you can't just blindly change
all the non-alphanumunder characters (C<\W>) into their hex escapes.
It's important that characters with special meaning like C</> and C<?>
I<not> be translated. Probably the easiest way to get this right is
to avoid reinventing the wheel and just use the URI::Escape module,
which is part of the libwww-perl package (LWP) available from CPAN.
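If you only want to see the idea, the following core-Perl sketch escapes
individual query values and leaves the separators alone. The helper names
C<escape_component> and C<unescape_component> are hypothetical (they are
not the URI::Escape interface), and the code assumes byte strings rather
than wide characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: escape one query value, never a whole URL.
sub escape_component {
    my ($text) = @_;
    # Escape everything except letters, digits, and - _ . ~
    $text =~ s/([^A-Za-z0-9\-_.~])/sprintf("%%%02X", ord($1))/ge;
    return $text;
}

# Hypothetical helper: reverse the escaping for a form-encoded value.
sub unescape_component {
    my ($text) = @_;
    $text =~ tr/+/ /;    # form-encoded query strings use "+" for a space
    $text =~ s/%([a-fA-F0-9]{2})/chr(hex($1))/ge;
    return $text;
}

# Only the value is escaped; "?", "&", and "=" keep their meaning.
my $value = "+cgi-bin +perl.exe";
my $query = "pg=q&q=" . escape_component($value);

print "$query\n";
print unescape_component("%2Bcgi-bin+%2Bperl.exe"), "\n";
```

Escaping each value separately is what keeps characters like C</> and
C<?> in the rest of the URL untouched.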
=head2 How do I redirect to another page?
Instead of sending back a C<Content-Type> header in your
reply, send back a C<Location:> header. Officially this should be a
C<URI:> header, so the CGI.pm module (available from CPAN) sends back
both:
    Location: http://www.domain.com/newpage
    URI: http://www.domain.com/newpage

Note that relative URLs in these headers can cause strange effects
because of "optimizations" that servers do.

    $url = "http://www.perl.com/CPAN/";
    print "Location: $url\n\n";
    exit;
To be correct to the spec, each of those C<"\n">
should really be C<"\015\012">, but unless you're
stuck on MacOS, you probably won't notice.
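A minimal sketch of the spec-conformant form follows. The helper name
C<redirect_header> is an assumption made for this example, and the check
for embedded line breaks is an extra precaution not discussed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# redirect_header is a hypothetical helper; it only builds the header text.
sub redirect_header {
    my ($url) = @_;
    # Refuse embedded line breaks so a URL cannot smuggle in extra headers.
    die "Bad redirect target\n" if $url =~ /[\015\012]/;
    # Explicit CR LF pairs: one ends the header, one ends the header block.
    return "Location: $url\015\012\015\012";
}

binmode STDOUT;    # keep the platform from rewriting the line endings
print redirect_header("http://www.perl.com/CPAN/");
```

Writing the octal escapes out avoids relying on what C<"\n"> happens to
mean on the platform the script runs on.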
=head2 How do I put a password on my web pages?
That depends. You'll need to read the documentation for your web
server, or perhaps check some of the other FAQs referenced above.
=head2 How do I edit my .htpasswd and .htgroup files with Perl?
The HTTPD::UserAdmin and HTTPD::GroupAdmin modules provide a
consistent OO interface to these files, regardless of how they're
stored. Databases may be text, dbm, Berkeley DB or any database with a
DBI compatible driver. HTTPD::UserAdmin supports files used by the
`Basic' and `Digest' authentication schemes. Here's an example:

    use HTTPD::UserAdmin ();
    HTTPD::UserAdmin
        ->new(DB => "/foo/.htpasswd")
        ->add($username => $password);
=head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things?
Read the CGI security FAQ, at
http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html, and the
Perl/CGI FAQ at
http://www.perl.com/CPAN/doc/FAQs/cgi/perl-cgi-faq.html.
In brief: use tainting (see L<perlsec>), which makes sure that data
from outside your script (e.g., CGI parameters) are never used in
C<eval> or C<system> calls. In addition to tainting, never use the
single-argument form of system() or exec(). Instead, supply the
command and its arguments as a list, which bypasses the shell, so
metacharacters in the arguments are never interpreted.
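The two habits can be combined roughly as follows. This is a sketch
under stated assumptions: C<safe_name> is a hypothetical validator, the
allowed character set is only an example, and the child command is Perl
itself (via C<$^X>) so that the snippet does not depend on any external
program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# safe_name is a hypothetical validator: accept only a conservative set
# of characters and return the captured value, which is how data gets
# untainted when the script runs under -T.
sub safe_name {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /\A([A-Za-z0-9_.-]{1,64})\z/ ? $1 : undef;
}

my $user_input = "report.txt; rm -rf /";   # pretend this came from a form
my $name = safe_name($user_input);

if (defined $name) {
    # List form: the program and each argument are passed directly,
    # with no shell in between to interpret metacharacters.
    system($^X, "-e", 'print "processing $ARGV[0]\n"', $name) == 0
        or die "command failed: $?\n";
} else {
    print "rejected input\n";
}
```

Validation decides what is acceptable; the list form makes sure that
whatever does get through is never handed to a shell.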
=head2 How do I parse a mail header?
For a quick-and-dirty solution, try this one, derived
from page 222 of the 2nd edition of "Programming Perl":