doing when you use them.</p>
<p><a name="INDEX-3982"></a>Sometimes, though, you can't tell how many arguments you're passing. If you supply these functions with an array<a href="#FOOTNOTE-2">[2]</a> that contains just one element, then it's just as though you passed one string in the first place, so the shell might be used. The solution is to pass an explicit path in the indirect-object slot:
<blockquote><pre class="programlisting">system @args;               # Won't call the shell unless @args == 1.
system { $args[0] } @args;  # Bypasses shell even with one-argument list.</pre></blockquote></p>
<blockquote class="footnote"><a name="FOOTNOTE-2"></a><p>[2] Or a function that produces a list.</p></blockquote>
<h3 class="sect2">23.1.1. Detecting and Laundering Tainted Data</h3>
<a name="INDEX-3983"></a><a name="INDEX-3984"></a><a name="INDEX-3985"></a><a name="INDEX-3986"></a><p>To test whether a scalar variable contains tainted data, you can use the following <tt class="literal">is_tainted</tt> function. It makes use of the fact that <tt class="literal">eval</tt> <em class="replaceable">STRING</em> raises an exception if you try to compile tainted data. It doesn't matter that the <tt class="literal">$nada</tt> variable used in the expression to compile will always be empty; it will still be tainted if <tt class="literal">$arg</tt> is tainted. The outer <tt class="literal">eval</tt> <em class="replaceable">BLOCK</em> isn't doing any compilation. It's just there to catch the exception raised if the inner <tt class="literal">eval</tt> is given tainted data.
Since the <tt class="literal">$@</tt> variable is guaranteed to be nonempty after each <tt class="literal">eval</tt> if an exception was raised and empty otherwise, we return the result of testing whether its length was zero:
<blockquote><pre class="programlisting">sub is_tainted {
    my $arg = shift;
    my $nada = substr($arg, 0, 0);  # zero-length
    local $@;                       # preserve caller's version
    eval { eval "# $nada" };
    return length($@) != 0;
}</pre></blockquote>
But testing for taintedness only gets you so far. Usually you know perfectly well which variables contain tainted data--you just have to clear the data's taintedness. The only official way to bypass the tainting mechanism is by referencing submatches returned by an earlier regular expression match.<a href="#FOOTNOTE-3">[3]</a> When you write a pattern that contains capturing parentheses, you can access the captured substrings through match variables like <tt class="literal">$1</tt>, <tt class="literal">$2</tt>, and <tt class="literal">$+</tt>, or by evaluating the pattern in list context. Either way, the presumption is that you knew what you were doing when you wrote the pattern and wrote it to weed out anything dangerous. So you need to give it some real thought--never blindly untaint, or else you defeat the entire mechanism.</p>
<blockquote class="footnote"><a name="FOOTNOTE-3"></a><p>[3] An unofficial way is by storing the tainted string as the key to a hash and fetching back that key. Because keys aren't really full SVs (the internal name for scalar values), they don't carry the taint property. This behavior may be changed someday, so don't rely on it. Be careful when handling keys, lest you unintentionally untaint your data and do something unsafe with them.</p></blockquote>
<p>It's better to verify that the variable contains only good characters than to check whether it contains any bad characters.
That's because it's far too easy to miss bad characters that you never thought of. For example, here's a test to make sure <tt class="literal">$string</tt> contains nothing but "word" characters (alphabetics, numerics, and underscores), hyphens, at signs, and dots:
<blockquote><pre class="programlisting">if ($string =~ /^([-\@\w.]+)$/) {
    $string = $1;                 # $string now untainted.
}
else {
    die "Bad data in $string";    # Log this somewhere.
}</pre></blockquote><a name="INDEX-3987"></a></p>
<p>This renders <tt class="literal">$string</tt> fairly secure to use later in an external command, since <tt class="literal">/\w+/</tt> doesn't normally match shell metacharacters, nor are those other characters going to mean anything special to the shell.<a href="#FOOTNOTE-4">[4]</a> Had we used <tt class="literal">/(.+)/s</tt> instead, it would have been unsafe because that pattern lets everything through. But Perl doesn't check for that. When untainting, be exceedingly careful with your patterns. Laundering data by using regular expressions is the <em class="emphasis">only</em> approved internal mechanism for untainting dirty data. And sometimes it's the wrong approach entirely. If you're in taint mode because you're running set-id and not because you intentionally turned on <span class="option">-T</span>, you can reduce your risk by forking a child of lesser privilege; see <a href="ch23_01.htm#ch23-sect-cuye">Section 23.1.2, "Cleaning Up Your Environment"</a>.</p>
<blockquote class="footnote"><a name="FOOTNOTE-4"></a><p>[4] Unless you were using an intentionally broken locale. Perl assumes that your system's locale definitions are potentially compromised.
Hence, when running under the <tt class="literal">use locale</tt> pragma, patterns with a symbolic character class in them, such as <tt class="literal">\w</tt> or <tt class="literal">[[:alpha:]]</tt>, produce tainted results.</p></blockquote>
<p>The <tt class="literal">use re 'taint'</tt> pragma disables the implicit untainting of any pattern matches through the end of the current lexical scope. You might use this pragma if you just want to extract a few substrings from some potentially tainted data, but since you aren't being mindful of security, you'd prefer to leave the substrings tainted to guard against unfortunate accidents later.</p>
<p>Imagine you're matching something like this, where <tt class="literal">$fullpath</tt> is tainted:
<blockquote><pre class="programlisting">($dir, $file) = $fullpath =~ m!(.*/)(.*)!s;</pre></blockquote>
By default, <tt class="literal">$dir</tt> and <tt class="literal">$file</tt> would now be untainted. But you probably didn't want to do that so cavalierly, because you never really thought about the security issues. For example, you might not be terribly happy if <tt class="literal">$file</tt> contained the string "<tt class="literal">; rm -rf *;</tt>", just to name one rather egregious example.
The following code leaves the two result variables tainted if <tt class="literal">$fullpath</tt> was tainted:
<blockquote><pre class="programlisting">{
    use re 'taint';
    ($dir, $file) = $fullpath =~ m!(.*/)(.*)!s;
}</pre></blockquote>
A good strategy is to leave submatches tainted by default over the whole source file and only selectively permit untainting in nested scopes as needed:
<blockquote><pre class="programlisting">use re 'taint';
# remainder of file now leaves $1 etc tainted
{
    no re 'taint';
    # this block now untaints re matches
    if ($num =~ /^(\d+)$/) {
        $num = $1;
    }
}</pre></blockquote>
Input from a filehandle or a directory handle is automatically tainted, except when it comes from the special filehandle, <tt class="literal">DATA</tt>. If you want to, you can mark other handles as trusted sources via the <tt class="literal">IO::Handle</tt> module's <tt class="literal">untaint</tt> function:
<blockquote><pre class="programlisting">use IO::Handle;
IO::Handle::untaint(*SOME_FH);      # Either procedurally
SOME_FH->untaint();                 # or using the OO style.</pre></blockquote>
Turning off tainting on an entire filehandle is a risky move. How do you <em class="emphasis">really</em> know it's safe? If you're going to do this, you should at least verify that nobody but the owner can write to the file.<a href="#FOOTNOTE-5">[5]</a> If you're on a Unix filesystem (and one that prudently restricts <em class="emphasis">chown</em>(2) to the superuser), the following code works:
<blockquote><pre class="programlisting">use File::stat;
use Symbol 'qualify_to_ref';

sub handle_looks_safe(*) {
    my $fh = qualify_to_ref(shift, caller);
    my $info = stat($fh);
    return unless $info;

    # owner neither superuser nor "me", whose
    # real uid is in the $< variable
    if ($info->uid != 0 && $info->uid != $<) {
        return 0;
    }

    # check whether group or other can write the file.
    # use 066 to check for readability as well
    if ($info->mode & 022) {
        return 0;
    }
    return 1;
}

use IO::Handle;
SOME_FH->untaint() if handle_looks_safe(*SOME_FH);</pre></blockquote>
We called <tt class="literal">stat</tt> on the filehandle, not the filename, to avoid a dangerous race condition. See <a href="ch23_02.htm#ch23-sect-hrc">Section 23.2.2, "Handling Race Conditions"</a> later in this chapter.</p>
<blockquote class="footnote"><a name="FOOTNOTE-5"></a><p>[5] Although you can untaint a directory handle, too, this function only works on a filehandle. That's because given a directory handle, there's no portable way to extract its file descriptor to <tt class="literal">stat</tt>.</p></blockquote>
<p>Note that this routine is only a good start. A slightly more paranoid version would check all parent directories as well, even though you can't reliably <tt class="literal">stat</tt> a directory handle. But if any parent directory is world-writable, you know you're in trouble whether or not there are race conditions.</p>
<p>Perl has its own notion of which operations are dangerous, but it's still possible to get into trouble with other operations that don't care whether they use tainted values. It's not always enough to be careful of input. Perl output functions don't test whether their arguments are tainted, but in some environments, this matters. If you aren't careful of what you output, you might just end up spitting out strings that have unexpected meanings to whoever is processing the output. If you're running on a terminal, special escape and control codes could cause the viewer's terminal to act strangely. If you're in a web environment and you blindly spit back out data that was given to you, you could unknowingly produce HTML tags that would drastically alter the page's appearance. Worse still, some markup tags can even execute code back on the browser.</p>
<p>Imagine the common case of a guest book where visitors enter their own messages to be displayed when others come calling.
A malicious guest could supply unsightly HTML tags or put in <tt class="literal">&lt;SCRIPT&gt;...&lt;/SCRIPT&gt;</tt> sequences that execute code (like JavaScript) back in the browsers of subsequent guests.</p>
<p>Just as you should carefully check for only good characters when inspecting tainted data that accesses resources on your own system, you should apply the same care in a web environment when presenting data supplied by a user. For example, to strip the data of any character not in the specified list of good characters, try something like this:
<blockquote><pre class="programlisting">$new_guestbook_entry =~ tr[_a-zA-Z0-9 ,./!?()@+*-][]dc;</pre></blockquote>
You certainly wouldn't use that to clean up a filename, since you probably don't want filenames with spaces or slashes, just for starters. But it's enough to keep your guest book free of sneaky HTML tags and entities. Each data-laundering case is a little bit different, so always spend time deciding what is and what is not permitted. The tainting mechanism is intended to catch stupid mistakes, not to remove the need for thought.</p>
<a name="INDEX-3988"></a><a name="INDEX-3989"></a><a name="INDEX-3990"></a><a name="ch23-sect-cuye"></a><h3 class="sect2">23.1.2. Cleaning Up Your Environment</h3>
<a name="INDEX-3991"></a><a name="INDEX-3992"></a><a name="INDEX-3993"></a><a name="INDEX-3994"></a><p>When you execute another program from within your Perl script, no matter how, Perl checks to make sure your <tt class="literal">PATH</tt> environment variable is secure. Since it came from your environment, your <tt class="literal">PATH</tt> starts out tainted, so if you try to run another program, Perl raises an "<tt class="literal">Insecure $ENV{PATH}</tt>" exception.
When you set it to a known, untainted value, Perl makes sure that each directory in that path is nonwritable by anyone other than the directory's owner and group; otherwise, it raises an "<tt class="literal">Insecure directory</tt>" exception.</p>
<p>You may be surprised to find that Perl cares about your <tt class="literal">PATH</tt> even when you specify the full pathname of the command you want to execute. It's true that with an absolute filename, the <tt class="literal">PATH</tt> isn't used to find the executable to run. But there's no reason to trust the program you're running not to turn right around and execute some <em class="emphasis">other</em> program and get into trouble because of the