wwwtc6answer.htm
来自「C++builder学习资料C++builder」· HTM 代码 · 共 979 行 · 第 1/3 页
HTM
979 行
<P>
<TABLE WIDTH="75%">
<TR>
<TD VALIGN="top">
<IMG SRC="images/exclamation.gif" ALT="Tip" BORDER=0 HSPACE="0" ALIGN="top" width="32" height="48">
</TD>
<TD valign="top">
<b>Tip:</b>
<hr size = 1>
Never blindly double-increment an iterator. When you must
increment an iterator, be sure it is not equal to <TT>end()</TT>.
<hr size = 1>
</TD>
</TR>
</TABLE>
<P>
<B>Question #8:</B>
</P>
<pre>
std<b>:</b><b>:</b>mismatch<b>(</b>v1<b>.</b>begin<b>(</b><b>)</b><b>,</b> v2<b>.</b>begin<b>(</b><b>)</b><b>,</b> v2<b>.</b>end<b>(</b><b>)</b><b>)</b><b>;</b>
</pre>
<P>
Something is wrong with this call that has undefined runtime behavior
(though you can reasonably expect a crash.) What?
</P>
<P>
<B>Answer #1:</B>
</P>
<P>
<TT>std::mismatch</TT> scans two sequences to find the first element that is
not equal. It takes a range for one of the sequences, and a
starting point for the second sequence. It assumes that the second
sequence is at least as long as the first.
<P>
</P>
The functionality of mismatch is really irrelevant to the problem
with the code though. The argument list in the above code is where
you will find the problem. The programmer is passing in the wrong
iterators in the wrong places. A parameter mismatch, if you like
puns.
<P>
<TT>std::mismatch</TT> is declared like this:
</P>
<pre>
<b>template</b> <<b>class</b> InputIterator1<b>,</b> <b>class</b> InputIterator2<b>></b>
pair<InputIterator1<b>,</b> InputIterator2<b>></b> mismatch<b>(</b>InputIterator1 first1<b>,</b>
InputIterator1 last1<b>,</b>
InputIterator2 first2<b>)</b><b>;</b>
</pre>
<P>
If the template syntax is distracting you, just look at the
parameters the function takes: three input iterators. <TT>first1</TT> and
<TT>last1</TT> specify a range in the first sequence, and <TT>first2</TT> is an
iterator to the beginning of the second sequence. That is, the first
two arguments form a range of one sequence, the third argument is
the start of another sequence.
</P>
<P>
So what's wrong with the original code? The programmer thought that
the first argument was the start of a sequence, the second parameter
was the start of the second sequence, and the third parameter was
the end of the second sequence. Passing the iterators in the wrong
order has some very serious consequences:
</P>
<table>
<tr>
<td valign="top"> <B>1)</B> </TD>
<td>The range specified is invalid. Iterating through the range that
the programmer provided would probably cause a memory fault or a
program crash. Actually, it's undefined, but undefined can be
translated to mean "bad things will happen." Usually a crash.
A range specified by a pair of iterators has some rules:
The range specified must be two iterators in the SAME sequence,
and the first iterator must be capable of reaching the second
iterator by repeated application of <TT>operator++</TT>.
</td></tr>
<tr>
<td valign="top"> <B>2)</B> </TD>
<td>
Because the third parameter is expected to contain the iterator
corresponding to the beginning of the second sequence, the
mismatch algorithm will start one past the end, and increment
from there, a recipe for disaster.
</td>
</tr>
</table>
<P>
Unfortunately, the code compiles without any problems. Here is the correct
way to call the function:
</P>
<pre>
V <b>*</b> seq1<b>;</b>
V <b>*</b> seq2<b>;</b>
<b>if</b> <b>(</b>v1<b>.</b>size<b>(</b><b>)</b> < v2<b>.</b>size<b>(</b><b>)</b><b>)</b>
<b>{</b>
seq1 <b>=</b> <b>&</b>v1<b>;</b>
seq2 <b>=</b> <b>&</b>v2<b>;</b>
<b>}</b>
<b>else</b>
<b>{</b>
seq1 <b>=</b> <b>&</b>v2<b>;</b>
seq2 <b>=</b> <b>&</b>v1<b>;</b>
<b>}</b>
std<b>:</b><b>:</b>mismatch<b>(</b>seq1<b>-></b>begin<b>(</b><b>)</b><b>,</b> seq1<b>-></b>end<b>(</b><b>)</b><b>,</b> seq2<b>-></b>begin<b>(</b><b>)</b><b>)</b><b>;</b>
</pre>
<P>
The extra work done before the call to mismatch is simply setting up
intermediate pointers to the lists, such that we can guarantee that
the second sequence is at least as large as the first. This way we
never walk past the end of a sequence.
</p>
<P>
<TABLE WIDTH="75%">
<TR>
<TD VALIGN="top">
<IMG SRC="images/exclamation.gif" ALT="Tip" BORDER=0 HSPACE="0" ALIGN="top" width="32" height="48">
</TD>
<TD valign="top">
<b>Tip:</b>
<hr size = 1>
There is no way for the compiler to detect a problem when the
programmer gives the iterators in the wrong order. Therefore when
you use STL algorithms, slow down just a bit and make sure you're
using them correctly, and passing in the arguments in the correct
order. Check the documentation (or header files!) to make SURE that
what you think the algorithms expect are really what you provide.
<hr size = 1>
</TD>
</TR>
</TABLE>
<P>
<TABLE WIDTH="75%">
<TR>
<TD VALIGN="top">
<IMG SRC="images/exclamation.gif" ALT="Tip" BORDER=0 HSPACE="0" ALIGN="top" width="32" height="48">
</TD>
<TD valign="top">
<b>Tip:</b>
<hr size = 1>
If a program is crashing in an STL algorithm, always think
of potential problems with iterators first: could they have been
invalidated? Are the ranges correct? Are my arguments to functions
in the correct order? When applicable, is the second sequence long
enough (for example, <TT>mismatch()</TT>, <TT>equal()</TT>, <TT>search()</TT>, etc)?
<hr size = 1>
</TD>
</TR>
</TABLE>
<P>
<B>Question #9:</B>
</P>
<pre>
<font color="navy">// call do_something_with() for each item in v1 that is not even</font>
<b>extern</b> <b>void</b> do_something_with<b>(</b>A<b>*</b><b>)</b><b>;</b>
<b>void</b> f<b>(</b><b>)</b>
<b>{</b>
V<b>:</b><b>:</b>iterator i <b>=</b> v1<b>.</b>begin<b>(</b><b>)</b><b>;</b>
<b>while</b> <b>(</b>i <b>!=</b> v1<b>.</b>end<b>(</b><b>)</b><b>)</b>
<b>{</b>
<b>if</b> <b>(</b><b>(</b><b>*</b>i<b>)</b><b>-></b>is_even<b>(</b><b>)</b><b>)</b>
<b>{</b>
<font color="navy">// skip this item</font>
<b>++</b>i<b>;</b>
<b>}</b>
do_something_with<b>(</b><b>*</b>i<b>)</b><b>;</b>
i<b>++</b><b>;</b>
<b>}</b>
<b>}</b>
</pre>
<P>
What bugs are in this code? It not only contains logic bugs (it
doesn't do what the documentation (comment at top) says), but it
has iterator problems as well. How many can you find?
</P>
<P>
<B>Answer #1:</B>
</P>
<P>
Hopefully, from the earlier problems, this one is a piece of cake.
Do you see the problem? Two things already discussed should jump
out, one big, one small. Plus, though perhaps a little subtle,
there is a logic bug lurking in there as well, but fixing the first
two problems will also fix the logic bug for free!
</P>
<P>
First the small problem. <TT>i++</TT> by itself is inefficient. It should
be <TT>++i</TT> or we are wasting CPU cycles. This is especially important in
loops, where the overhead is multiplied by the number of
iterations. This was discussed earlier in item #5.
<P>
</p>
The bigger problem is that, once again, we are doing a blind double-increment, without
checking that <TT>i</TT> is not already equal to <TT>end()</TT> after the first increment.
(I say "blind" because it is not checking for the problem, assuming the problem cannot (or will not)
happen.)
<P>
</p>
The logic bug also exists in the "skip even" logic, also related to
the blind-double-increment. The problem is that if we check the
very last element for evenness, and it is, we advance the iterator
to one-past-the-end. Since we blindly increment it again, the
iterator is now one-past the one-past-end element, and program
behavior is undefined. This is the danger of double increments: you
can accidentally jump "over" the loop termination condition, and then
practically enter an infinite loop (at least until your program
crashes.) Accessing past the end of a sequence is undefined and out
of bounds.
<p>
As for the logic bug, suppose the vector has all even numbers in it
(or <TT>is_even()</TT> returns true for every element.) This code will skip the
first even, and then immediately <TT>do_something_with()</TT> the next value
without first checking that it's not even. Since the documentation
at the top of the function says that <TT>do_something_with()</TT> shall be
called only on items that are not even, this is a logic bug. The
above code is capable of calling <TT>do_something_with()</TT> on elements
that are even. This is another side-effect of blindly incrementing
the loop variable.
</p>
<P>
<TABLE WIDTH="75%">
<TR>
<TD VALIGN="top">
<IMG SRC="images/exclamation.gif" ALT="Tip" BORDER=0 HSPACE="0" ALIGN="top" width="32" height="48">
</TD>
<TD valign="top">
<b>Tip:</b>
<hr size = 1>
Never increment your loop control variable in more than
one place in the body of your loop, whenever you can help it. When
you think you need to, think twice before coding it. That is, it's
wise to treat double incrementing a loop variable as "incorrect"
until proven otherwise.
<hr size = 1>
</TD>
</TR>
</TABLE>
<P>
Here is a better implementation that restructures the logic to
eliminate the extra increment, so it is only incremented in one
place. This also fixes the logic bug described earlier. (Isn't it
funny how removing a double-increment can fix more than one problem?)
Hopefully you're convinced by now that incrementing a loop variable
multiple times can be bad, if the conditions are not checked before
each increment. This is especially true of iterators, even if they
are not in a loop they can walk past the end if they're double
incremented blindly.
</P>
<pre>
<font color="navy">//</font>
<font color="navy">// correct rewrite</font>
<font color="navy">//</font>
<b>void</b> f<b>(</b><b>)</b>
<b>{</b>
V<b>:</b><b>:</b>iterator i <b>=</b> v1<b>.</b>begin<b>(</b><b>)</b><b>;</b>
<b>while</b> <b>(</b>i <b>!=</b> v1<b>.</b>end<b>(</b><b>)</b><b>)</b>
<b>{</b>
<b>if</b> <b>(</b><b>!</b><b>(</b><b>*</b>i<b>)</b><b>-></b>is_even<b>(</b><b>)</b><b>)</b>
<b>{</b>
do_something_with<b>(</b><b>*</b>i<b>)</b><b>;</b>
<b>}</b>
<b>++</b>i<b>;</b>
<b>}</b>
<b>}</b>
</pre>
<P>
Here we simplified it to only <TT>do_something_with()</TT> if the value is
not even. This code is much more straight forward, and now bug
free. (Well, assuming that elements cannot be null.)
</P>
<P>
The above function can still be improved, for those wishing to
squeeze every ounce of efficiency as possible out of the compiler.
The remaining concern is the overhead of returning a temporary value
from <TT>end()</TT> each time we iterate through the loop. This is solely an
efficiency issue, and while probably minor, there is no reason not
to fix this because it's clean and elegant, and faster.
</p>
<p>
Since the container cannot change in size, (since the body of the
loop does not modify it), we can be sure that <TT>end()</TT> will always
return an iterator to the same element. Therefore, there is
no need to call <TT>end()</TT> and recalculate the end for each iteration,
since it always returns the same value. Even though it's an inline
function, it is not "free" due to the construction of a temporary
iterator. Iterators are objects (or can be implemented that way),
and they have a cost to create. Fortunately, we can cache the
return value, and then compare against our cached value rather than
recalculating it each iteration. We can be more efficient if we
make the minor change as follows:
</p>
<pre>
<b>void</b> f<b>(</b><b>)</b>
<b>{</b>
V<b>:</b><b>:</b>iterator i <b>=</b> v1<b>.</b>begin<b>(</b><b>)</b><b>;</b>
V<b>:</b><b>:</b>iterator e <b>=</b> v1<b>.</b>end<b>(</b><b>)</b><b>;</b> <font color="navy">// cache end()'s return value</font>
<b>while</b> <b>(</b>i <b>!=</b> e<b>)</b><b>)</b>
<b>{</b>
<b>if</b> <b>(</b><b>!</b><b>(</b><b>*</b>i<b>)</b><b>-></b>is_even<b>(</b><b>)</b><b>)</b>
<b>{</b>
do_something_with<b>(</b><b>*</b>i<b>)</b><b>;</b>
<b>}</b>
<b>++</b>i<b>;</b>
<b>}</b>
<b>}</b>
</pre>
<P>
And there you have it, fast, bug free, and easy.
</p>
</TD> </TR>
</TABLE>
</CENTER>
</BODY>
</HTML>
⌨️ 快捷键说明
复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?