feedback1.html

来自「Data Structure Ebook」· HTML 代码 · 共 209 行

HTML
209
字号
<HTML><HEAD>
<TITLE>Data Structures and Algorithms - Assignment 1</TITLE>
</HEAD>
<BODY BGCOLOR="#ffffff">
<H1>Data Structures and Algorithms</H1>
<HR>

<CENTER><H3><FONT COLOR=red>
Feedback from assignment 1
</FONT></H3></CENTER>

<OL>
<LI><H4>Toolbuilding</H4>
If you are to be productive software engineers,
you must get yourselves into the habit of building <I>tools</I>.
Even the most trivial piece of code costs something to
<UL>
<LI>design,
<LI>write and
<LI>test and debug.
</UL>
You must learn to write for <I>re-use</I>:
the main benefit is in the time saved in test and debug -
if it's been tested thoroughly before, you can re-use it
with confidence!

<P>
In this assignment, you needed lists of random integers
(or - as at least one of you worked out - any integers) for
testing. 
So make a function which constructs such lists (in the simplest
way possible of course!), e.g.
<P>
<PRE>
int *ConsIntList( int n )
    {
    int *list = (int *)malloc( n*sizeof(int) );
    int i;
    for(i=0;i&lt;n;i++)
        list[i] = rand();
    return list;
    }
</PRE>
<P>
This function can be called as many times as needed to generate
test lists of appropriate sizes.
Once written and checked, it allows you to concentrate on other
aspects of the problem.
<P>
Timing is a similar problem: you are presented with the problem
that each machine has a slightly different method of telling the
time. So build a generic timing function that returns a time to the
best accuracy that the current machine will allow, e.g.
<P>
<PRE>
double TimeNow()
	{
	double t;
#ifdef IRIX
        /* Irix code to calculate t in secs */
#endif
#ifdef SUNOS
	/* Sun code  to calculate t in secs */
#endif
	return t;
	}
</PRE>
Note that this function is <TT><FONT COLOR=red>double</FONT></TT>:
it is designed to return seconds.
This allows it to get the maximum resolution on any machine.
<P>
The <TT>#ifdef</TT>'s are for serious programmers who want their
code to run without alteration on any machine: each compiler will
predefine some suitable string which will allow it to be identified.
You will need to check the manuals for individual compilers to
find out what the correct values for these strings are.
<P>
One of you made a starttime()/stoptime() pair and used a
static variable to store the time when the starttime function 
was called. A good idea, which can be done effectively in 
a single function:
<P>
<PRE>
/* timer.c */
static double last = 0.0;

double TimeSinceLastCall()
	{
	double t, diff;
#ifdef IRIX /* calculate t in secs as above */
..
#endif
	diff = t - last;
	last = t;
	return diff;
	}
</PRE>
This function can be called at the start of a timed section
of code (and the return value thrown away) and again at the
end of the timed section. The second return value is the
running time of the timed section.
<P>
<LI><H4>Efficiency</H4>
The assignment specification required you to run the same test for
quite a few values of <B>n</B>.
So you should have built a function which took <B>n</B> as an argument
and called it from a simple loop in <TT>main</TT> so that 
<I>one</I> run would get you all the results - 
once everything was working!
<P>
This is akin to the "toolbuilding" approach above.
<P>
<LI><H4>Errors</H4>
Unix is a time-sharing OS: even if you're apparently the only
user on the machine, many things are usually happening.
At the minimum, there's the clock interrupt that keeps the
machine's time correct. Usually, there'll also be printer
daemons (which wake up periodically to check whether you want
to print anything), network daemons (which look to see if
anyone else is trying to log in to this machine), <I>etc, etc</I>.
Even the simplest operating system
(<I>eg</I> DOS) will have timer interrupts to perturb your benchmark
timing!
<P>
As a consequence, very accurate timing is impossible. 
If you're expecting a straight line
(and you can usually transform your results so that they
should be a straight line),
a statistician would take the plot and determine the 
<I>correlation coefficient</I>, which would give a measure
of the probability that the results deviate from a straight
line (taking into account experimental error).
However, you can just run a straight line through your results
and if all the points lie close to it and appear randomly on
both sides, then it's a straight line.
With a little extra effort, you could use your calculator -
or get a spreadsheet - to calculate the standard deviation
from the mean of the normalised results. 
This will give you a simple measure of whether your results
lie on a straight line or not.

<P>
<LI><H4>Reporting your results</H4>
Having determined the form of your results, it was essential
that you report your final conclusions in the <B>O</B> notation.
This is the standard way of describing the performance of an
algorithm.
<P>
Reporting the slope of the straight line is a nice touch
<I>(especially if you add the correlation coefficient -
which tells how good a straight line it is!)</I>,
but your ultimate conclusion should still be <B>O(1)</B>
or <B>O(n)</B>.
<P>
<LI><H4>Explaining your results</H4>
The alert among you noted that the AddToCollection routine
appeared to be <B>O(n)</B> because of the 
<TT>FindInCollection</TT> routine in the post-condition.
Having noted that, it's a simple matter to remove it and
re-run your experiment. You could either simply comment it out -
or better - read the man page for <TT>assert</TT> and 
discover that if you recompile with:
<P>
<TT>gcc -DNDEBUG *.c</TT>
<P>
all the <TT>assert</TT>'s will be automagically removed!
<P>
<LI><H4>Designing experiments</H4>
Many people searched their collections for the same items
as the ones in the list used to construct the collection
in the same order!
So the first item is found after 1 operation,
the second after 2, <I>etc</I>
(or the reverse for LIFO lists).
Duplicates complicate the analysis and reduce the times.
A better design either searched for an item that was
guaranteed not to be in the list (to give worst case
time complexity) or
generated a new random sequence of items to search for.
In the second case, it's important to ensure a low probability
that the randomly generated item will be found,
<I>ie</I> the range of the random numbers must be much
greater than the number of items.
<BR>
</OL>
<P>
<H4>Users of DOS/Windows machines</H4>
Please make sure that the extra CRs needed by DOS, etc, are
removed before you submit.
Unix files should have LFs only at the end of a line.
<P>
The same applies to reports produced by your word processor:
export them as plain text without the extra CRs.
<P>
Reports which are not plain text will not be accepted -
there are a large number of word processors out there:
it's much more productive (and therefore beneficial for you!)
if the tutors spend time marking
your report's content than trying to work out which WP
will read it!
<HR>
<A HREF="ds_ToC.html" tppabs="http://www.ee.uwa.edu.au/~plsd210/ds/ds_ToC.html">Table of Contents</A>
<HR>
<SMALL>
&copy; <A HREF=mailto:morris@ee.uwa.edu.au>John Morris</A>, 1996
</SMALL>
</BODY>
</HTML>

⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?