%If you are installing LAPACK on a Silicon Graphics machine, you must
%modify the definition of \texttt{testing} to be
%\begin{verbatim}
%testing:
%        ( cd TESTING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}
 
\subsubsection{Testing the Linear Equations Routines}\label{testlin}

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TESTING/LIN} and type \texttt{make} followed by the data types
desired.  The executable files are called \texttt{xlintsts, xlintstc,
xlintstd}, or \texttt{xlintstz} and are created in \texttt{LAPACK/TESTING}.

\item[b)]
Go to \texttt{LAPACK/TESTING} and run the tests for each data type.
For the REAL version, the command is
\begin{list}{}{}
\item{} \texttt{xlintsts  < stest.in > stest.out}
\end{list}

\noindent
The tests using \texttt{xlintstd}, \texttt{xlintstc}, and \texttt{xlintstz} are similar
with the leading `s' in the input and output file names replaced
by `d', `c', or `z'.

\end{itemize}
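
Step b) can be scripted for several data types at once.
The following is a minimal sketch, not part of the LAPACK distribution:
the function name \texttt{run\_lin\_tests} is hypothetical, and the script
assumes that it is run from \texttt{LAPACK/TESTING} and that the
executables are invoked with a leading \texttt{./} because the current
directory may not be on the search path.
\begin{verbatim}
#!/bin/sh
# Run the linear equation tests for each data type given as an argument.
# A data type whose executable has not been built is skipped.
run_lin_tests() {
    for t in "$@"; do
        exe=./xlintst$t
        if [ -x "$exe" ]; then
            "$exe" < "${t}test.in" > "${t}test.out"
        else
            echo "skipping $t: $exe not found" >&2
        fi
    done
}

run_lin_tests s d c z
\end{verbatim}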

If you encounter failures in this phase of the testing process, please
refer to Section~\ref{sendresults}.

\subsubsection{Testing the Eigensystem Routines}\label{testeig}

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TESTING/EIG} and type \texttt{make} followed by the data types
desired.  The executable files are called \texttt{xeigtsts,
xeigtstc, xeigtstd}, and \texttt{xeigtstz} and are created
in \texttt{LAPACK/TESTING}.

\item[b)]
Go to \texttt{LAPACK/TESTING} and run the tests for each data type.
The tests for the eigensystem routines use eighteen separate input files
for testing the nonsymmetric eigenvalue problem,
the symmetric eigenvalue problem, the banded symmetric eigenvalue
problem, the generalized symmetric eigenvalue
problem, the generalized nonsymmetric eigenvalue problem, the
singular value decomposition, the banded singular value decomposition,
the generalized singular value
decomposition, the generalized QR and RQ factorizations, the generalized
linear regression model, and the constrained linear least squares
problem.
The tests for the REAL version are as follows:
\begin{list}{}{}
\item \texttt{xeigtsts  < nep.in > snep.out}
\item \texttt{xeigtsts  < sep.in > ssep.out}
\item \texttt{xeigtsts  < svd.in > ssvd.out}
\item \texttt{xeigtsts  < sec.in > sec.out}
\item \texttt{xeigtsts  < sed.in > sed.out}
\item \texttt{xeigtsts  < sgg.in > sgg.out}
\item \texttt{xeigtsts  < sgd.in > sgd.out}
\item \texttt{xeigtsts  < ssg.in > ssg.out}
\item \texttt{xeigtsts  < ssb.in > ssb.out}
\item \texttt{xeigtsts  < sbb.in > sbb.out}
\item \texttt{xeigtsts  < sbal.in > sbal.out}
\item \texttt{xeigtsts  < sbak.in > sbak.out}
\item \texttt{xeigtsts  < sgbal.in > sgbal.out}
\item \texttt{xeigtsts  < sgbak.in > sgbak.out}
\item \texttt{xeigtsts  < glm.in > sglm.out}
\item \texttt{xeigtsts  < gqr.in > sgqr.out}
\item \texttt{xeigtsts  < gsv.in > sgsv.out}
\item \texttt{xeigtsts  < lse.in > slse.out}
\end{list}
The tests using \texttt{xeigtstc}, \texttt{xeigtstd}, and \texttt{xeigtstz} also
use the input files \texttt{nep.in}, \texttt{sep.in}, \texttt{svd.in},
\texttt{glm.in}, \texttt{gqr.in}, \texttt{gsv.in}, and \texttt{lse.in},
but the leading `s' in the other input file names must be changed
to `c', `d', or `z'.
\end{itemize}
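
The naming rule above can be sketched as a small shell function that
prints the eighteen commands for a given data type.
This is an illustration only: the function name \texttt{eig\_test\_cmds}
and the two variables are hypothetical, and the file lists are copied
from the REAL commands shown in step b).
\begin{verbatim}
#!/bin/sh
# Input files whose names do not depend on the data type.
shared="nep sep svd glm gqr gsv lse"
# Input files whose names start with the data type letter
# (listed here without that letter).
prefixed="ec ed gg gd sg sb bb bal bak gbal gbak"

# Print the test commands for data type $1 (s, d, c, or z).
eig_test_cmds() {
    t=$1
    for f in $shared; do
        echo "./xeigtst$t < $f.in > $t$f.out"
    done
    for f in $prefixed; do
        echo "./xeigtst$t < $t$f.in > $t$f.out"
    done
}

eig_test_cmds d
\end{verbatim}
The commands are only printed; from \texttt{LAPACK/TESTING} they could
be executed by piping the output to \texttt{sh}.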

If you encounter failures in this phase of the testing process, please
refer to Section~\ref{sendresults}.

\subsection{Run the LAPACK Timing Programs (For LAPACK 3.0 and before)}

There are two distinct timing programs for LAPACK routines
in each data type, one for the linear equation routines and
one for the eigensystem routines.  The timing program for the
linear equation routines is also used to time the BLAS.
We encourage you to conduct these timing experiments
in REAL and COMPLEX or in DOUBLE PRECISION and COMPLEX*16; it is
not necessary to send timing results in all four data types.

Two sets of input files are provided, a small set and a large set.
The small data sets are appropriate for a standard workstation or
other non-vector machine.
The large data sets are appropriate for supercomputers, vector
computers, and high-performance workstations.
We are mainly interested in results from the large data sets, and
it is not necessary to run both the large and small sets.
The values of N in the large data sets are about five times larger
than those in the small data set,
and the large data sets use additional values for parameters such as the
block size NB and the leading array dimension LDA.
The names of the small data sets end in \_small, such as
\texttt{stime\_small.in}, and the names of the large data sets end in \_large,
such as \texttt{stime\_large.in}.
Except as noted, the leading `s' in the input file name must be
replaced by `d', `c', or `z' for the other data types.

We encourage you to obtain timing results with the large data sets,
as this allows us to compare different machines.
If this would take too much time, suggestions for paring back the large
data sets are given in the instructions below.
We also encourage you to experiment with these timing
programs and send us any interesting results, such as results for
larger problems or for a wider range of block sizes.
The main programs are dimensioned for the large data sets,
so the parameters in the main program may have to be reduced in order
to run the small data sets on a small machine, or increased to run
experiments with larger problems.

The minimum time for which each subroutine is timed is set to 0.0 in
the large data files and to 0.05 in the small data files; on
many machines this value should be increased.
If the timing interval is not long
enough, the time for the subroutine after subtracting the overhead
may be very small or zero, resulting in megaflop rates that are
very large or zero. (To avoid division by zero, the megaflop rate is
set to zero if the time is less than or equal to zero.)
The minimum time that should be used depends on the machine and the
resolution of the clock.
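
The rule for the reported rate can be written out as follows.
The symbols are introduced here for illustration and are not taken from
the timing programs: $t$ denotes the measured time in seconds after the
overhead has been subtracted, and $n_{\mathrm{ops}}$ denotes the number of
floating-point operations counted for the subroutine.
\[
\mbox{megaflop rate} = \left\{ \begin{array}{ll}
n_{\mathrm{ops}} / (10^{6}\, t) & \mbox{if } t > 0, \\
0 & \mbox{if } t \le 0.
\end{array} \right.
\]
When $t$ is close to the resolution of the clock, a small error in $t$
produces a large error in the quotient, which is why a longer minimum
time gives more reliable rates.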

For more information on the timing programs and how to modify the
input files, please refer to LAPACK Working Note 41~\cite{WN41}.
% see Section~\ref{moretiming}.

If you do not wish to run each of the timings individually, you can
go to \texttt{LAPACK}, edit the definition \texttt{lapack\_timing} in the file
\texttt{Makefile} to specify the data types desired, and type \texttt{make
lapack\_timing}.  This will compile
and run the timings for the linear equation routines and the eigensystem
routines (see Sections~\ref{timelin} and~\ref{timeeig}).

%If you are installing LAPACK on a Silicon Graphics machine, you must
%modify the definition of \texttt{timing} to be
%\begin{verbatim}
%timing:
%        ( cd TIMING; $(MAKE) -f Makefile.sgi )
%\end{verbatim}

If you encounter failures in any phase of the timing process, please
feel free to contact the authors as directed in Section~\ref{sendresults}.
Tell us the 
type of machine on which the tests were run, the version of the operating
system, the compiler and compiler options that were used,
and details of the BLAS library or libraries that you used.  You should
also include a copy of the output file in which the failure occurs.

Please note that the BLAS
timing runs will still need to be run as instructed in Section~\ref{timeblas}.

\subsubsection{Timing the Linear Equations Routines}\label{timelin}

The linear equation timing program is found in \texttt{LAPACK/TIMING/LIN}
and the input files are in \texttt{LAPACK/TIMING}.
Three input files are provided in each data type for timing the
linear equation routines, one for square matrices, one for band
matrices, and one for rectangular matrices.  The small data sets for the REAL version
are \texttt{stime\_small.in}, \texttt{sband\_small.in}, and \texttt{stime2\_small.in}, respectively,
and the large data sets are
\texttt{stime\_large.in}, \texttt{sband\_large.in}, and \texttt{stime2\_large.in}.

The timing program for the least squares routines uses special instrumented
versions of the LAPACK routines to time individual sections of the code.
The first step in compiling the timing program is therefore to make a library
of the instrumented routines.

\begin{itemize}
\item[a)]
\begin{sloppypar}
To make a library of the instrumented LAPACK routines, first
go to \texttt{LAPACK/TIMING/LIN/LINSRC} and type \texttt{make} followed
by the data types desired, as in the examples of Section~\ref{toplevelmakefile}. 
The library of instrumented code is created in
\texttt{LAPACK/TIMING/LIN/linsrc\_PLAT.a},
where \texttt{PLAT} is the user-defined architecture suffix specified in the
file \texttt{LAPACK/make.inc}.
\end{sloppypar}

\item[b)]
To make the linear equation timing programs,
go to \texttt{LAPACK/TIMING/LIN} and type \texttt{make} followed by the data
types desired, as in the examples in Section~\ref{toplevelmakefile}.
The executable files are called \texttt{xlintims},
\texttt{xlintimc}, \texttt{xlintimd}, and \texttt{xlintimz} and are created
in \texttt{LAPACK/TIMING}.

\item[c)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value, or to restrict the size of the tests
if you are using a computer with performance in between that of a
workstation and that of a supercomputer.
The computational requirements can be cut in half by using only one
value of LDA.
If it is necessary to also reduce the matrix sizes or the values of
the blocksize, corresponding changes should be made to the 
BLAS input files (see Section~\ref{timeblas}).

\item[d)]
Run the programs for each data type you are using. 
For the REAL version, the commands for the small data sets are

\begin{list}{}{}
\item{} \texttt{xlintims < stime\_small.in > stime\_small.out }
\item{} \texttt{xlintims < sband\_small.in > sband\_small.out }
\item{} \texttt{xlintims < stime2\_small.in > stime2\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xlintims < stime\_large.in > stime\_large.out }
\item{} \texttt{xlintims < sband\_large.in > sband\_large.out }
\item{} \texttt{xlintims < stime2\_large.in > stime2\_large.out }
\end{list}

\noindent
Similar commands should be used for the other data types.
\end{itemize}
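
The commands in step d) follow a regular pattern, which the following
sketch prints for a chosen data type and data set.
The function name \texttt{lin\_timing\_cmds} is hypothetical, and the
leading \texttt{./} assumes that the commands will be run from
\texttt{LAPACK/TIMING}.
\begin{verbatim}
#!/bin/sh
# Print the linear equation timing commands.
#   $1  data type: s, d, c, or z
#   $2  data set:  small or large
lin_timing_cmds() {
    t=$1
    size=$2
    for f in time band time2; do
        echo "./xlintim$t < $t${f}_$size.in > $t${f}_$size.out"
    done
}

lin_timing_cmds s small
\end{verbatim}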

\subsubsection{Timing the BLAS}\label{timeblas}

The linear equation timing program is also used to time the BLAS.
Three input files are provided in each data type for timing the Level
2 and 3 BLAS. 
These input files time the BLAS using the matrix shapes encountered
in the LAPACK routines, and we will use the results to analyze the
performance of the LAPACK routines. 
For the REAL version, the small data files are
\texttt{sblasa\_small.in}, \texttt{sblasb\_small.in}, and \texttt{sblasc\_small.in}
and the large data files are
\texttt{sblasa\_large.in}, \texttt{sblasb\_large.in}, and \texttt{sblasc\_large.in}.
There are three sets of inputs because there are three
parameters in the Level 3 BLAS, M, N, and K, and
in most applications one of these parameters is small (on the order
of the blocksize) while the other two are large (on the order of the
matrix size).  
In \texttt{sblasa\_small.in}, M and N are large but K is
small, while in \texttt{sblasb\_small.in} the small parameter is M, and
in \texttt{sblasc\_small.in} the small parameter is N.  
The Level 2 BLAS are timed only in the first data set, where K
is also used as the bandwidth for the banded routines.

\begin{itemize}

\item[a)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value.
If you modified the values of N or NB 
in Section~\ref{timelin}, set M, N, and K accordingly.
The large parameters among M, N, and K
should be the same as the matrix sizes used in timing the linear
equation routines,
and the small parameter should be the same as the
blocksizes used in timing the linear equation routines.
If necessary, the large data set can be simplified by using only one
value of LDA.

\item[b)]
Run the programs for each data type you are using. 
For the REAL version, the commands for the small data sets are

\begin{list}{}{}
\item{} \texttt{xlintims < sblasa\_small.in > sblasa\_small.out }
\item{} \texttt{xlintims < sblasb\_small.in > sblasb\_small.out }
\item{} \texttt{xlintims < sblasc\_small.in > sblasc\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xlintims < sblasa\_large.in > sblasa\_large.out }
\item{} \texttt{xlintims < sblasb\_large.in > sblasb\_large.out }
\item{} \texttt{xlintims < sblasc\_large.in > sblasc\_large.out }
\end{list}

\noindent
Similar commands should be used for the other data types.
\end{itemize}
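
The BLAS timing runs for more than one data type can be generated in the
same spirit.
This is a sketch under the same assumptions as the commands in step b):
the function name \texttt{blas\_timing\_cmds} is hypothetical, and the
commands are meant to be run from \texttt{LAPACK/TIMING}.
\begin{verbatim}
#!/bin/sh
# Print the BLAS timing commands.
#   $1        data set: small or large
#   $2 ...    data types: any of s, d, c, z
blas_timing_cmds() {
    size=$1
    shift
    for t in "$@"; do
        for f in blasa blasb blasc; do
            echo "./xlintim$t < $t${f}_$size.in > $t${f}_$size.out"
        done
    done
}

# Example: the large data sets in REAL and COMPLEX.
blas_timing_cmds large s c
\end{verbatim}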

\subsubsection{Timing the Eigensystem Routines}\label{timeeig}

The eigensystem timing program is found in \texttt{LAPACK/TIMING/EIG}
and the input files are in \texttt{LAPACK/TIMING}.
Four input files are provided in each data type for timing the
eigensystem routines,
one for the generalized nonsymmetric eigenvalue problem, 
