one for the nonsymmetric eigenvalue problem,
one for the symmetric and generalized symmetric eigenvalue problem,
and one for the singular value decomposition.
For the REAL version, the small data sets are called \texttt{sgeptim\_small.in},
\texttt{sneptim\_small.in}, \texttt{sseptim\_small.in}, and \texttt{ssvdtim\_small.in}, respectively,
and the large data sets are called \texttt{sgeptim\_large.in}, \texttt{sneptim\_large.in},
\texttt{sseptim\_large.in}, and \texttt{ssvdtim\_large.in}.
Each of the four input files supplies a different set of parameters,
and the format of the input is indicated by a 3-character code
on the first line.
The timing program for eigenvalue/singular value routines accumulates
the operation count as the routines are executing using special
instrumented versions of the LAPACK routines. The first step in
compiling the timing program is therefore to make a library of the
instrumented routines.
\begin{itemize}
\item[a)]
\begin{sloppypar}
To make a library of the instrumented LAPACK routines, first
go to \texttt{LAPACK/TIMING/EIG/EIGSRC} and type \texttt{make} followed
by the data types desired, as in the examples of Section~\ref{toplevelmakefile}.
The library of instrumented code is created in
\texttt{LAPACK/TIMING/EIG/eigsrc\_PLAT.a},
where \texttt{PLAT} is the user-defined architecture suffix specified in the
file \texttt{LAPACK/make.inc}.
\end{sloppypar}
\item[b)]
To make the eigensystem timing programs,
go to \texttt{LAPACK/TIMING/EIG} and
type \texttt{make} followed by the data types desired, as in the examples
of Section~\ref{toplevelmakefile}. The executable files are called
\texttt{xeigtims}, \texttt{xeigtimc}, \texttt{xeigtimd}, and \texttt{xeigtimz}
and are created in \texttt{LAPACK/TIMING}.
\item[c)]
Go to \texttt{LAPACK/TIMING} and
make any necessary modifications to the input files.
You may need to set the minimum time a subroutine will
be timed to a positive value, or to restrict the number of tests
if you are using a computer with performance in between that of a
workstation and that of a supercomputer.
Instead of decreasing the matrix dimensions to reduce the time,
it would be better to reduce the number of matrix types to be timed,
since the performance varies more with the matrix size than with the
type. For example, for the nonsymmetric eigenvalue routines,
you could use only one matrix of type 4 instead of four matrices of
types 1, 3, 4, and 6.
Refer to LAPACK Working Note 41~\cite{WN41} for further details.
% See Section~\ref{moretiming} for further details.
\item[d)]
Run the programs for each data type you are using.
For the REAL version, the commands for the small data sets are
\begin{list}{}{}
\item{} \texttt{xeigtims < sgeptim\_small.in > sgeptim\_small.out }
\item{} \texttt{xeigtims < sneptim\_small.in > sneptim\_small.out }
\item{} \texttt{xeigtims < sseptim\_small.in > sseptim\_small.out }
\item{} \texttt{xeigtims < ssvdtim\_small.in > ssvdtim\_small.out }
\end{list}
or the commands for the large data sets are
\begin{list}{}{}
\item{} \texttt{xeigtims < sgeptim\_large.in > sgeptim\_large.out }
\item{} \texttt{xeigtims < sneptim\_large.in > sneptim\_large.out }
\item{} \texttt{xeigtims < sseptim\_large.in > sseptim\_large.out }
\item{} \texttt{xeigtims < ssvdtim\_large.in > ssvdtim\_large.out }
\end{list}
\noindent
Similar commands should be used for the other data types.
\end{itemize}
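\noindent
The pattern of commands in step d) can be generated for any data type with a
short shell function. The following is a minimal sketch, not part of the LAPACK
distribution: the function name \texttt{timing\_commands} is hypothetical, and it
assumes that the input files for the other data types follow the REAL naming
scheme with the leading \texttt{s} replaced by \texttt{d}, \texttt{c}, or \texttt{z}.
It only prints the commands; they are meant to be run from \texttt{LAPACK/TIMING}
after the executables have been built.
\begin{verbatim}
#!/bin/sh
# Print (do not run) the timing commands for one data type and size.
# Assumption: the d, c, and z input files are named like the REAL ones.
timing_commands() {
    prefix=$1   # s, d, c, or z
    size=$2     # small or large
    for family in geptim neptim septim svdtim; do
        base="${prefix}${family}_${size}"
        echo "xeigtim${prefix} < ${base}.in > ${base}.out"
    done
}

# Commands for the REAL small data sets, as listed in step d).
timing_commands s small
\end{verbatim}
Calling \texttt{timing\_commands d large} would print the corresponding four
commands for the DOUBLE PRECISION large data sets.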
\subsection{Send the Results to Tennessee}\label{sendresults}
Congratulations! You have now finished installing, testing, and
timing LAPACK. If you encountered failures in any phase of the
testing or timing process, please
consult our \texttt{release\_notes} file on netlib.
\begin{quote}
\url{http://www.netlib.org/lapack/release\_notes}
\end{quote}
This file contains machine-dependent installation clues which hopefully will
alleviate your difficulties or at least let you know that other users
have had similar difficulties on that machine. If there is not an entry
for your machine or the suggestions do not fix your problem, please feel
free to contact the authors at
\begin{list}{}{}
\item \href{mailto:lapack@cs.utk.edu}{\texttt{lapack@cs.utk.edu}}.
\end{list}
Tell us the
type of machine on which the tests were run, the version of the operating
system, the compiler and compiler options that were used,
and details of the BLAS library or libraries that you used. You should
also include a copy of the output file in which the failure occurs.
We would like to keep our \texttt{release\_notes} file as up-to-date as possible.
Therefore, if you do not see an entry for your machine, please contact us
with your testing results.
Comments and suggestions are also welcome.
We encourage you to make the LAPACK library available to your
users and provide us with feedback from their experiences.
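
The details requested above can be collected into a single file before a
report is sent. The following shell fragment is only a sketch: the file name
\texttt{lapack\_report.txt} is hypothetical, and it assumes the commands are run
from the directory that contains the \texttt{LAPACK} tree, so that the compiler
and its options can be read from \texttt{LAPACK/make.inc}. Details of the BLAS
library and the failing output file still need to be added by hand.
\begin{verbatim}
#!/bin/sh
# Gather machine, operating system, and compiler settings in one file.
# "lapack_report.txt" is a hypothetical name used for this example.
report=lapack_report.txt
{
    echo "== Machine and operating system =="
    uname -a
    echo "== Compiler and options (from make.inc) =="
    if [ -f LAPACK/make.inc ]; then
        cat LAPACK/make.inc
    else
        echo "LAPACK/make.inc not found"
    fi
} > "$report"
\end{verbatim}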
%This release of LAPACK is not guaranteed to be compatible
%with any previous test release.
\subsection{Get support}\label{getsupport}
First, take a look at the complete installation manual in the LAPACK Working Note 41~\cite{WN41}.
If you still cannot solve your problem, you have two options:
\begin{itemize}
\item
either post a message on the LAPACK forum
\begin{quote}
\url{http://icl.cs.utk.edu/lapack-forum}
\end{quote}
\item
or send an email to the LAPACK mailing list:
\begin{list}{}{}
\item \href{mailto:lapack@cs.utk.edu}{\texttt{lapack@cs.utk.edu}}.
\end{list}
\end{itemize}
\section*{Acknowledgments}
Ed Anderson and Susan Blackford contributed to previous versions of this report.
\appendix
\chapter{Caveats}\label{appendixd}
In this appendix we list a few of the machine-specific difficulties we
have
encountered in our own experience with LAPACK. A more detailed list
of machine-dependent problems, bugs, and compiler errors encountered
in the LAPACK installation process is maintained
on \emph{netlib}.
\begin{quote}
\url{http://www.netlib.org/lapack/release\_notes}
\end{quote}
We assume the user has installed the machine-specific routines
correctly and that the Level 1, 2 and 3 BLAS test programs have run
successfully, so we do not list any warnings associated with those
routines.
\section{\texttt{LAPACK/make.inc}}
All machine-specific
parameters are specified in the file \texttt{LAPACK/make.inc}.
The first line of this \texttt{make.inc} file is:
\begin{quote}
\texttt{SHELL = /bin/sh}
\end{quote}
and will need to be modified to \texttt{SHELL = /sbin/sh} if you are
installing LAPACK on an SGI architecture.
\section{ETIME}
On HPPA architectures,
the compiler and loader flag \texttt{+U77} should be included to access
the function \texttt{ETIME}.
\section{ILAENV and IEEE-754 compliance}
%By default, ILAENV (\texttt{LAPACK/SRC/ilaenv.f}) assumes an IEEE and IEEE-754
%compliant architecture, and thus sets (\texttt{ILAENV=1}) for (\texttt{ISPEC=10})
%and (\texttt{ISPEC=11}) settings in ILAENV.
%
%If you are installing LAPACK on a non-IEEE machine, you MUST modify ILAENV,
%as this test inside ILAENV will crash!
As some new routines in LAPACK rely on IEEE-754 compliance,
two settings (\texttt{ISPEC=10} and \texttt{ISPEC=11}) have been added to ILAENV
(\texttt{LAPACK/SRC/ilaenv.f}) to denote IEEE-754 compliance for NaN and
infinity arithmetic, respectively. By default, ILAENV assumes an IEEE
machine, and does a test for IEEE-754 compliance. \textbf{NOTE: If you
are installing LAPACK on a non-IEEE machine, you MUST modify ILAENV,
as this test inside ILAENV will crash!}
Thus, for non-IEEE machines, the user must hard-code the setting
\texttt{ILAENV=0} for \texttt{ISPEC=10} and \texttt{ISPEC=11} in the version
of \texttt{LAPACK/SRC/ilaenv.f} that is put in
the library. For further details, refer to section~\ref{testieee}.
Be aware
that some IEEE compilers by default do not enforce IEEE-754 compliance, and
a compiler flag must be explicitly set by the user.
On SGIs for example, you must set the \texttt{-OPT:IEEE\_NaN\_inf=ON} compiler
flag to enable IEEE-754 compliance.
Lastly, the test inside ILAENV that detects IEEE-754 compliance will
result in IEEE exceptions for ``Divide by Zero'' and ``Invalid Operation''.
Thus, if the user is installing on a machine that issues IEEE exception
warning messages (like a Sun SPARCstation), the user can disregard these
messages. To avoid these messages, the user can hard-code the values
inside ILAENV as explained in section~\ref{testieee}.
\section{Lack of \texttt{/tmp} space}
If \texttt{/tmp} space is small (i.e., less than approximately 16 MB) on your
architecture, you may run out of space
when compiling. There are a few possible solutions to this problem.
\begin{enumerate}
\item You can ask your system administrator to increase the size of the
\texttt{/tmp} partition.
\item You can change the environment variable \texttt{TMPDIR} to point to
your home directory for temporary space. E.g.,
\begin{quote}
\texttt{setenv TMPDIR /home/userid/}
\end{quote}
where \texttt{/home/userid/} is the user's home directory.
\item If your archive command has an \texttt{l} option, you can change the
archive command to \texttt{ar crl} so that the
archive command will only place temporary files in the current working
directory rather than in the default temporary directory \texttt{/tmp}.
\end{enumerate}
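The \texttt{setenv} command in item 2 is C-shell syntax. For Bourne-type shells,
the equivalent can be sketched as follows. The directory name
\texttt{lapack\_tmp} is a hypothetical choice for this example, and
\texttt{df -k} is used to confirm how much space is available there.
\begin{verbatim}
#!/bin/sh
# Use a private scratch directory instead of /tmp for temporary files.
# "lapack_tmp" is a hypothetical directory name chosen for this example.
TMPDIR="${HOME:-$PWD}/lapack_tmp"
mkdir -p "$TMPDIR"
export TMPDIR

# Report the free space (in kilobytes) on the file system holding TMPDIR.
df -k "$TMPDIR"
\end{verbatim}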
\section{BLAS}
If you suspect a BLAS-related problem and you are linking
with an optimized version of the BLAS, we would strongly suggest
as a first step that you link to the Fortran 77 version of
the suspected BLAS routine and see if the error has disappeared.
We have included test programs for the Level 1 BLAS.
Users should therefore beware of a common problem in machine-specific
implementations of xNRM2,
the function to compute the 2-norm of a vector.
The Fortran version of xNRM2 avoids underflow or overflow
by scaling intermediate results, but some library versions of xNRM2
are not so careful about scaling.
If xNRM2 is implemented without scaling intermediate results, some of
the LAPACK test ratios may be unusually high, or
a floating point exception may occur in the problems scaled near
underflow or overflow.
The solution to these problems is to link the Fortran version of
xNRM2 with the test program. \emph{On some CRAY architectures, the Fortran77
version of xNRM2 should be used.}
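
To see why scaling matters, consider one common formulation, given here as an
illustration and not necessarily the exact recurrence used in the Fortran
version of xNRM2. With $s = \max_i |x_i| > 0$,
\[
  \| x \|_2 \;=\; s \, \sqrt{ \sum_{i=1}^{n} \left( \frac{x_i}{s} \right)^{2} } .
\]
Every ratio satisfies $|x_i / s| \le 1$, so the sum lies between $1$ and $n$ and
cannot overflow. An unscaled implementation forms $x_i^2$ directly, which
overflows as soon as $|x_i|$ exceeds the square root of the overflow threshold
(about $1.3 \times 10^{154}$ in IEEE double precision) and underflows to zero when
$|x_i|$ falls below the square root of the underflow threshold, even though
$\| x \|_2$ itself is representable.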
\section{Optimization}
If a large number of test failures occur for a specific matrix type
or operation, there may be an optimization problem with
your compiler. Try reducing the level of
optimization, or eliminating optimization entirely, for those routines,
and rerun the tests to see whether the failures disappear.
%LAPACK is written in Fortran 77. Prospective users with only a
%Fortran 66 compiler will not be able to use this package.
\section{Compiling testing/timing drivers}
The testing and timing main programs (xCHKAA, xCHKEE, xTIMAA, and
xTIMEE)
allocate a large amount of storage for local variables. It is therefore vitally
important to know whether your compiler allocates local
variables statically or on the stack by default. Compilers that place
local variables on the stack commonly cause a stack
overflow at run time in the testing or timing process. In that case you
have two options: increase the stack size, or force all local variables
to be allocated statically.
On HPPA architectures, the
compiler and loader flag \texttt{-K} should be used when compiling these testing
and timing main programs to avoid such a stack overflow. I.e., set
\texttt{DRVOPTS = -K} in the \texttt{LAPACK/make.inc} file.
For similar reasons,
on SGI architectures, the compiler and loader flag \texttt{-static} should be
used. I.e., set \texttt{DRVOPTS = -static} in the \texttt{LAPACK/make.inc} file.
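
Where the first option, increasing the stack size, is preferred, many
Bourne-type shells provide \texttt{ulimit -s} for this purpose. This is a shell
extension that may not be available everywhere (C-shell users typically have
\texttt{limit stacksize} instead), so the following is only a sketch under that
assumption.
\begin{verbatim}
#!/bin/sh
# Show the current soft stack limit (kilobytes, or "unlimited").
ulimit -S -s

# Raise the soft limit to the hard limit for this shell and for the
# programs started from it, then run the testing or timing program.
ulimit -S -s "$(ulimit -H -s)"
ulimit -S -s
\end{verbatim}
The new limit applies only to the current shell session, so the testing or
timing program must be started from the same shell.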
\section{IEEE arithmetic}
Some of our test matrices are scaled near overflow or underflow,
but on the Crays, problems with the arithmetic near overflow and
underflow forced us to scale by only the square root of overflow
and underflow.
The LAPACK auxiliary routine SLABAD (or DLABAD) is called to
take the square root of underflow and overflow in cases where it
could cause difficulties.
We assume we are on a Cray if $ \log_{10} (\mathrm{overflow})$
is greater than 2000
and take the square root of underflow and overflow in this case.
The test in SLABAD is as follows:
\begin{verbatim}
      IF( LOG10( LARGE ).GT.2000. ) THEN
         SMALL = SQRT( SMALL )
         LARGE = SQRT( LARGE )
      END IF
\end{verbatim}
Users of other machines with similar restrictions on the effective
range of usable numbers may have to modify this test so that the
square roots are done on their machine as well.
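
For comparison, here is the same test worked by hand for IEEE-754 arithmetic,
using the standard overflow thresholds of roughly $3.4 \times 10^{38}$ (single
precision) and $1.8 \times 10^{308}$ (double precision):
\[
  \log_{10} ( 3.4 \times 10^{38} ) \approx 38.5 ,
  \qquad
  \log_{10} ( 1.8 \times 10^{308} ) \approx 308.3 .
\]
Both values are far below 2000, so on such machines the branch is not taken
and \texttt{SMALL} and \texttt{LARGE} are left unchanged.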