paper.tex

来自「这是Yousef Saad编写的矩阵运算的Fortran软件包（A basic 」· TEX 代码 · 共 1,321 行 · 第 1/5 页
TEX
1,321 行
(i.e. $1/a_{ii} $ is stored in place of $a_{ii} $) because triangularsystems are often solved repeatedly with the same matrix many times,as is the case for example in preconditioned Conjugate Gradientmethods.  The column oriented analogue of the MSR format, called MSCformat, is also used in some of the other modules, but notransformation to/from it to the CSC format is necessary: for exampleto pass from CSC to MSC one can use the routine to pass from the CSRto the MSR formats, since the data structures are identical.  The  above three storage modes are used in many well-known packages.\subsubsection{The banded Linpack format (BND)} Banded matrices represent the simplest form of sparse matrices andthey often convey the easiest way of exploiting sparsity.  There aremany ways of storing a banded matrix. The one we adopted here followsthe data structure used in the Linpack banded solution routines. Ourmotivation is that one can easily take advantage of this widely available package if the matrices are banded.  For fairly small matrices (say,$N < 2000$ on supercomputers, $ N < 200 $ on fast workstations, andwith a bandwidth of $O(N^{\half} )$), this may represent a viable andsimple way of solving linear systems. One must first transform theinitial data structure into the banded Linpack format and then call theappropriate band solver. For large problems it is clear that a betteralternative would be to use a sparse solver such as MA28, whichrequires the input matrix to be in the coordinate format. %%It is%%expected that these types of utilization of the conversion routines%% will in fact be among the most common ones.In the BND format the nonzero elements of $A$ are stored in arectangular array $ABD$ with the nonzero elements of the $j$-th columnbeing stored in the $j-th$ column of $ABD$. We also need to know thenumber $ML$ of diagonals below the main diagonals and the number $MU$of diagonals above the main diagonals. Thus the bandwidth of $A$ is$ML+MU+1$ which is the minimum number of rows required in the array$ABD$. An additional integer parameter is needed to indicate which rowof $ABD$ contains the lowest diagonal.\subsubsection{The coordinate format (COO) } The coordinate format is certainly the simplest storage scheme forsparse matrices. It consists of three arrays: a real array of size$NNZ$ containing the real values of nonzero elements of $A$ in anyorder, an integer array containing their row indices and a secondinteger array containing their column indices. Note that this schemeis as general as the CSR format, but from the point of view of memoryrequirement it is not as efficient.  On the other hand it isattractive because of its simplicity and the fact that it is verycommonly used.  Incidentally, we should mention a variation to thismode which is perhaps the most economical in terms of memory usage.The modified version requires only a real array $A$ containing thereal values $a_{ij}$ along with only one integer array that containsthe integer values $ (i-1)N + j$ for each corresponding nonzeroelement $a_{ij}$.  It is clear that this is an unambiguousrepresentation of all the nonzero elements of $A$.  There are twodrawbacks to this scheme. First, it requires some integer arithmeticto extract the column and row indices of each element when they areneeded. Second, for large matrices it may lead to integer overflowbecause of the need to deal with integers which may be very large (ofthe order of $N^2$).  Because of these two drawbacks this scheme hasseldom been used in practice.\subsubsection{The diagonal format (DIA) }The matrices that arise in many applications often consist of a fewdiagonals. This structure has probably been the first one to beexploited for the purpose of improving performance ofmatrix by vector products on supercomputers, see references in\cite{Saad-Boeing}.  To storethese matrices we may store the diagonals in a rectangular array$DIAG(1:N,1:NDIAG) $ where $NDIAG$ is the number of diagonals.  Wealso need to know the offsets of each of the diagonals with respect tothe main diagonal. These will be stored in an array $IOFF(1:NDIAG)$.Thus, in position $(i,k)$ of the array $DIAG$ is located the element$a_{i,i+ioff(k)}$ of the original matrix.  The order in which thediagonals are stored in the columns of $DIAG$ is unimportant.  Notealso that all the diagonals except the main diagonal have fewer than$N$ elements, so there are positions in $DIAG$ that will not be used.In many applications there is a small number of non-empty diagonalsand this scheme is enough. In general however, it may be desirable tosupplement this data structure, e.g., by a compressed sparse rowformat.  A general matrix is therefore represented as the sum of adiagonal-structured matrix and a general sparse matrix.  Theconversion routine CSRDIA which converts from the compressedsparse row format to the diagonal format has an option	 to thiseffect.  If the user wants to convert a general sparse matrix to onewith, say, 5 diagonals, and if the input matrix has more than 5diagonals, the rest of the matrix (after extraction of the 5 desireddiagonals) will be put, if desired, into a matrix in the CSR format.In addition, the code may also compute the most important 5 diagonalsif wanted, or it can get those indicated by the user through the array$IOFF$.\subsubsection{The Ellpack-Itpack format (ELL) }The Ellpack-Itpack format\cite{Oppe-Kincaid,Young-Oppe-al,Oppe-NSPCG} is a generalization of the diagonal storage scheme which is intended forgeneral sparse matrices with a limited maximum number of nonzeros perrow. Two rectangular arrays of the same size are required, one realand one integer.  The first, $COEF$, is similar to $DIAG$ and containsthe nonzero elements of $A$. Assuming that there are at most $NDIAG$nonzero elements in each row of $A$, we can store the nonzero elementsof each row of the matrix in a row of the array $COEF(1:N,1:NDIAG)$completing the row by zeros if necessary.  Together with $COEF$ weneed to store an integer array $JCOEF(1:N,1:NDIAG)$ which contains thecolumn positions of each entry in $COEF$.\subsubsection{The Block Sparse Row format (BSR)} Block matrices are common in all areas of scientific computing. Thebest way to describe block matrices is by viewing them as sparsematrices whose nonzero entries are square dense blocks. Block matricesarise from the discretization of partial differential equations whenthere are several degrees of freedom per grid point.  There arerestrictions to this scheme. Each of the blocks is treated as a denseblock. If there are zero elements within each block they must betreated as nonzero elements with the value zero.There are several variations to the method used for storing sparsematrices with block structure. The one considered here, the BlockSparse Row format, is a simple generalization of the Compressed SparseRow format.We denote here by $NBLK$ the dimension of each block, by $NNZR$ the number of nonzero blocks in $A$ (i.e., $NNZR = NNZ/(NBLK^2) $) and by $NR$ the block dimension of $A$,(i.e., $NR = N/NBLK$), the letter $R$ standing for `reduced'. Like the Compressed Sparse Row  format we need three arrays. A rectangularreal array $A(1:NNZR,1:NBLK,1:NBLK) $ contains the nonzeroblocks listed (block)-row-wise. Associated with this real arrayis an integer array $JA(1:NNZR) $ which holds the actual column positions in the original matrix of the $(1,1)$ elements of the nonzero blocks.Finally, the pointer array $IA(1:NR+1)$ points to the beginningof each block row in $A$ and $JA$. The savings in memory and in the use of indirect addressing with  thisscheme over Compressed Sparse Row   can be substantial for largevalues of $NBLK$.\subsubsection{The Symmetric Skyline format (SSK) }A skyline matrix is often referred to as a variable bandmatrix or a profile matrix \cite{Duff-book}.  The main attraction of skyline matrices is that when pivoting isnot necessary then the skyline structure of the matrix is preserved during Gaussian elimination. If the matrix is symmetricwe only need to store its lower triangular part. This is acollection of rows whose length varies. A simple method used to storea Symmetric Skyline matrix is to place all the rows in order from1 to $N$ in a  real array $A$ and then keep an integer array which holds the pointers to the beginning of each row, see \cite{Duff-survey}.The column positions of the nonzero elements stored in $A$can easily be derived and are therefore not needed. However, there are several variations to this scheme that are commonly used iscommercial software packages. For example, we found that in many instances the pointer is to the diagonal element rather than to the first element in the row. In some cases (e.g., IBM's ISSL library)both are supported. Given that these variations are commonly used it is a good idea to provide at least a few of them. \subsubsection{The Non Symmetric Skyline format (NSK) }Conceptually, the  data structure of a nonsymmetric skyline matrix consists of two substructures. The first consists of the lower part of $A$ stored in skyline formatand the second of its upper triangular part stored in a columnoriented skyline format (i.e., the transpose is stored instandard row skyline mode). Several ways of putting thesesubstructures together may be used and there are no compelling reasons for preferring one strategy over another one. One possibility is to use two separate arrays $AL$ and $AU$ forthe lower part and upper part respectively, with the diagonal element in the upper part. The data structures for each of two parts is similar to that used for the SSK storage.%% NOT DONE YET --- %%We chose to store contiguously each row of the lower part and column of %%the upper part of the matrix.  The real array $A$ will contain %%the 1-st row followed by the first column (empty), followed%%by the second row followed by the second column, etc..%%An additional pointer is needed to indicate where the%%diagonal elements, which separate the lower from the upper part,%%are located in this array. G\subsubsection{The linked list storage format (LNK) }  This is one of the oldest data structures used for sparse matrix computations.It consists of four arrays: $A$, $JCOL$, $LINK$ and $JSTART$. The arrays $A$ and $JCOL$ contain the nonzero elements and theircorresponding column indices respectively. The integer array $LINK$ isthe usual link pointer array in linked list data structures: $LINK(k)$ points to the position of the nonzero element next to $A(k), JCOL(k)$ in the same row. Note that the order of the elements within each row is unimportant.  If $LINK(k) =0$ then there is nonext element, i.e., $A(k), JCOL(k)$ is the last element of the row.Finally, $ISTART$ points to the first element of each row in in the previous arrays. Thus, $k=ISTART(1)$ points to the first elementof the first row, in $A, ICOL$,  $ISTART(2) $ to the second element, etc..As a convention $ISTART(i) = 0$, means that the $i$-th row is empty.\subsubsection{The Jagged Diagonal format (JAD)} This storage mode is very useful for the efficient implementationof iterative methods on  parallel and vector processors\cite{Saad-Boeing}. Starting from the CSR format, the idea is to first reorder the rows of the matrix decreasingly according to their number of nonzeros entries. Then, a new data structure is built by constructing what we call ``jagged diagonals" (j-diagonals).We store as a dense vector, the vector consisting of allthe first elements in $A, JA$ from each row, together with an integer vector containing the column positions of the corresponding elements. This is followed by the second jagged  diagonal consisting of theelements in the second positions from the left. As we build more andmore of these diagonals, their length decreases.  The number ofj-diagonals is equal to the number of nonzero elements of the firstrow, i.e., to the largest number of nonzero elements per row.  Thedata structure to represent a general matrix in this form consists,before anything, of the permutation array which reorders the rows.Then the real array $A$ containing the jagged diagonals in successionand the array $JA$ of the corresponding column positions are stored,together with a pointer array $ IA $ which points to the beginning ofeach jagged diagonal in the arrays $A, JA$. The advantage of thisscheme for matrix multiplications has been illustrated in\cite{Saad-Boeing} and in \cite{Anderson-Saad} in the context of triangular system solutions.\subsubsection{The Symmetric and Unsymmetric Sparse Skyline format(SSS, USS)} This is an extension of the CSR-type format described above.In the symmetric version, the following arrays are used:$DIAG$ stores the diagonal,$AL, JAL, IAL$ stores the strict lower part in CSR format, and$AU$ stores the values of the strict upper part in CSC format.In the unsymmetric version, instead of $AU$ alone, the strict upper partis stored in $AU, JAU, IAU$ in CSC format.%% THIS SECTION HAS BEEN MODIFIED%% \subsubsection{The Compressed Variable Block Format(CVB)}%% This is an extension of the Block Sparse Row format (BSR). In the BSR%% format, all the  blocks have the same size.%% A more general way of partitioning might allow the%% matrix to be split into different size blocks. In the CVB format, an%% arbitrary partitioning of the matrix is allowed. However, the columns and%% the rows must be split in the same way. %%%% Figure 0 shows a 9x9 sparse matrix and its corresponding storage vectors.%% Let $h$ be the index of the leading elements of the $k^{th}$ block %% stored in $AA$. Then $k^{th}$ block of size $m*p$ is stored in  %% $AA(h)$ to $AA(h+mp-1)$.%% For  example, the $3^{rd}$ block is stored in $AA(9)$ to $AA(17)$.%%%% The data structure consists of the integers \(N\), \(NB\), and the arrays %% \(AA\), \(JA\), \(IA\), and%% \(KVST\), where \(N\) is the matrix size, i.e.~number of rows in the%% matrix, \(NB\) is the number of block rows, \(AA\) stores the non-zero%% values of the matrix, \(JA\) has the column indices of the first%% elements in the blocks, \(IA\) contains the pointers to the beginning%% of each block row (in \(AA\)), \(KVST\) contains the index of%% the first row in each block row.%%%% \begin{figure}[h]%% \vspace{3in}%% \special{psfile= fig1.eps vscale = 100 hscale = 100 hoffset =-50 voffset= -420}%%\caption {\bf Fig 1: An example of a 9x9 sparse matrix and its storage vectors.  }%%\label{conventional}%% \end{figure}
paper.tex - 源码说明

本页面展示了「这是Yousef Saad编写的矩阵运算的Fortran软件包（A basic tool-kit for sparse matrix computations (Version 2),包含常见的排序」中的 paper.tex 源码文件，采用 TEX 编程语言编写，共 1,321 行代码。您可以在线阅读完整代码内容，也可以返回资源详情页下载完整源码包进行本地学习和开发。
虫虫下载站收录了大量与computations相关的技术资源，包括源代码、技术文档、电路图等，是电子工程师和嵌入式开发者的专业学习平台。
⌨️ 快捷键说明

复制代码Ctrl + C
搜索代码Ctrl + F
全屏模式F11
增大字号Ctrl + =
减小字号Ctrl + -
显示快捷键?